
Kilian Lieret
@KLieret
Followers: 876 · Following: 37 · Media: 27 · Statuses: 105
Research Software Engineer at Princeton University. AI agents & benchmarks for software engineering.
Princeton
Joined May 2021
Releasing mini, a radically simple SWE-agent: 100 lines of code, 0 special tools, and gets 65% on SWE-bench verified! Made for benchmarking, fine-tuning, RL, or just for use from your terminal. It’s open source, simple to hack, and compatible with any LM! Link in 🧵
You can find lots of other models evaluated under the same settings on the bash-only leaderboard. Our agent implementation is at:
github.com
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores 68% on SWE-bench verified! - SWE-agent/mini-swe-agent
This is evaluated with mini-swe-agent (common-sense prompts, no tools other than bash, some 100 lines of code for the agent class). We're still working on evaluating some other open-source models (including GLM).
github.com · SWE-agent/mini-swe-agent
RT @richardcsuwandi: Introducing OpenEvolve x AlgoTune! Now you can run and benchmark evolutionary coding agents on 100+ algorithm optim…
RT @_carlosejimenez: Recent open model scores on SWE-bench Bash Only:
🥇 Qwen3-Coder 480B/A35B Instruct - 55.40%
🥈 Kimi-K2-Instruct - 43.80%
🥉 …
RT @SemiAnalysis_: At the end of the day, the SWE-bench leaderboard on swebench dot com is probably the most clear description of current m…
Evaluated with our open-source minimal agent, which tests LMs in a bare-bones shell environment. The agent is implemented in roughly 100 lines! We'll add the results to our SWE-bench (bash-only) leaderboard shortly:
github.com · SWE-agent/mini-swe-agent
More results in the morning! Run the agent yourself:
github.com · SWE-agent/mini-swe-agent