
Kilian Lieret
@KLieret
Followers
522
Following
21
Media
14
Statuses
61
Research Software Engineer at Princeton University. AI agents & benchmarks for software engineering.
Princeton
Joined May 2021
RT @ori_press: Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize…
0
51
0
RT @SWEbench: We just updated the SWE-bench Multimodal leaderboard with new systems from @refact_ai, @allhands_ai and @TU_Muenchen. Congrat…
0
5
0
RT @jyangballin: 40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synt…
0
131
0
Had a great time talking about building agents, SWE-agent, SWE-bench, and more.
📣 NEW Episode! In this episode, Kilian Lieret (Research Software Engineer) & Carlos Jimenez (Computer Science PhD Candidate) at @Princeton dive into SWE-bench & SWE-agent, two cutting-edge tools for evaluating & enhancing AI in software engineering.
0
1
4
Evaluating SWE-agent on SWE-bench Lite was once an overnight job. With SWE-ReX parallelizing our execution, it now takes half an hour! SWE-ReX spins up Docker containers with a @FastAPI server that uses pexpect to interface with shell sessions. MIT licensed, lightweight & hackable
1
3
21
RT @OfirPress: The creators of LiveCodeBench just released a new, private, SWE-bench like benchmark in Java, C++, Python, JavaScript, TypeS…
0
4
0
Join @_carlosejimenez and me today at GenAI Collective NYC as we break down SWE-Bench, SWE-agent, and the future of AI-driven software engineering. What works? What's next? What does this mean for developers? Let's discuss! Today 1–4pm, Brooklyn Navy Yard.
AI coding tools are moving from autocomplete to autonomy, with big implications for developers, users, and businesses 💼. Join GenAI Collective NYC this Thursday, April 3 at Brooklyn Navy Yard Bldg 303 for a panel + fireside chat featuring: Carlos Jimenez & Kilian Lieret
0
0
2
RT @OfirPress: We just updated the SWE-bench Multimodal leaderboard. Congrats to Globant, Zencoder, and the Agentless team from UIUC for th…
0
5
0
RT @daytonaio: Watch Princeton's SWE-agent @KLieret reveal research on autonomous coding agents at Daytona AI Builders @github HQ! From ben…
0
8
0
SWE-agent 1.0 is so much more flexible than before. It has never been easier to set it up with various tool bundles or multiple LMs. And you can combine them all in a multi-attempt scheme!
SWE-agent 1.0 lets you run multiple attempts with different models or tools on the same task. Use a review agent to select the best from these diverse runs to improve overall performance!
0
0
5