Qian Liu

@sivil_taram

Followers
4K
Following
3K
Media
124
Statuses
1K

Researcher @ TikTok 🇸🇬 📄 Sailor / StarCoder / OpenCoder 💼 Past: Research Scientist @SeaAIL; PhD @MSFTResearch 🧠 Contribution: @XlangNLP @BigCodeProject

Joined November 2021
@sivil_taram
Qian Liu
7 days
🔥 LLMs can fix bugs, but can they make your code faster? We put them to the test on real-world repositories, and the results are in! 🚀 New paper: "SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?" Key findings: 1️⃣ We introduce SWE-Perf, the
1
16
60
@sivil_taram
Qian Liu
14 hours
Wrapped up a SWE-Perf website redesign using Qwen3-Coder on AnyCoder. The process was incredibly fast and great! One question for Qwen devs, though: did you pretrain a secret love for the color purple into the coder's persona? 😉
0
13
76
@sivil_taram
Qian Liu
24 hours
RT @yusufma555: 🚀🚀🚀 Ever wondered what it takes for robots to handle real-world household tasks? Long-horizon execution, deformable object…
0
88
0
@sivil_taram
Qian Liu
24 hours
The most rewarding moment in research: hearing someone say "This actually works in our scenario!" ✨
3
2
61
@sivil_taram
Qian Liu
1 day
RT @FaZhou_998: Apart from the performance, it’s pure entertainment just watching Qwen3‑Coder build Qwen Code all by itself. Agentic coding…
0
10
0
@sivil_taram
Qian Liu
1 day
RT @Alibaba_Qwen: >>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to…
0
1K
0
@sivil_taram
Qian Liu
1 day
RT @huybery: After three intense months of hard work with the team, we made it! We hope this release can help drive the progress of Coding…
0
81
0
@sivil_taram
Qian Liu
2 days
RT @Marktechpost: TikTok Researchers Introduce SWE-Perf: The First Benchmark for Repository-Level Code Performance Optimization. SWE-Perf,…
0
10
0
@sivil_taram
Qian Liu
5 days
Tweet media one
0
4
0
@sivil_taram
Qian Liu
7 days
RT @_akhaliq: SWE-Perf. Can Language Models Optimize Code Performance on Real-World Repositories?
0
23
0
@sivil_taram
Qian Liu
7 days
Joint work with Xinyi He, @spirit__song, Lin Yan, Zhijie Fan, @yeeelow233, Zejian Yuan, and Zejun Ma!
0
0
2
@sivil_taram
Qian Liu
7 days
Key Takeaways & The Future 💡 Repo-level performance optimization is a frontier challenge for LLMs. It's much harder than bug fixing. There's a significant gap between current SOTA agents and human experts. SWE-Perf provides the first realistic playground to close this gap. We
1
0
2
@sivil_taram
Qian Liu
7 days
Analysis: How do LLMs vs. Experts Optimize? 🧠 We analyzed the types of changes made. The difference is stark: 🤖 LLMs focus on low-level infrastructure & boilerplate. Think environment setup, dependency handling, and import logic (e.g., miniconda3, importlib). 👩‍💻 Experts target
1
0
1
@sivil_taram
Qian Liu
7 days
The Results: A Reality Check 📉 So, how did the models do? 🧑‍💻 Human Experts: Achieved an average performance gain of 10.9%. 🤖 Best LLM Agent (OpenHands w/ Claude-3.7-sonnet): Reached only 2.3%. This highlights a major gap. Today's agents struggle with the reasoning, planning,
1
0
1
@sivil_taram
Qian Liu
7 days
How We Built It: A Rigorous Pipeline ⚙️ Finding true performance improvements is noisy. We built a 5-stage pipeline to ensure data quality: 1️⃣ Crawled 100k+ PRs from popular GitHub repos. 2️⃣ Ran tests on both original & patched code to measure runtime changes. 3️⃣ Filtered for
1
0
1
@sivil_taram
Qian Liu
7 days
SWE-Perf: The Benchmark 🏋️‍♂️ To answer our question, we built SWE-Perf. It's not about simple algorithmic puzzles. It’s the real deal: * 140 tasks from performance-improving Pull Requests on popular repos like scikit-learn, xarray, and pandas. * Each task includes the entire
1
0
1
@sivil_taram
Qian Liu
7 days
The Problem: Beyond Bug Fixing 🐛 🧵 We've seen LLMs tackle bug fixing in benchmarks like SWE-Bench. But in the real world, performance is king. 👑 Optimizing code speed is a much harder, open-ended task that requires deep understanding of the entire codebase. Existing
1
0
1
@sivil_taram
Qian Liu
8 days
RT @ikekong: What happened after Dream 7B? First, Dream-Coder 7B: A fully open diffusion LLM for code delivering strong performance, traine…
0
32
0
@sivil_taram
Qian Liu
8 days
RT @_zhihuixie: 🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
0
33
0