Qian Liu @sivil_taram X Profile

Qian Liu

@sivil_taram

Followers

4K

Following

3K

Media

124

Statuses

1K

Researcher @ TikTok 🇸🇬 📄 Sailor / StarCoder / OpenCoder 💼 Past: Research Scientist @SeaAIL; PhD @MSFTResearch 🧠 Contribution: @XlangNLP @BigCodeProject

Joined November 2021

Qian Liu

@sivil_taram

7 days

🔥 LLMs can fix bugs, but can they make your code faster? We put them to the test on real-world repositories, and the results are in! 🚀 New paper: "SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?" Key findings: 1️⃣ We introduce SWE-Perf, the

1

16

60

Qian Liu

@sivil_taram

14 hours

Wrapped up a SWE-Perf website redesign using Qwen3-Coder on AnyCoder. The process was incredibly fast and great! One question for Qwen devs, though: did you pretrain a secret love for the color purple into the coder's persona? 😉

0

13

76

Qian Liu

@sivil_taram

24 hours

RT @yusufma555: 🚀🚀🚀 Ever wondered what it takes for robots to handle real-world household tasks? long-horizon execution, deformable object…

0

88

0

Qian Liu

@sivil_taram

24 hours

The most rewarding moment in research: hearing someone say "This actually works in our scenario!" ✨

3

2

61

Qian Liu

@sivil_taram

1 day

RT @FaZhou_998: Apart from the performance, it’s pure entertainment just watching Qwen3‑Coder build Qwen Code all by itself. Agentic coding…

0

10

0

Qian Liu

@sivil_taram

1 day

RT @Alibaba_Qwen: >>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to…

0

1K

0

Qian Liu

@sivil_taram

1 day

RT @huybery: After three intense months of hard work with the team, we made it! We hope this release can help drive the progress of Coding…

0

81

0

Qian Liu

@sivil_taram

2 days

RT @Marktechpost: TikTok Researchers Introduce SWE-Perf: The First Benchmark for Repository-Level Code Performance Optimization. SWE-Perf,…

0

10

0

Qian Liu

@sivil_taram

3 days

RT @allhands_ai: Nice new research work by @tiktok_us on benchmarking performance optimization by LLM agents: Open…

arxiv.org

Code performance optimization is paramount in real-world software engineering and critical for production-level systems. While Large Language Models (LLMs) have demonstrated impressive...

0

16

0

Qian Liu

@sivil_taram

5 days

RT @SinclairWang1:

0

4

0

Qian Liu

@sivil_taram

7 days

RT @_akhaliq: SWE-Perf. Can Language Models Optimize Code Performance on Real-World Repositories?

0

23

0

Qian Liu

@sivil_taram

7 days

Joint work with Xinyi He, @spirit__song, Lin Yan, Zhijie Fan, @yeeelow233, Zejian Yuan, and Zejun Ma!

0

2

Qian Liu

@sivil_taram

7 days

Key Takeaways & The Future 💡 Repo-level performance optimization is a frontier challenge for LLMs. It's much harder than bug fixing. There's a significant gap between current SOTA agents and human experts. SWE-Perf provides the first realistic playground to close this gap. We

1

0

2

Qian Liu

@sivil_taram

7 days

Analysis: How do LLMs vs. Experts Optimize? 🧠 We analyzed the types of changes made. The difference is stark: 🤖 LLMs focus on low-level infrastructure & boilerplate. Think environment setup, dependency handling, and import logic (e.g., miniconda3, importlib). 👩‍💻 Experts target

1

0

1

Qian Liu

@sivil_taram

7 days

The Results: A Reality Check 📉 So, how did the models do? 🧑‍💻 Human Experts: Achieved an average performance gain of 10.9%. 🤖 Best LLM Agent (OpenHands w/ Claude-3.7-sonnet): Reached only 2.3%. This highlights a major gap. Today's agents struggle with the reasoning, planning,

1

0

1

Qian Liu

@sivil_taram

7 days

How We Built It: A Rigorous Pipeline ⚙️ Finding true performance improvements is noisy. We built a 5-stage pipeline to ensure data quality: 1️⃣ Crawled 100k+ PRs from popular GitHub repos. 2️⃣ Ran tests on both original & patched code to measure runtime changes. 3️⃣ Filtered for

1

0

1
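
Step 2️⃣ of the pipeline above compares runtimes of original and patched code. The thread does not say how those runtimes were measured, so the following is only a minimal sketch of the general idea, not the SWE-Perf implementation. The names `time_runs`, `relative_gain`, `slow_sum`, and `fast_sum` are hypothetical, and taking the median over repeated runs is an assumption chosen here to dampen timing noise.

```python
import statistics
import time
from typing import Callable, List


def time_runs(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return wall-clock durations in seconds for `repeats` calls of `fn`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def relative_gain(original: List[float], patched: List[float]) -> float:
    """Fraction of runtime saved by the patch, using median durations.

    The median is an assumption of this sketch: it is less sensitive to
    outlier runs than the mean.
    """
    base = statistics.median(original)
    new = statistics.median(patched)
    return (base - new) / base


# Hypothetical stand-ins for "original" and "patched" code under test.
def slow_sum(n: int = 200_000) -> int:
    total = 0
    for i in range(n):
        total += i
    return total


def fast_sum(n: int = 200_000) -> int:
    return n * (n - 1) // 2


if __name__ == "__main__":
    gain = relative_gain(time_runs(slow_sum), time_runs(fast_sum))
    print(f"relative runtime gain: {gain:.1%}")
```

Both stand-in functions return the same value, which mirrors the requirement that a performance patch must not change behavior; only the measured runtime differs.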

Qian Liu

@sivil_taram

7 days

SWE-Perf: The Benchmark 🏋️‍♂️ To answer our question, we built SWE-Perf. It's not about simple algorithmic puzzles. It’s the real deal: * 140 tasks from performance-improving Pull Requests on popular repos like scikit-learn, xarray, and pandas. * Each task includes the entire

1

0

1

Qian Liu

@sivil_taram

7 days

The Problem: Beyond Bug Fixing 🐛 🧵 We've seen LLMs tackle bug fixing in benchmarks like SWE-Bench. But in the real world, performance is king. 👑 Optimizing code speed is a much harder, open-ended task that requires deep understanding of the entire codebase. Existing

1

0

1

Qian Liu

@sivil_taram

8 days

RT @ikekong: What happened after Dream 7B? First, Dream-Coder 7B: A fully open diffusion LLM for code delivering strong performance, traine…

0

32

0

Qian Liu

@sivil_taram

8 days

RT @_zhihuixie: 🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.

0

33

0