Eric Xingdi Yuan Profile
Eric Xingdi Yuan

@ericxyuan

Followers
915
Following
1K
Media
40
Statuses
384

Senior Researcher at Microsoft Research, Montreal. Opinions are my own.

Montréal, Québec
Joined August 2016
@ericxyuan
Eric Xingdi Yuan
20 days
We want to push towards agents that understand a repo at the codebase level. This requires tasks that go beyond looking at just a few lines of code or a single file. In this work, led by our great intern Amy Lee, we explore what such tasks should look like.
@hyunji_amy_lee
hyunji amy lee
20 days
🚨 Excited to announce Gistify!, where a coding agent must extract the gist of a repository: generate a single, executable, and self-contained file that faithfully reproduces the behavior of a given command (e.g., a test or entrypoint). ✅ It is a lightweight, broadly applicable
0
4
12
@murefil
Alessandro Sordoni
1 month
This was a great group effort ❤️. Check the thread below! My 2c: we train a 32B coding agent by distilling a strong teacher model on a mix of real and synthetic bugs generated by our new approach BugPilot 🛩️! BugPilot creates bugs unintentionally, by asking the teacher to
@isadorcw
Isadora White
1 month
Excited to introduce our SoTA coding models, FrogBoss (32B) and FrogMini (14B), on SWE-Bench-Verified! (FrogBoss eats bugs… like a boss) 🐸🪲 These models were trained with bugs from a mix of existing and our new synthetic bug generation approach, called BugPilot. (1/n)
0
8
38
@ericxyuan
Eric Xingdi Yuan
1 month
Generate better bugs by not asking your agent to generate bugs! Great work led by @isadorcw and @twm_as!
@isadorcw
Isadora White
1 month
Excited to introduce our SoTA coding models, FrogBoss (32B) and FrogMini (14B), on SWE-Bench-Verified! (FrogBoss eats bugs… like a boss) 🐸🪲 These models were trained with bugs from a mix of existing and our new synthetic bug generation approach, called BugPilot. (1/n)
0
0
6
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 months
By popular demand we've extended the Wordplay Workshop deadline by a couple of weeks until Sept 12! The competition on realistic dialogue for game agents already has over 5000 submissions and the winners will also be at the workshop. Come hang out with us at EMNLP!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
The Wordplay Workshop is back! 5th edition with EMNLP in Suzhou this Dec. We're also hosting a competition this time on making more realistic LLM-powered NPCs in games! As always, come by and chat all things text agents!
2
8
16
@ericxyuan
Eric Xingdi Yuan
4 months
Great work! Congratulations!
@frankniujc
Jingcheng (Frank) Niu
4 months
Hey this is me! Our paper: Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs Blog post:
0
0
5
@frankniujc
Jingcheng (Frank) Niu
4 months
📢 Next week, I will be presenting our paper "Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs" at ACL 2025! Paper: https://t.co/usPw15Woke Blog Post: https://t.co/9RRzaEz9m9 Talk: https://t.co/GiPHfOhzx8
1
2
12
@LucasPCaccia
Lucas Caccia
5 months
RAG and in-context learning are the go-to approaches for integrating new knowledge into LLMs, making inference very inefficient. We propose instead 𝗞𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 𝗠𝗼𝗱𝘂𝗹𝗲𝘀: lightweight LoRA modules trained offline that can match RAG performance without the drawbacks
1
14
44
@ericxyuan
Eric Xingdi Yuan
5 months
CFP of the Wordplay 2025 (EMNLP) is live! https://t.co/blqp5JQ1us
@ericxyuan
Eric Xingdi Yuan
7 months
Announcing the 5th Wordplay Workshop at EMNLP 2025 (Suzhou, China). We are co-organizing the CPDC Challenge (total prize value USD 20K!!!), the warm-up round is starting now!
0
6
17
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
Introducing TALES - Text Adventure Learning Environment Suite. A benchmark of a few hundred text envs: from science experiments and embodied cooking to solving murder mysteries. We test over 30 of the best LLM agents and pinpoint failure modes + how to improve 👨‍💻 pip install tale-suite
2
19
66
@ericxyuan
Eric Xingdi Yuan
7 months
Announcing the 5th Wordplay Workshop at EMNLP 2025 (Suzhou, China). We are co-organizing the CPDC Challenge (total prize value USD 20K!!!), the warm-up round is starting now!
wordplay-workshop.github.io
Official website for the Wordplay Workshop at EMNLP 2025. Exploring interactive narratives, text-adventure games, and AI agents in language-based environments. Join us in Suzhou, China, November...
@aicrowdHQ
AIcrowd
7 months
🎮 You're exploring your favourite RPG city. The blacksmith greets you, remembers you saved his life, and recommends a customised weapon upgrade. Build better NPCs that respond naturally, adapt dynamically, and recall your actions.👇 https://t.co/Hdgs2IJkPC
1
1
6
@ericxyuan
Eric Xingdi Yuan
8 months
Super excited to share this. The project page, the technical report, and the open-source GitHub repo can be found at
microsoft.github.io
@MSFTResearch
Microsoft Research
8 months
Developers spend a lot of time debugging code. Learn how debug-gym can equip AI agents to help, enabling them to set breakpoints, navigate the codebase, and print runtime variable values on demand, so they better understand the code and its execution flow: https://t.co/TFHncIElTZ
0
5
23
@allen_ai
Ai2
8 months
Imagine AI doing science: reading papers, generating ideas, designing and running experiments, analyzing results… How many more discoveries can we reveal? 🧐 Meet CodeScientist, a promising next step toward autonomous scientific discovery. 🧵
6
97
369
@pyautogen
AutoGen
10 months
HUGE: The biggest upgrade to #AutoGen just dropped! v0.4 (stable) is finally here. For details, check out the blog below.
@MSFTResearch
Microsoft Research
10 months
Announcing AutoGen 0.4, a fully reimagined library for building advanced agentic AI systems, developed to improve code quality and robustness. Its asynchronous, event-driven architecture is designed to support dynamic, scalable workflows. Learn more: https://t.co/N7iSeR7ZJk
5
33
109
@murefil
Alessandro Sordoni
1 year
The ML team at @MSFTResearch Montréal 🍁 is hiring a Senior Researcher with a background in ML / NLP!!! Come work with us at the intersection of interactivity, modularity and reasoning in foundation models 😊 MSR is a highly collaborative environment where risky ideas are
1
37
128
@peterjansen_ai
Peter Jansen ( @peterjansen-ai.bsky.social )
1 year
My student Ruoyao Wang's ACL 2024 paper is featured in the State of AI report. He's on the job market this year, and one of the most experienced NLP+Simulation PhD students out there. You should hire him! Ruoyao's Website: https://t.co/mo0eRfogSq Paper:
@nathanbenaich
Nathan Benaich
1 year
🪩The @stateofaireport 2024 has landed! 🪩 Our seventh installment is our biggest and most comprehensive yet, covering everything you *need* to know about research, industry, safety and politics. As ever, here's my director’s cut (+ video tutorial!) 🧵
1
6
21
@smdvln
Sam Devlin
1 year
Our team @MSFTResearch is hiring for a 2-year AI Residency role in the area of learning to control embodied agents, with the goal of informing future applications in Gaming and Robotics. For more details and to formally apply, please visit:
6
35
166
@pyautogen
AutoGen
1 year
We are excited to announce a preview of the new architecture of AutoGen (coming in v0.4). To learn more, see Blog: https://t.co/fNZhTdxaL0 Pull request: https://t.co/TDi0tudjLq Come help us shape the future of AutoGen!
5
57
179
@ericxyuan
Eric Xingdi Yuan
1 year
Multiple authors (including me) are going to Bangkok, let's chat in person if you are going as well!
@peterjansen_ai
Peter Jansen ( @peterjansen-ai.bsky.social )
1 year
Can language models be used as world simulators? In our ACL 2024 paper, we show -- not really. GPT-4 is only ~60% accurate at simulating state changes based on common-sense tasks, like boiling water. Preprint: https://t.co/WYkTTcu6g7 @allen_ai @MSFTResearch @aclmeeting
0
2
15
@ericxyuan
Eric Xingdi Yuan
1 year
Hello community, we are looking for a few emergency reviewers to help review some papers within 2 days. Please email us at wordplay.workshop.organizers@gmail.com to let us know your OpenReview account if you are willing to help! Thanks!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Wordplay has been by far my favorite workshop on all things language agents, games, and interactive NLP since we started it in 2017. This time we'll be co-located with ACL in Bangkok! Call for papers: https://t.co/TFpO8rLYPF
1
8
3
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Reminder that there's only a couple more weeks (May 31) until the deadline for the Wordplay: When Language Meets Games workshop at ACL in Bangkok!! Submit all your papers on language agents, simulations, narrative, AI for games, and more!!
1
24
43