Xinyi Wang
@XinyiWang98
Followers
1K
Following
1K
Media
15
Statuses
144
Postdoc @PrincetonPLI. PhD @ucsbNLP. She/her.
Princeton, NJ
Joined November 2020
Happy to share our new preprint for my MIT-IBM internship project! https://t.co/EBDWBaDOIb In a controlled synthetic pretraining setup, we uncover a surprising twist in scaling laws: Bigger models can hurt reasoning.
2
15
125
We're looking for 2 interns for Summer 2026 at the MIT-IBM Watson AI Lab Foundation Models Team. Work on RL environments, enterprise benchmarks, model architecture, efficient training and finetuning, and more! Apply here:
9
52
459
This fall, we're welcoming 8 new postdocs! From reinforcement learning to human-AI collaboration, their work will power forward our initiatives. Meet them and learn more about their research: https://t.co/tfZwpgj72i
3
13
131
✈️Currently attending ACL #ACL2025 in Vienna, Austria. I'll present in person at Hall 4/5 (July 30, 10:30 - 12:00):
🚩Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
Come and say hi!
4
5
21
Thrilled that Gödel Agent got noticed by Sakana AI & excited about their expansion of it! Accepted at ACL 2025, our 1st fully self-referential agent can read & modify its entire logic (even the modification logic itself). Done via recursion. Paper:
arxiv.org
Introducing The Darwin Gödel Machine: AI that improves itself by rewriting its own code https://t.co/tBzlhoUMZO The Darwin Gödel Machine (DGM) is a self-improving agent that can modify its own code. Inspired by evolution, we maintain an expanding lineage of agent variants,
2
5
23
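The two tweets above describe the mechanism only at a high level: an agent that can read and rewrite its entire decision logic, including the rewriting logic itself. As a purely illustrative toy (not the Gödel Agent or Darwin Gödel Machine code), Python's inspect module is enough to show the self-referential ingredient: a function reads its own source, so a rewriting step could in principle target itself. All names here are hypothetical.

```python
import inspect

def policy(task):
    """Toy stand-in for the agent's decision logic."""
    return f"solved: {task}"

def self_inspect(fn):
    """Return the source code of any function, including this one.
    A self-referential agent would hand this source to an LLM that
    proposes a rewrite, then load and evaluate the modified code."""
    return inspect.getsource(fn)

if __name__ == "__main__":
    print(policy("toy task"))
    print(self_inspect(policy))        # the agent reads its own policy
    print(self_inspect(self_inspect))  # ...and the inspection logic itself
```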
Disentangling Memory and Reasoning in LLMs (ACL 2025 Main)
We propose a new inference paradigm that separates memory from reasoning in LLMs using two simple tokens: ⟨memory⟩ and ⟨reason⟩.
✅ Improves accuracy
✅ Enhances interpretability
📄 Read: https://t.co/v3wBFw1ZEA
#LLM
7
8
19
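A toy sketch of the paradigm described above, with the control tokens written in ASCII (<memory> / <reason>) and an invented output trace; the paper's actual token handling may differ. Splitting a generation into tagged steps is one way the approach yields interpretability: recalled facts and reasoning steps can be inspected separately.

```python
import re

# Hypothetical model output interleaving the two control tokens
# (ASCII stand-ins for the paper's ⟨memory⟩ and ⟨reason⟩ tokens).
generation = (
    "<memory> The Eiffel Tower is in Paris. Paris is the capital of France. "
    "<reason> So the tower is located in the capital of France. "
    "<memory> France is in Europe. "
    "<reason> Therefore the Eiffel Tower is in Europe."
)

# Parse the trace into (tag, text) steps so memory recall and reasoning
# can be examined independently.
steps = re.findall(r"<(memory|reason)>\s*(.*?)(?=<memory>|<reason>|$)",
                   generation, flags=re.S)
for tag, text in steps:
    print(f"[{tag}] {text.strip()}")
```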
I'm presenting our mem vs. gen paper at ICLR on Saturday morning at poster #244. Come and check it out if you're interested!
🚀 New preprint 🚀
1/9 🌟 How do large language models (LLMs) trade off memorization vs. generalization? Our new preprint explores this fundamental question from a data distribution perspective. Co-led with @anton_iades
https://t.co/ZYpFezjXYX 🧵👇
0
2
9
I’m attending #ICLR in Singapore! Also excited to share that I’m joining the Princeton Language and Intelligence Lab as a postdoc in July. In Fall 2026, I’ll be starting as an Assistant Professor at the University at Buffalo. I’ll be recruiting—feel free to reach out and chat!
23
12
162
A contrarian result I like a lot: smaller language models perform better on knowledge graphs than larger ones, as "overparameterization can impair reasoning due to excessive memorization".
25
81
817
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning
LLMs trained on synthetic multihop graphs show a U-shaped curve in reasoning: too small underfit, too large overfit. Overparameterization hurts edge completion due to memorization. A linear
5
20
94
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning
On a synthetic multihop reasoning environment designed to closely replicate the structure and distribution of real-world large-scale knowledge graphs, the authors observe that
8
48
321
This project has taken a bit too long due to my job search. Many thanks for the support of my amazing collaborators @tanshawn @fnruji316625 @WilliamWangNLP @rpanda89 @Yikang_Shen!
0
0
3
We systematically vary graph size, structure, and training steps. Key findings (sketched in code below):
1. Each graph has an optimal model size.
2. That optimal size stays stable across (large numbers of) training steps.
3. The optimal model size grows linearly with the graph's search entropy.
1
0
4
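A procedural reading of the findings above, especially finding 3, as a minimal sketch: sweep model sizes for each graph, pick the size with the lowest reasoning (edge-completion) loss, and fit a line against the graph's search entropy. The graph names, losses, and entropy values below are made-up placeholders, not results from the paper.

```python
import numpy as np

# results[graph] maps model size (parameters) -> reasoning loss; made-up values.
results = {
    "graph_A": {1e6: 2.1, 5e6: 1.4, 2e7: 1.6, 1e8: 2.0},
    "graph_B": {1e6: 2.5, 5e6: 1.9, 2e7: 1.3, 1e8: 1.7},
}
search_entropy = {"graph_A": 3.2, "graph_B": 5.1}  # hypothetical entropies

# Finding 1: each graph has a loss-minimizing model size (the bottom of the U-shape).
optimal_size = {g: min(losses, key=losses.get) for g, losses in results.items()}

# Finding 3: fit optimal model size linearly against graph search entropy.
x = np.array([search_entropy[g] for g in results])
y = np.array([optimal_size[g] for g in results])
slope, intercept = np.polyfit(x, y, 1)
print(optimal_size, slope, intercept)
```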
Our task: complete missing edges of the knowledge graph the language model is pretrained on, a proxy for the (implicit) multi-hop reasoning that naturally emerges from connected knowledge. We find that the reasoning loss follows a U-shaped curve as model size increases (evaluation sketched below).
1
0
4
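A hedged sketch of the evaluation described above: score the model's prediction of a held-out tail entity given the head and relation, and average the negative log-likelihood as the reasoning loss. The serialization format and the scoring interface are illustrative assumptions, not the paper's code.

```python
import math
import random

def evaluate_edge_completion(held_out_triples, token_logprob):
    """Average negative log-likelihood of the held-out tail entity,
    given the serialized head entity and relation as context."""
    losses = []
    for h, r, t in held_out_triples:
        context = f"<e{h}> <r{r}>"
        target = f"<e{t}>"
        losses.append(-token_logprob(context, target))
    return sum(losses) / len(losses)

# Stand-in scorer; a real run would query the pretrained LM instead.
random.seed(0)
dummy_logprob = lambda context, target: math.log(random.uniform(0.01, 1.0))
held_out = [(1, 3, 7), (4, 0, 2), (9, 5, 1)]
print(evaluate_edge_completion(held_out, dummy_logprob))
```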
We pretrain LLMs from scratch on triples from synthetic knowledge graphs designed to mimic real-world knowledge distributions and multi-hop reasoning structure. This gives us a clean testbed for studying reasoning (data-generation sketch below).
1
0
3
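A minimal sketch of what such pretraining data might look like, assuming a toy entity/relation vocabulary and uniform sampling; the paper's generator instead shapes the graph to match real-world knowledge-graph structure and distributions, so treat this only as an illustration of the triple-serialization idea.

```python
import random

# Toy vocabulary: entities e0..e999, relations r0..r19 (illustrative sizes).
NUM_ENTITIES, NUM_RELATIONS, NUM_TRIPLES = 1000, 20, 20000

random.seed(0)
triples = set()
while len(triples) < NUM_TRIPLES:
    h = random.randrange(NUM_ENTITIES)   # head entity
    r = random.randrange(NUM_RELATIONS)  # relation
    t = random.randrange(NUM_ENTITIES)   # tail entity
    triples.add((h, r, t))

# Serialize each triple as a short token sequence for from-scratch pretraining.
def serialize(h, r, t):
    return f"<e{h}> <r{r}> <e{t}>"

corpus = [serialize(*tr) for tr in sorted(triples)]
print(corpus[:3])
```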
🚀Thrilled to introduce ☕️MoCha: Towards Movie-Grade Talking Character Synthesis
Please unmute to hear the demo audio.
✨We defined a novel task: Talking Characters, which aims to generate character animations directly from Natural Language and Speech input.
✨We propose
18
58
220
🚨Hiring🚨 We are looking for research scientists and engineers to join IBM Research (Cambridge, Bangalore). We train large language models and do fundamental research on directions related to LLMs. Please DM me your CV and a brief introduction of yourself if you are interested!
11
55
617
🙌We are calling for submissions and recruiting reviewers for the Open Science for Foundation Models (SCI-FM) workshop at ICLR 2025!
Submit your paper: https://t.co/mqFWsfLVKU (deadline: Feb 13)
Register as a reviewer: https://t.co/QkuR3v8uf1 (review submission deadline: Feb 28)
forms.office.com
🎉 Announcing the first Open Science for Foundation Models (SCI-FM) Workshop at #ICLR2025! Join us in advancing transparency and reproducibility in AI through open foundation models. 🤝 Looking to contribute? Join our Program Committee: https://t.co/U9eIGY0Qai 🔍 Learn more at:
1
4
18
We have a few intern positions open in our ML team @ MSR Montreal. Come work with @Cote_Marc @kim__minseon @LucasPCaccia @mathe_per @ericxyuan on reasoning, interactive envs/coding, and LLM modularization. 🤯 @mathe_per and I will also be at #NeurIPS2024 so we can chat about this
0
9
58