Xinyi Wang

@XinyiWang98

Followers: 1K · Following: 1K · Media: 15 · Statuses: 144

Postdoc @PrincetonPLI. PhD @ucsbNLP. She/her.

Princeton, NJ
Joined November 2020
@XinyiWang98
Xinyi Wang
8 months
Happy to share our new preprint for my MIT-IBM internship project! https://t.co/EBDWBaDOIb In a controlled synthetic pretraining setup, we uncover a surprising twist in scaling laws: Bigger models can hurt reasoning.
2
15
125
@tanshawn
Shawn Tan
2 months
We're looking for 2 interns for Summer 2026 at the MIT-IBM Watson AI Lab Foundation Models Team. Work on RL environments, enterprise benchmarks, model architecture, efficient training and finetuning, and more! Apply here:
9
52
459
@PrincetonAInews
Princeton Laboratory for Artificial Intelligence
3 months
This fall, we're welcoming 8 new postdocs! From reinforcement learning to human-AI collaboration, their work will power forward our initiatives. Meet them and learn more about their research: https://t.co/tfZwpgj72i
3
13
131
@xunjian_yin
Xunjian Yin@NeurIPS
5 months
✈️Currently attending #ACL2025 in Vienna, Austria. Will present in person at Hall 4/5 (July 30, 10:30 - 12:00): 🚩Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement. Come and say hi!
4
5
21
@xunjian_yin
Xunjian Yin@NeurIPS
6 months
Thrilled Gödel Agent got noticed by Sakana AI & excited for their expansion! Accepted at ACL 2025, our 1st fully self-referential agent can read & modify its entire logic (even that logic). Done via recursion. Paper:
arxiv.org
The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks. However, existing agentic systems, whether based on...
@SakanaAILabs
Sakana AI
6 months
Introducing The Darwin Gödel Machine: AI that improves itself by rewriting its own code https://t.co/tBzlhoUMZO The Darwin Gödel Machine (DGM) is a self-improving agent that can modify its own code. Inspired by evolution, we maintain an expanding lineage of agent variants,
2
5
23
@fnruji316625
Mingyu_Jin19
7 months
Disentangling Memory and Reasoning in LLMs (ACL 2025 Main) We propose a new inference paradigm that separates memory from reasoning in LLMs using two simple tokens: ⟨memory⟩ and ⟨reason⟩. ✅ Improves accuracy ✅ Enhances interpretability 📄 Read: https://t.co/v3wBFw1ZEA #LLM
7
8
19
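As a rough illustration of the ⟨memory⟩ / ⟨reason⟩ idea above, the sketch below registers two such control tokens with a Hugging Face tokenizer. The gpt2 checkpoint, the ASCII token strings, and the prompt are assumptions for illustration, not the paper's actual setup.

# Minimal sketch (assumed setup, not the paper's code): register <memory> and
# <reason> as control tokens so the model can separate recall from reasoning.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Add the two control tokens so each is encoded as a single id.
tokenizer.add_special_tokens({"additional_special_tokens": ["<memory>", "<reason>"]})
model.resize_token_embeddings(len(tokenizer))

# A prompt that interleaves a recalled fact with an explicit reasoning step.
prompt = (
    "<memory> The Eiffel Tower is in Paris. "
    "<reason> The question asks where the Eiffel Tower is, so the answer is Paris."
)
print(tokenizer.tokenize(prompt))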
@XinyiWang98
Xinyi Wang
8 months
I'm presenting our mem vs. gen paper at ICLR on Saturday morning at poster #244. Come and check it out if you're interested!
@XinyiWang98
Xinyi Wang
1 year
🚀 New preprint 🚀 1/9🌟 How do large language models (LLMs) tradeoff memorization vs. generalization? Our new preprint explores this fundamental question from a data distribution perspective. Co-lead with @anton_iades https://t.co/ZYpFezjXYX 🧵👇
0
2
9
@XinyiWang98
Xinyi Wang
8 months
I’m attending #ICLR in Singapore! Also excited to share that I’m joining the Princeton Language and Intelligence Lab as a postdoc in July. In Fall 2026, I’ll be starting as an Assistant Professor at the University at Buffalo. I’ll be recruiting—feel free to reach out and chat!
23
12
162
@Dorialexander
Alexander Doria
8 months
A contrarian result I like a lot: smaller language models perform better on knowledge graphs than larger ones, as "overparameterization can impair reasoning due to excessive memorization".
25
81
817
@gm8xx8
𝚐𝔪𝟾𝚡𝚡𝟾
8 months
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning LLMs trained on synthetic multihop graphs show a U-shaped curve in reasoning: too small underfit, too large overfit. Overparameterization hurts edge completion due to memorization. A linear
5
20
94
@iScienceLuvr
Tanishq Abraham @ NeurIPS
8 months
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning On a synthetic multihop reasoning environment designed to closely replicate the structure and distribution of real-world large-scale knowledge graphs, the authors observe that
8
48
321
@XinyiWang98
Xinyi Wang
8 months
This project has taken a bit too long due to my job search. Many thanks to the support of my amazing collaborators @tanshawn @fnruji316625 @WilliamWangNLP @rpanda89 @Yikang_Shen !
0
0
3
@XinyiWang98
Xinyi Wang
8 months
We systematically vary graph size, structure, and training steps. Key findings: 1. Each graph has an optimal model size. 2. This size is stable across (large) training steps. 3. Optimal model size grows linearly with graph search entropy.
1
0
4
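One hedged reading of the "graph search entropy" mentioned in finding 3: the average branching entropy a traversal faces at each node. The toy triples and the uniform-choice assumption below are illustrative only, not the paper's definition.

# Sketch of one plausible proxy for "graph search entropy" (an assumption):
# the mean Shannon entropy of a uniform choice over each node's outgoing edges.
import math
from collections import defaultdict

edges = [  # toy knowledge graph as (head, relation, tail) triples
    ("paris", "capital_of", "france"),
    ("paris", "located_in", "europe"),
    ("france", "part_of", "europe"),
    ("berlin", "capital_of", "germany"),
]

out_edges = defaultdict(list)
for h, r, t in edges:
    out_edges[h].append((r, t))

def node_entropy(choices):
    # Entropy (in bits) of picking uniformly among a node's outgoing edges.
    n = len(choices)
    return math.log2(n) if n else 0.0

search_entropy = sum(node_entropy(c) for c in out_edges.values()) / len(out_edges)
print(f"approximate graph search entropy: {search_entropy:.3f} bits")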
@XinyiWang98
Xinyi Wang
8 months
Our task: complete missing edges of the knowledge graph that the language model is pretrained on, a proxy for the (implicit) multi-hop reasoning that naturally emerges from connected knowledge. We find that the reasoning loss follows a U-shaped curve as model size increases.
1
0
4
@XinyiWang98
Xinyi Wang
8 months
We pretrain LLMs from scratch on triples from synthetic knowledge graphs designed to mimic the real-world knowledge distribution and the multi-hop reasoning structure. This gives us a clean testbed to study reasoning.
1
0
3
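A minimal sketch of the kind of setup this thread describes, under stated assumptions: sample a small random knowledge graph, serialize its triples as pretraining text, and hold out some edges for the completion task. All entity/relation names and the serialization format are made up for illustration, not the paper's data pipeline.

# Sketch (assumed setup): synthetic KG triples as pretraining text plus
# held-out edges for the edge-completion evaluation.
import random

random.seed(0)
entities = [f"e{i}" for i in range(50)]
relations = [f"r{i}" for i in range(5)]

# Sample a random multi-relational graph.
triples = list({
    (random.choice(entities), random.choice(relations), random.choice(entities))
    for _ in range(300)
})
random.shuffle(triples)

held_out = triples[:30]   # edges to complete at evaluation time
train = triples[30:]      # edges seen during pretraining

# One possible serialization: one triple per line, space-separated tokens.
pretraining_corpus = "\n".join(f"{h} {r} {t}" for h, r, t in train)

# Evaluation prompts ask the model to fill in the missing tail entity.
eval_prompts = [(f"{h} {r}", t) for h, r, t in held_out]
print(pretraining_corpus.splitlines()[:3], eval_prompts[:2])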
@CongWei1230
Cong Wei
8 months
🚀Thrilled to introduce ☕️MoCha: Towards Movie-Grade Talking Character Synthesis Please unmute to hear the demo audio. ✨We defined a novel task: Talking Characters, which aims to generate character animations directly from Natural Language and Speech input. ✨We propose
18
58
220
@rpanda89
Rameswar Panda
9 months
🚨Hiring🚨 We are looking for research scientists and engineers to join IBM Research (Cambridge, Bangalore). We train large language models and do fundamental research on directions related to LLMs. Please DM me your CV and a brief introduction of yourself if you are interested!
11
55
617
@XinyiWang98
Xinyi Wang
10 months
🙌We are calling for submissions and recruiting reviewers for the Open Science for Foundation Models (SCI-FM) workshop at ICLR 2025! Submit your paper: https://t.co/mqFWsfLVKU (deadline: Feb 13) Register as a reviewer: https://t.co/QkuR3v8uf1 (review submission deadline: Feb 28)
forms.office.com
We're excited that you're interested in being a reviewer for our ICLR workshop on "Open Science for Foundation Models"! Your expertise and insights would be invaluable in shaping the discussions at...
@sivil_taram
Qian Liu
11 months
🎉 Announcing the first Open Science for Foundation Models (SCI-FM) Workshop at #ICLR2025! Join us in advancing transparency and reproducibility in AI through open foundation models. 🤝 Looking to contribute? Join our Program Committee: https://t.co/U9eIGY0Qai 🔍 Learn more at:
6
44
175
@murefil
Alessandro Sordoni
1 year
We have a few intern positions open in our ML team @ MSR Montreal, come work with @Cote_Marc @kim__minseon @LucasPCaccia @mathe_per @ericxyuan on reasoning, interactive envs/coding, and LLM modularization. 🤯 @mathe_per and I will also be at #NeurIPS2024 so we can chat about this
0
9
58