Keyon Vafa @keyonV X Profile

Keyon Vafa

@keyonV

Followers

5K

Following

2K

Media

176

Statuses

1K

Postdoctoral fellow at @Harvard_Data | Former computer science PhD with @Blei_Lab at @Columbia University | Researching AI + world models

https://t.co/PqGP9OgULQ

Joined August 2011


Keyon Vafa

@keyonV

4 months

Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵

213

1K

7K

Keyon Vafa

@keyonV

10 hours

The paper has more empirical and theoretical results showing why world models would improve with better predictions of future latents. Paper: https://t.co/704LvJXin3

0

7

Keyon Vafa

@keyonV

10 hours

Interesting paper from MSR (@jayden_teoh_ @JohnCLangford + others) finds a simple change gets better world models in transformers: predict future latent states in addition to next tokens (to encourage parsimonious representations). Better maps of New York and world model metrics

1

3

23

ColumbiaCompSci

@ColumbiaCompSci

29 days

Prof David Blei (@blei_lab) is looking for #PhD students interested in machine learning and Bayesian statistics. To find out more about him - https://t.co/9jg5MtwsBx. For info on our #computerscience PhD program https://t.co/Mfln4FF1dp. The deadline to apply is December 15.

0

3

11

Harvard Magazine

@HarvardMagazine

14 days

Four faculty members—molecular biologist Catherine Dulac, constitutional scholar Noah Feldman, economic historian Claudia Goldin, and theoretical physicist Cumrun Vafa—were named University Professors, Harvard’s highest distinction, on Wednesday. #Harvard https://t.co/pVeyQDxFu0

harvardmagazine.com

Catherine Dulac, Noah Feldman, Claudia Goldin, and Cumrun Vafa receive the University’s highest faculty distinction.

1

7

35

Sasha Rush

@srush_nlp

14 days

Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://t.co/DX9bbalx0B Excited for the potential of building specialized models to help in critical domains.

56

75

792

Valerio Pepe

@ValerPepe

16 days

I'm really excited about this work (two years in the making!). We look at how LLMs seek out and integrate information and find that even GPT-5-tier models are bad at this, meaning we can use Bayesian inference to uplift weak LMs and beat them... at 1% of the cost 👀

Gabe Grand

@gabe_grand

17 days

Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost. Paper, code & demos: https://t.co/lV76HRKR3d Here's what we learned about building rational information-seeking

0

2

14

TTIC

@TTIC_Connect

28 days

Wednesday, October 22nd at 11am CT: TTIC's Young Researcher Seminar Series presents Keyon Vafa (@keyonV) of @harvard_data with a talk titled "Evaluating the Implicit World Models of Generative Models." Please join us in Room 530, 5th floor.

0

1

5

Lindsey Raymond

@LindseyRRaymond

1 month

I’m hiring a pre-doc! Come work with me on how AI is changing the labor market and how algorithms impact markets. Non-econ backgrounds welcome. Application details below – excited to collaborate!
Start: Summer 2026
Deadline: Nov 1, 2025
https://t.co/2joGp5czWN @predoc_org

17

92

397

Sasha Rush

@srush_nlp

2 months

Reminder to go watch this video from @keyonV. He does a great job explaining this research area in a short period of time. Even if you're not into this topic, the methodological / proof challenges (does a blackbox have a model?) are quite interesting. https://t.co/mZGeWZWZBx

3

13

108

Ben Scharfstein

@benscharfstein

2 months

One of the most fascinating research agendas I’ve seen. Colloquially people using LLMs refer to them having world models because they seem to generalize well on many tasks. Keyon and his collaborators show they don’t in ways that are nuanced but important for practitioners.

Keyon Vafa

@keyonV

2 months

Here's a video I made that goes over methods we've worked on for evaluating world models. Thank you @srush_nlp for the opportunity!

0

1

14

Keyon Vafa

@keyonV

2 months

Here's a video I made that goes over methods we've worked on for evaluating world models. Thank you @srush_nlp for the opportunity!

Sasha Rush

@srush_nlp

2 months

How can we evaluate whether LLMs and other generative models understand the world? New guest video from Keyon Vafa (@keyonV) on methods for evaluating world models.

1

4

49

Sasha Rush

@srush_nlp

2 months

How can we evaluate whether LLMs and other generative models understand the world? New guest video from Keyon Vafa (@keyonV) on methods for evaluating world models.

2

20

145

Keyon Vafa

@keyonV

2 months

Great @QuantaMagazine article about world models that covers some of our recent research

Quanta Magazine

@QuantaMagazine

2 months

The wide-ranging abilities of large language models like ChatGPT can give users the (mistaken) impression that AI understands our world. A scaled-down world model is a long-sought and still unrealized goal. @johnpavlus explains:

0

7

MIT LIDS

@MITLIDS

3 months

Can #LLMs grasp the real world? MIT & Harvard researchers (@m_sendhil, @asheshrambachan, @petergchang, @keyonV) propose a new way to test how predictive AI applies knowledge across domains. Learn more: https://t.co/npsSXgyHyT

0

5

Jiayi Geng

@JiayiiGeng

3 months

📢 We're thrilled to announce the CMU AI for Science Workshop on Sept 12 at CUC-MPW! Featuring an amazing lineup of speakers:
- Akari Asai (AI2/CMU)
- Gabe Gomes (CMU)
- Chenglei Si (Stanford)
- Keyon Vafa (Harvard)
Join us on campus, submit your poster & register here:

cmu-ai-for-science-workshop.github.io

We are hosting AI for Science Workshop at Carnegie Mellon University, Pittsburgh, PA, USA on September 12, 2025.

1

15

128

Keyon Vafa

@keyonV

3 months

Work with Emma!

Emma Pierson

@2plus2make5

3 months

🚨 New postdoc position in our lab @Berkeley_EECS! 🚨 (please retweet + share with relevant candidates) We seek applicants with experience in language modeling who are excited about high-impact applications in the health and social sciences! More info in thread 1/3

0

5

Alex Imas

@alexolegimas

3 months

Key question for incorporating AI into firms: can AI recover signal that human managers miss? @brian_jabarian’s (w @Henkel_JLuca) JMP says yes! Huge field experiment incorporating AI into interview process has a huge effect on who is selected & positive effect on performance

Brian Jabarian

@brian_jabarian

3 months

@Henkel_JLuca @Teleperformance 3/ Key Results: In contrast to the forecast of professional recruiters, AI-led interviews lead to:
• +12% more job offers
• +18% more starters
• +17% higher retention after 1 month

2

7

29

henry

@arithmoquine

3 months

new post. there's a lot in it. i suggest you check it out

71

184

3K

Raj Movva

@rajivmovva

3 months

📢NEW POSITION PAPER: Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts Despite recent results, SAEs aren't dead! They can still be useful to mech interp, and also much more broadly: across FAccT, computational social science, and ML4H. 🧵

2

65

362

Katie Collins

@katie_m_collins

4 months

How do people reason so flexibly about new problems, bringing to bear globally-relevant knowledge while staying locally-consistent? Can we engineer a system that can synthesize bespoke world models (expressed as probabilistic programs) on-the-fly?

2

21

94