James Zou
@james_y_zou
Followers
18K
Following
2K
Media
290
Statuses
2K
@Stanford professor. Chan-Zuckerberg investigator. Sloan Fellow. Overton Prize. @togethercompute. AI for science + health.
Palo Alto, CA
Joined August 2016
Excited to share new work on LLMs, agents and AI for science at #NeurIPS this week! Thanks to my awesome students + collaborators. Look forward to meeting old and new friends in San Diego! Let me know if you want to chat!
4
13
124
This upcoming week centers on the Federal Reserve's final policy meeting of the year, with the interest rate decision due on Wednesday. Per CME's FedWatch, the chance of a 25bps rate cut is now 86.2%. Beyond the rate decision itself, investors will be following Jerome Powell's
7
36
441
"AI needs to recognize and acknowledge false beliefs and misconceptions. That's still a big gap in current models, even the most recent ones," says @StanfordHAI faculty affiliate @james_y_zou on AI's current blind spots: https://t.co/U31R3PAVQm
9
10
61
Can LLMs help us interpret genetic variants? Check out our #NeurIPS2025 paper on CGBench; @oq_35 is presenting it today!
Excited to share our new paper: CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research. Can AI truly understand scientific papers? We explore how LLMs interpret real biomedical literature, not just multiple-choice questions.
1
6
51
There's tremendous interest in multi-agent systems now, but how to optimize such systems is a big challenge. #Sirius is a powerful framework that enables teams of multiple AI agents to self-improve. Check out @WanjiaZhao1203's #NeurIPS2025 poster + paper!
Introducing #SIRIUS: A self-improving multi-agent LLM framework that learns from successful interactions and refines failed trajectories, enhancing college-level reasoning and competitive negotiations. Preprint: https://t.co/xthe4kDiAD Code: https://t.co/jZuIg02OHc 1/N
7
25
163
You haven't actually graduated until you've paid off your student loans.
7
6
48
I am at #NeurIPS2025 from 12/2 to 12/7! Looking forward to meeting old and new friends! Come check out our poster: Wed, Dec 3, 11AM - 2PM, Exhibit Hall C,D,E #5406; Fri, Dec 5, 11AM - 2PM, Exhibit Hall C,D,E #1712; Sat, Dec 6, 8:00 AM - 5:00 PM, Upper Level Ballroom 6A
2
5
65
Can we predict the spatial effects of single-cell perturbations? Excited to share our preprint introducing SpatialProp, which computationally propagates single-cell transcriptomic perturbations across tissues, along with key frameworks for evaluating spatial perturbation models.
4
24
139
Introducing Agentic Context Engineering (ACE): a framework for self-improving language models through continuously evolving contexts (not weights). High-Performing: +10.6% on agent tasks, +8.6% on finance. Ultra-Efficient: -86.9% latency, -83.6% dollar cost
23
115
752
As AI takes on agent roles in critical fields, reasoning failures raise risks. New studies from @james_y_zou of @StanfordMed and @ylqzd2011 of @HKUniversity show how reasoning goes off the rails.
spectrum.ieee.org
As AI takes on agent roles, flawed reasoning raises risks
0
7
13
Learning to learn by #LLM feedback, by #LLM feedback. Many AI systems are optimized by LM feedback (e.g., TextGrad, DSPy, etc.). Our #NeurIPS2025 paper introduces metaTextGrad: a powerful way to optimize all these LM optimizers → better agents.
2
11
71
Thrilled to share that our paper ExGra-Med [1] has been accepted to NeurIPS 2025! (one of three other ones accepted this year) Over the past year, my collaborators and I have been exploring a fundamental limitation of
2
2
8
AI is transforming scientific discovery at incredible speed. Join us at NeurIPS for our AI4Science Panel with Rafael Gómez-Bombarelli, @jeffclune, and @james_y_zou as they discuss what the future of AI-enabled science might look like. Link to signup below.
3
12
77
3/ Amazing work by @Kevin_GuoweiXu and @mertyuksekgonul leading this project! Check out @Kevin_GuoweiXu's excellent thread for all the goods!
Introducing #metaTextGrad: a meta-optimization framework built on #TextGrad, designed to improve existing LLM optimizers by aligning them more closely with specific tasks. NeurIPS 2025 paper: https://t.co/M4Wj7TVIy4 Code: https://t.co/9E0M1VrG35 Slides:
0
2
7
2/ Existing LM optimizers are broad and generic. #metaTextGrad automatically adapts them to specific tasks, greatly improving performance and efficiency. #NeurIPS2025 paper: https://t.co/DvluDROBdm Code: https://t.co/mtkb3HGmNg Slides: https://t.co/it94UrzIYf
1
3
12
Accepted at #NeurIPS2025! Look forward to meeting everyone in SD!
Introducing #SIRIUS: A self-improving multi-agent LLM framework that learns from successful interactions and refines failed trajectories, enhancing college-level reasoning and competitive negotiations. Preprint: https://t.co/xthe4kDiAD Code: https://t.co/jZuIg02OHc 1/N
7
19
184
Solving inequality proofs with LLMs is accepted as a #neurips2025 spotlight paper! Mathematical analysis often involves deriving bounds or inequalities. Here we investigate using LLMs to derive tight bounds.
Do LLMs truly understand math proofs, or just guess? Our new study on #IneqMath dives deep into Olympiad-level inequality proofs & reveals a critical gap: LLMs are often good at finding answers, but struggle with rigorous, sound proofs. https://t.co/h5f8Qv8Xlv To tackle
1
12
72
A causal DAG is a really neat approach to generating high-quality reasoning process rewards at scale, without relying on an LLM judge! Great job @WanjiaZhao1203 @AquaHorseM
@ShiJingzhe41415 w/ awesome collaborators
1/N Introducing PRISM-Physics, a process-level, rule-based benchmark for complex physics reasoning. Each solution is modeled as a Directed Acyclic Graph of formulas, capturing causal relations between steps. A rule-based symbolic equivalence checker ensures consistent evaluation
3
13
65