
Azalia Mirhoseini
@Azaliamirh
Followers: 14K · Following: 2K · Media: 35 · Statuses: 350
Asst. Prof. of CS at Stanford, Google DeepMind. Prev: Anthropic, Google Brain. Co-Creator of MoEs, AlphaChip, Test Time Scaling Laws.
Stanford, CA
Joined May 2013
Excited to release SWiRL: a synthetic data generation and multi-step RL approach for reasoning and tool use! With SWiRL, the model’s capability generalizes to new tasks and tools. For example, a model trained to use a retrieval tool to solve multi-hop knowledge-intensive …
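A minimal Python sketch of the multi-step tool-use rollout described in that post (hypothetical, not the released SWiRL code): the model alternates between retrieval-tool calls and reasoning steps, and the collected trajectories could later be filtered and used as synthetic training data for RL. The functions generate() and retrieve() are toy stand-ins for an LLM call and a search-index lookup.

```python
# Hypothetical sketch, not the released SWiRL code: a model alternates between
# reasoning and calls to a retrieval tool; the collected trajectory can later
# be filtered and used as synthetic training data for multi-step RL.

def generate(context: str) -> str:
    # Toy stand-in for an LLM call: issue one search, then answer from the result.
    if "OBSERVATION:" not in context:
        return "SEARCH: capital of France"
    return "ANSWER: Paris"

def retrieve(query: str) -> str:
    # Toy stand-in for the retrieval tool; a real system would query a search index.
    return "OBSERVATION: Paris is the capital of France."

def rollout(question: str, max_steps: int = 5) -> list[dict]:
    """Collect one multi-step trajectory of tool calls that ends in an answer."""
    trajectory, context = [], question
    for _ in range(max_steps):
        step = generate(context)
        if step.startswith("SEARCH:"):
            observation = retrieve(step[len("SEARCH:"):].strip())
            trajectory.append({"action": step, "observation": observation})
            context += "\n" + step + "\n" + observation
        else:
            trajectory.append({"action": step})
            break
    return trajectory

print(rollout("What is the capital of France?"))
```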
So happy to see the strong interest in KernelBench, our AI for AI acceleration benchmark! The team has released some updates today:
KernelBench v0.1 is out, featuring:
- A guideline on analyzing the validity of results and ruling out physically impossible performance claims.
- Support for randomized testing beyond normal distributions.
- Fixed problem sizes and improved numerics.
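As a rough illustration of the randomized validity checks mentioned above, here is a hypothetical sketch (not KernelBench's actual harness) that compares a candidate kernel against a reference on inputs drawn from several distributions, not just the standard normal; all names, shapes, and tolerances are assumptions.

```python
# Hypothetical sketch, not KernelBench's actual harness: check a candidate
# kernel against a reference implementation on randomized inputs drawn from
# several distributions, not just the standard normal.
import torch

def check_correctness(candidate_fn, reference_fn, shape=(1024, 1024),
                      n_trials=9, atol=1e-4, rtol=1e-3) -> bool:
    samplers = [
        lambda s: torch.randn(s),          # standard normal
        lambda s: torch.rand(s) * 10 - 5,  # uniform in [-5, 5)
        lambda s: torch.randn(s) * 1e3,    # large-magnitude values
    ]
    for trial in range(n_trials):
        x = samplers[trial % len(samplers)](shape)
        if not torch.allclose(candidate_fn(x), reference_fn(x), atol=atol, rtol=rtol):
            return False
    return True

# Example: a naive softmax matches the reference on well-behaved inputs but
# overflows on the large-magnitude samples, so the randomized check rejects it.
reference = lambda x: torch.softmax(x, dim=-1)
naive = lambda x: torch.exp(x) / torch.exp(x).sum(dim=-1, keepdim=True)
print(check_correctness(naive, reference))  # False
```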
RT @lmthang: Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the Inte…
RT @quocleix: Excited to share that a scaled up version of Gemini DeepThink achieves gold-medal standard at the International Mathematical…
deepmind.google
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world’s most prestigious competition for young...
RT @RylanSchaeffer: If you want to learn about the power (laws) of large language monkeys (and get a free banana 🍌), come to our poster at…
RT @simonguozirui: At #ICML2025 in Vancouver 🇨🇦 this week, presenting some work from my first year at Stanford! Come find me at posters or…
RT @chrmanning: I’ve joined @aixventureshq as a General Partner, working on investing in deep AI startups. Looking forward to working with…
wsj.com
Christopher Manning, one of the most cited researchers in the field of natural language processing and a former director of the Stanford AI Lab, has taken a leave of absence from Stanford University...
RT @CaiaCostello: So excited to speak tomorrow about Think Prune Train at LAD'25 session on Reasoning and Self Improvement!
iclad.ai
RT @oscrhong: Interesting tidbit from prof @chrmanning: The first mention of “Large Language Model” comes from a 1998 NLP workshop in Taiwan!…
RT @iScienceLuvr: Shrinking the Generation-Verification Gap with Weak Verifiers. "we introduce Weaver, a framework for designing a strong v…
RT @ajratner: Very exciting work on using weak supervision for RL - closing the “generation-verification gap”!! Once again - principled appr…
See @JonSaadFalcon's post for more details: Paper, Blog, Datasets and Models.
huggingface.co
How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning …
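A minimal sketch of the verifier-combination idea behind Weaver, under the assumption of pre-set weights: each candidate answer is scored by several weak verifiers (stand-ins for a reward model and an LM judge) and the highest weighted score wins. Weaver's actual procedure estimates how to weight the verifiers with weak supervision, as the retweets above mention; every name, weight, and toy verifier here is a hypothetical placeholder.

```python
# Hypothetical sketch of the verifier-ensemble pattern, not Weaver's actual
# implementation: score each candidate answer with several weak verifiers and
# return the candidate with the highest weighted score.

def select_answer(question, candidates, verifiers, weights):
    """Pick the candidate whose weighted verifier score is highest."""
    def combined_score(answer):
        # Each verifier maps (question, answer) to a score in [0, 1].
        return sum(w * v(question, answer) for v, w in zip(verifiers, weights))
    return max(candidates, key=combined_score)

# Toy weak verifiers standing in for a reward model and an LM judge; the
# weights are made up, whereas Weaver estimates them with weak supervision.
reward_model = lambda q, a: 0.9 if "Paris" in a else 0.2
lm_judge = lambda q, a: 0.8 if a.endswith(".") else 0.5
print(select_answer("What is the capital of France?",
                    ["Paris.", "Lyon."],
                    verifiers=[reward_model, lm_judge],
                    weights=[0.6, 0.4]))  # -> Paris.
```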
Congratulations, @CaiaCostello and Adrian!
So proud of @CaiaCostello who graduated with her CS master's from @stanfordeng 🎓 today! Lucky to have helped her with the TPT project along with @annadgoldie and @Azaliamirh. This is from her presenting the TPT poster at the ICLR 🇸🇬 workshop!
Congratulations, Dr. Goldie! @annadgoldie
Huge congratulations to @annadgoldie on receiving her @Stanford PhD today! It’s been a great journey!
RT @soumithchintala: This is a proper Vibe-coding setup for GPU programmers, and can result in getting surprisingly far! I honestly think…
Go, @realSharonZhou and team! Congrats to @LisaSu and AMD on such an amazing addition!
Welcome aboard @realSharonZhou! So happy to have you and the team joining us as we bring @AIatAMD to the world!!!
RT @teortaxesTex: I like this idea very much and have long advocated for something like this. Synthetically enriched «KV prefix» is a natur…