
Saurabh Shah
@saurabh_shah2
Followers: 2K · Following: 10K · Media: 102 · Statuses: 1K
training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈⬛enjoyer of cats 🐈 and mountains🏔️ he/him
Seattle, WA
Joined December 2022
The marks of a good benchmark (IMO):
- measures something that makes a meaningful difference in user experience of the model
- you'd expect frontier models to nail it
- frontier models don't nail it

IFBench hits all 3!! Great work by @valentina__py and some of my teammates :)
Introducing IFBench, a benchmark to measure how well AI models follow new, challenging, and diverse verifiable instructions. Top models like Gemini 2.5 Pro or Claude 4 Sonnet are only able to score up to 50%, presenting an open frontier for post-training. 🧵
oh, also shoutout Rohan for heavily inspiring this post. When I was down in the bay I asked him what he thought was the next big thing to scale and he just said "simulation" without hesitating. Idk if you all know this but this guy is legit.
bro got simulationpilled. what's your excuse for not yet taking the simulationpill, anon? if you want to understand what kinds of problems will be solved by RL in the next 2 years, check out his essay.
New post! Reinforcement learning, deterministic chaos, and scaling simulation. I think this is my favorite post so far, do check it out! Shoutout @robertghrist for teaching an awesome course on dynamical systems. I took it 4 years ago but I think about these ideas all the time
something for everyone in the new office - come hang.
A new open pre-trainer joins the fight! Closed labs fight against each other. Open labs fight together. Very cool release from Arcee!!
Our customers needed a better base model <10B parameters. We spent the last 5 months building one. I'm delighted to share a preview of our first Arcee Foundation Model: AFM-4.5B-Preview.
super cool work from the goat @michaelryan207 on personalization for LMs. Personalization is a (maybe even 'the') core problem in how humans will interact with language models going forward. Michael is a (maybe even 'the') goat researcher + presenter. This is very cool work!
New #ACL2025NLP Paper! 🎉 Curious what AI thinks about YOU? We interact with AI every day, offering all kinds of feedback, both implicit ✏️ and explicit 👍. What if we used this feedback to personalize your AI assistant to you? Introducing SynthesizeMe! An approach for
RT @sama: also, here is one part that people not interested in the rest of the post might still be interested in:
So @finbarrtimbers wrote this simple script to give Cursor logs of any experiment and now I'm officially a middle manager 🤠
Holy shit @dwarkesh_sp is only 24?? Absolute goat. (Just listened to the Tyler Cowen interview, I particularly liked this one.) I turn 24 in a month. Who wants to start a pod?
Prediction: there will be a paper (or OAI blog post) with a graph showing that as you scale simulation compute + reduce the sim2real gap, you get scaling laws similar to the ones we're seeing now for pre-training or inference-time compute. (This is not a new idea.) Snippet from @natolambert