Saurabh Shah

@saurabh_shah2

Followers
2K
Following
10K
Media
102
Statuses
1K

training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈‍⬛enjoyer of cats 🐈 and mountains🏔️he/him

Seattle, WA
Joined December 2022
@saurabh_shah2
Saurabh Shah
15 hours
Oh, I hit 100 subscribers on the learning curve recently. Thanks to everyone who subscribed! Writing has been really rewarding and I’m excited for posts to come :)
0
0
6
@saurabh_shah2
Saurabh Shah
17 hours
Yep
@natolambert
Nathan Lambert
22 hours
Run in the mountains, soak in the sun, listen to the birds
0
0
0
@saurabh_shah2
Saurabh Shah
3 days
The marks of a good benchmark (IMO):
- measures something that makes a meaningful difference in user experience of the model
- you'd expect frontier models to nail it
- frontier models don't nail it
IFBench hits all 3!! Great work by @valentina__py and some of my teammates :)
@allen_ai
Ai2
3 days
Introducing IFBench, a benchmark to measure how well AI models follow new, challenging, and diverse verifiable instructions. Top models like Gemini 2.5 Pro or Claude 4 Sonnet are only able to score up to 50%, presenting an open frontier for post-training. 🧵
0
0
11
@saurabh_shah2
Saurabh Shah
3 days
Wow he’s gonna nail the culture fit interview.
@im_roy_lee
Roy
4 days
you would not BELIEVE who we had a second round with yesterday.
0
0
10
@saurabh_shah2
Saurabh Shah
8 days
oh, also shoutout Rohan for heavily inspiring this post. When I was down in the bay I asked him what he thought was the next big thing to scale and he just said "simulation" without hesitating. Idk if you all know this but this guy is legit.
@khoomeik
Rohan Pandey
8 days
bro got simulationpilled. what’s your excuse for not yet taking the simulationpill, anon? if you want to understand what kinds of problems will be solved by RL in the next 2 years, check out his essay.
2
2
18
@saurabh_shah2
Saurabh Shah
8 days
You can find the post here:
0
0
5
@saurabh_shah2
Saurabh Shah
8 days
New post! Reinforcement learning, deterministic chaos, and scaling simulation. I think this is my favorite post so far, do check it out! Shoutout @robertghrist for teaching an awesome course on dynamic systems. I took it 4 years ago but I think about these ideas all the time
2
2
55
@saurabh_shah2
Saurabh Shah
11 days
Rohan these days: “Yo I just really believe in elements. You’re still dealing with tokens? That’s cool. I’m dealing with god’s tokens: atoms. Idiot.”
@khoomeik
Rohan Pandey
12 days
putting the silicon (and the barium, oxygen, copper, and yttrium) back in silicon valley.
0
1
8
@saurabh_shah2
Saurabh Shah
13 days
something for everyone in the new office - come hang.
@soldni
Luca Soldaini 🎀
13 days
new @allen_ai office!
them: beautiful lake views
me: 2.5 Gbps connection at every desk
1
0
8
@saurabh_shah2
Saurabh Shah
17 days
A new open pre-trainer joins the fight! Closed labs fight against each other. Open labs fight together. Very cool release from Arcee!!
@LucasAtkins7
Lucas Atkins
18 days
Our customers needed a better base model <10B parameters. We spent the last 5 months building one. I'm delighted to share a preview of our first Arcee Foundation Model: AFM-4.5B-Preview.
2
1
33
@saurabh_shah2
Saurabh Shah
20 days
0
0
1
@saurabh_shah2
Saurabh Shah
20 days
New blog post! Neural networks are ant colonies: in which I talk about the power of composition
1
0
7
@saurabh_shah2
Saurabh Shah
24 days
fun to organically discover which apps use google cloud.
0
0
2
@saurabh_shah2
Saurabh Shah
25 days
super cool work from the goat @michaelryan207 on personalization for LMs. Personalization is a (maybe even 'the') core problem in how humans will interact with language models going forward. Michael is a (maybe even 'the') goat researcher + presenter. This is very cool work!
@michaelryan207
Michael Ryan
26 days
New #ACL2025NLP Paper! 🎉 Curious what AI thinks about YOU? We interact with AI every day, offering all kinds of feedback, both implicit ✏️ and explicit 👍. What if we used this feedback to personalize your AI assistant to you? Introducing SynthesizeMe! An approach for
1
1
6
@saurabh_shah2
Saurabh Shah
26 days
RT @sama: also, here is one part that people not interested in the rest of the post might still be interested in:
0
458
0
@saurabh_shah2
Saurabh Shah
27 days
You know what, fuck it. I've decided that they *are* gonna make it. Anyone know who I need to pay to invest $200 into OpenAI?
0
0
7
@saurabh_shah2
Saurabh Shah
27 days
So @finbarrtimbers wrote this simple script to give cursor logs of any experiment and now I'm officially a middle manager 🤠
2
0
7
@saurabh_shah2
Saurabh Shah
28 days
Whoa nice latte! Here’s what I did this weekend
2
0
88
@saurabh_shah2
Saurabh Shah
30 days
Holy shit @dwarkesh_sp is only 24?? Absolute goat. (Just listened to the Tyler Cowen interview, I particularly liked this one.) I turn 24 in a month. Who wants to start a pod?
2
0
21
@saurabh_shah2
Saurabh Shah
30 days
Prediction: there will be a paper (or OAI blog post) with a graph showing as you scale simulation compute + reduce sim2real gap, you get similar scaling laws as pre-training or inference time compute we're seeing now. (This is not a new idea). Snippet from @natolambert
4
2
38