Jeremy Berman Profile
Jeremy Berman

@jerber888

Followers: 6K
Following: 2K
Media: 10
Statuses: 214

@humansand, prev post-training @reflection_ai, @ndea and co-founded https://t.co/aY50hNeJUD. yc w19.

SF / NYC
Joined August 2017
@jerber888
Jeremy Berman
22 days
I finally reached human-level performance (85%) on ARC-AGI v1 for under $10k and within 12 hours. I use the same multi-agent collaboration with evolutionary test-time compute, now powered by GPT-5 pro with lower parallelism.
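A minimal sketch of what "multi-agent collaboration with evolutionary test-time compute" might look like, assuming the broad shape described in the tweet: LLM agents propose candidate transformation programs, candidates are scored against the task's training pairs, and the fittest survivors are fed back for refinement. Every name here (Grid, fitness, evolve, the toy proposer) is illustrative, not the actual solver.

```python
# Illustrative sketch only: evolutionary test-time compute over candidate programs.
# None of these names come from the real system; the toy proposer stands in for LLM agents.
import random
from typing import Callable

Grid = list[list[int]]                 # an ARC grid of color indices
Program = Callable[[Grid], Grid]       # a candidate input->output transformation

def fitness(program: Program, train_pairs: list[tuple[Grid, Grid]]) -> float:
    """Fraction of the task's training pairs the candidate transforms correctly."""
    correct = 0
    for inp, out in train_pairs:
        try:
            if program(inp) == out:
                correct += 1
        except Exception:
            pass                       # a crashing candidate just scores zero on that pair
    return correct / len(train_pairs)

def evolve(propose: Callable[[list[Program]], list[Program]],
           train_pairs: list[tuple[Grid, Grid]],
           generations: int = 5, keep: int = 8) -> Program:
    """Each generation, proposers (e.g. LLM agents) refine the current survivors;
    only the fittest candidates are kept for the next round."""
    survivors: list[Program] = []
    for _ in range(generations):
        candidates = survivors + propose(survivors)
        candidates.sort(key=lambda p: fitness(p, train_pairs), reverse=True)
        survivors = candidates[:keep]
        if survivors and fitness(survivors[0], train_pairs) == 1.0:
            break                      # all training pairs solved; stop spending compute
    return survivors[0]

if __name__ == "__main__":
    # Toy task (swap 0s and 1s) with a random proposer standing in for LLM agents.
    train = [([[1, 0], [0, 1]], [[0, 1], [1, 0]])]
    def toy_proposer(parents: list[Program]) -> list[Program]:
        def flip(g: Grid) -> Grid: return [[1 - c for c in row] for row in g]
        def ident(g: Grid) -> Grid: return [row[:] for row in g]
        return [random.choice([flip, ident]) for _ in range(8)]
    print(fitness(evolve(toy_proposer, train), train))
```

In the real system this loop would presumably wrap GPT-5 pro calls running at lower parallelism, with the survivors' code and failure cases fed back into the agents' prompts; the sketch only shows the select-and-refine skeleton.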
@jerber888
Jeremy Berman
3 months
I'm back at the top of ARC-AGI with my new program. I use @grok 4 and multi-agent collaboration with evolutionary test-time compute
72
146
2K
@sama
Sam Altman
15 days
The rate of reduction in price per unit of intelligence has been the thing I've most consistently underestimated over the past couple of years. 300x in a year is nuts!
@chatgpt21
Chris
15 days
GPT-5.1 (Thinking High) is about 300 times cheaper per task than o3-preview (Low) while scoring only a few points lower on ARC-AGI-1. One year later, intelligence has gotten 300 times cheaper. This is why I can’t stand people who say “wahh, the models are too expensive”; it will become
700
558
6K
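For a rough sense of what a 300x annual drop implies (my arithmetic, not a figure from either tweet): if the decline accrued evenly over twelve months,

$300^{1/12} \approx 1.61$,

i.e. prices falling by roughly 38% month over month.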
@jerber888
Jeremy Berman
21 days
be the change you want to see in the world model
2
0
47
@jerber888
Jeremy Berman
21 days
This was a fun conversation:
@MLStreetTalk
Machine Learning Street Talk
22 days
Jeremy smashed it again! Don’t forget to check out the interview with him on MLST
2
3
26
@jerber888
Jeremy Berman
21 days
And the technical blog post is here:
0
0
18
@jerber888
Jeremy Berman
22 days
You can run this code yourself; it’s open source: https://t.co/QnGjsN98ji. And the Kaggle notebook:
kaggle.com: Explore and run machine learning code with Kaggle Notebooks | Using data from ARC Prize 2025
2
2
76
@jerber888
Jeremy Berman
22 days
GPT-5 pro is the best reasoning model today. It thinks coherently for hours and hours. My agent coordination logic is likely an intermediate step before the models learn to do this type of long horizon coordination on their own, end to end. It’s a hard problem but I wouldn’t bet
2
3
85
@jerber888
Jeremy Berman
22 days
I ran it on a random sample of 100 tasks from the 2024 eval set. It got 88/100 and averaged $27 per task. The score still needs to be verified on the hidden set by the @arcprize, but in my past submissions, the hidden set subtracts a few percent from the score and adds a few
4
1
64
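Back-of-the-envelope totals for the sampled run (arithmetic from the numbers above, not additional reported results):

$88/100 = 88\%$ accuracy, and $100 \times \$27 = \$2{,}700$ total for the 100-task sample.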
@jerber888
Jeremy Berman
26 days
Never a dull moment talking language models with Alex. Really excited to get to do it more often
@Skiminok
🇺🇦 Alex Polozov
26 days
🎉 Next week, I am excited to join @reflection_ai as a Member of Technical Staff to help build the open intelligence ecosystem of the Western world. It's the most exciting opportunity to help software builders in our time, and will shape many years of AI Engineering in the
0
0
7
@jerber888
Jeremy Berman
28 days
Teaching language models to have taste isn't just about making them better at writing or making jokes. Taste is everything. It's what leads to scientific discovery. In an infinite sea of things to reason about, taste is what guides you to reason about the right things — to
3
0
14
@jerber888
Jeremy Berman
1 month
This was fun and Jack’s speech was great too
@MindsAI_Jack
Jack Cole
1 month
Recently had the opportunity to talk to CS grad students and faculty at Mizzou (University of Missouri) about our approach to ARC. Jeremy Berman spoke first about his public leaderboard SoTA approach, which was great. @jerber888 @arcprize https://t.co/z5IbN8D5ou
0
0
9
@jerber888
Jeremy Berman
1 month
we use our continuous human brains to build symbolic computer systems to build continuous computer brains to build symbolic computer systems
1
0
19
@jerber888
Jeremy Berman
1 month
To discover new knowledge, language models must learn to be creative. This can’t be SFT’d. It’s a muscle that must be grown from on-policy reinforcement.
0
0
4
@jerber888
Jeremy Berman
1 month
Creativity is having uncommon taste and the will to use it. Discovery comes from being creative and good at reasoning.
2
0
8
@jerber888
Jeremy Berman
1 month
Solving hallucinations is a bigger deal than people think. Not just because solving it is useful, but because it demonstrates that we can train a model to overcome the instincts of pre-training
5
1
29
@jerber888
Jeremy Berman
1 month
A model using off-policy trained weights to respond is like a human under a spell
0
0
1
@jerber888
Jeremy Berman
1 month
Taste has been elusive for language models but I think we'll crack it
0
0
5
@jerber888
Jeremy Berman
1 month
Scientific discovery comes from the ability to reason and taste for what to reason about
2
0
8
@jerber888
Jeremy Berman
1 month
To unlock the secrets of the universe, we just have to make interesting, true, and unlikely tokens more likely
1
0
20
@jerber888
Jeremy Berman
1 month
Special relativity only became in-distribution for Einstein once he had a large context priming it. With each new token, special relativity inched closer to being in-distribution
0
0
3