Jeremy Berman Profile
Jeremy Berman

@jerber888

Followers: 6K
Following: 2K
Media: 10
Statuses: 214

@humansand, prev post-training @reflection_ai, @ndea and co-founded https://t.co/aY50hNeJUD. yc w19.

SF / NYC
Joined August 2017
@jerber888
Jeremy Berman
22 days
I finally reached human-level performance (85%) on ARC-AGI v1 for under $10k and within 12 hours. I use the same multi-agent collaboration with evolutionary test-time compute, now powered by GPT-5 pro with lower parallelism.
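A minimal sketch of what "multi-agent collaboration with evolutionary test-time compute" might look like, assuming the broad shape described in the tweet: LLM agents propose candidate transformation programs, candidates are scored against the task's training pairs, and the fittest survivors are fed back for refinement. Every name here (Grid, fitness, evolve, the toy proposer) is illustrative, not the actual solver.

```python
# Illustrative sketch only: evolutionary test-time compute over candidate programs.
# None of these names come from the real system; the toy proposer stands in for LLM agents.
import random
from typing import Callable

Grid = list[list[int]]                 # an ARC grid of color indices
Program = Callable[[Grid], Grid]       # a candidate input->output transformation

def fitness(program: Program, train_pairs: list[tuple[Grid, Grid]]) -> float:
    """Fraction of the task's training pairs the candidate transforms correctly."""
    correct = 0
    for inp, out in train_pairs:
        try:
            if program(inp) == out:
                correct += 1
        except Exception:
            pass                       # a crashing candidate just scores zero on that pair
    return correct / len(train_pairs)

def evolve(propose: Callable[[list[Program]], list[Program]],
           train_pairs: list[tuple[Grid, Grid]],
           generations: int = 5, keep: int = 8) -> Program:
    """Each generation, proposers (e.g. LLM agents) refine the current survivors;
    only the fittest candidates are kept for the next round."""
    survivors: list[Program] = []
    for _ in range(generations):
        candidates = survivors + propose(survivors)
        candidates.sort(key=lambda p: fitness(p, train_pairs), reverse=True)
        survivors = candidates[:keep]
        if survivors and fitness(survivors[0], train_pairs) == 1.0:
            break                      # all training pairs solved; stop spending compute
    return survivors[0]

if __name__ == "__main__":
    # Toy task (swap 0s and 1s) with a random proposer standing in for LLM agents.
    train = [([[1, 0], [0, 1]], [[0, 1], [1, 0]])]
    def toy_proposer(parents: list[Program]) -> list[Program]:
        def flip(g: Grid) -> Grid: return [[1 - c for c in row] for row in g]
        def ident(g: Grid) -> Grid: return [row[:] for row in g]
        return [random.choice([flip, ident]) for _ in range(8)]
    print(fitness(evolve(toy_proposer, train), train))
```

In the real system this loop would presumably wrap GPT-5 pro calls running at lower parallelism, with the survivors' code and failure cases fed back into the agents' prompts; the sketch only shows the select-and-refine skeleton.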
@jerber888
Jeremy Berman
3 months
I'm back at the top of ARC-AGI with my new program. I use @grok 4 and multi-agent collaboration with evolutionary test-time compute
72
146
2K
@sama
Sam Altman
15 days
The rate of reduction in price per unit of intelligence has been the thing I've most consistently underestimated over the past couple of years. 300x in a year is nuts!
@chatgpt21
Chris
15 days
GPT-5.1 (Thinking High) is about 300 times cheaper per task than o3-preview (Low) while scoring only a few points lower on ARC-AGI-1. One year later, intelligence has gotten 300 times cheaper. This is why I can’t stand people who say “wahh, the models are too expensive”; it will become
700
558
6K
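For a rough sense of what a 300x annual drop implies (my arithmetic, not a figure from either tweet): if the decline accrued evenly over twelve months,

$300^{1/12} \approx 1.61$,

i.e. prices falling by roughly 38% month over month.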
@jerber888
Jeremy Berman
21 days
be the change you want to see in the world model
2
0
47
@jerber888
Jeremy Berman
21 days
This was a fun conversation:
@MLStreetTalk
Machine Learning Street Talk
22 days
Jeremy smashed it again! Don’t forget to check out the interview with him on MLST
2
3
26
@jerber888
Jeremy Berman
21 days
And the technical blog post is here:
0
0
18
@jerber888
Jeremy Berman
22 days
You can run this code yourself; it’s open source: https://t.co/QnGjsN98ji. And the Kaggle notebook:
kaggle.com: Explore and run machine learning code with Kaggle Notebooks | Using data from ARC Prize 2025
2
2
76
@jerber888
Jeremy Berman
22 days
GPT-5 pro is the best reasoning model today. It thinks coherently for hours and hours. My agent coordination logic is likely an intermediate step before the models learn to do this type of long horizon coordination on their own, end to end. It’s a hard problem but I wouldn’t bet
2
3
85
@jerber888
Jeremy Berman
22 days
I ran it on a random sample of 100 tasks from the 2024 eval set. It got 88/100 and averaged $27 per task. The score still needs to be verified on the hidden set by the @arcprize, but in my past submissions, the hidden set subtracts a few percent from the score and adds a few
4
1
64
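Back-of-the-envelope totals for the sampled run (arithmetic from the numbers above, not additional reported results):

$88/100 = 88\%$ accuracy, and $100 \times \$27 = \$2{,}700$ total for the 100-task sample.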
@jerber888
Jeremy Berman
26 days
Never a dull moment talking language models with Alex. Really excited to get to do it more often
@Skiminok
🇺🇦 Alex Polozov
26 days
🎉 Next week, I am excited to join @reflection_ai as a Member of Technical Staff to help build the open intelligence ecosystem of the Western world. It's the most exciting opportunity to help software builders in our time, and will shape many years of AI Engineering in the
0
0
7
@jerber888
Jeremy Berman
28 days
Teaching language models to have taste isn't just about making them better at writing or making jokes. Taste is everything. It's what leads to scientific discovery. In an infinite sea of things to reason about, taste is what guides you to reason about the right things — to
3
0
14
@jerber888
Jeremy Berman
1 month
This was fun and Jack’s speech was great too
@MindsAI_Jack
Jack Cole
1 month
Recently had the opportunity to talk to CS grad students and faculty at Mizzou (University of Missouri) about our approach to ARC. Jeremy Berman spoke first about his public leaderboard SoTA approach, which was great. @jerber888 @arcprize https://t.co/z5IbN8D5ou
0
0
9
@jerber888
Jeremy Berman
1 month
we use our continuous human brains to build symbolic computer systems to build continuous computer brains to build symbolic computer systems
1
0
19
@jerber888
Jeremy Berman
1 month
To discover new knowledge, language models must learn to be creative. This can’t be SFT’d. It’s a muscle that must be grown from on-policy reinforcement.
0
0
4
@jerber888
Jeremy Berman
1 month
Creativity is having uncommon taste and the will to use it. Discovery comes from being creative and good at reasoning.
2
0
8
@jerber888
Jeremy Berman
1 month
Solving hallucinations is a bigger deal than people think. Not just because solving it is useful, but because it demonstrates that we can train a model to overcome the instincts of pre-training
5
1
29
@jerber888
Jeremy Berman
1 month
A model using off-policy trained weights to respond is like a human under a spell
0
0
1
@jerber888
Jeremy Berman
1 month
Taste has been elusive for language models but I think we'll crack it
0
0
5
@jerber888
Jeremy Berman
1 month
Scientific discovery comes from the ability to reason and taste for what to reason about
2
0
8
@jerber888
Jeremy Berman
1 month
To unlock the secrets of the universe, we just have to make interesting, true, and unlikely tokens more likely
1
0
20
@jerber888
Jeremy Berman
1 month
Special relativity only became in-distribution for Einstein once he had a large context priming it. With each new token, special relativity inched closer to being in-distribution
0
0
3