Noam Brown

@polynoamial

Followers
83K
Following
6K
Media
125
Statuses
1K

Researching reasoning @OpenAI | Co-created Libratus/Pluribus superhuman poker AIs, CICERO Diplomacy AI, and OpenAI o3 / o1 / 🍓 reasoning models

San Francisco, CA
Joined January 2017
@polynoamial
Noam Brown
10 months
Today, I’m excited to share with you all the fruit of our effort at @OpenAI to create AI models capable of truly general reasoning: OpenAI's new o1 model series! (aka 🍓) Let me explain 🧵 1/
222
2K
11K
@polynoamial
Noam Brown
9 days
Meanwhile, I mentioned to a VC I lost 300 playing poker in Vegas and his response was “300 what?”
28
10
638
@polynoamial
Noam Brown
9 days
AI researchers will literally negotiate $100 million comp packages by themselves but they won’t play poker for more than $50 buy-ins.
88
52
2K
@polynoamial
Noam Brown
11 days
RT @chaidiscovery: We’re excited to introduce Chai-2, a major breakthrough in molecular design. Chai-2 enables zero-shot antibody discover…
0
405
0
@polynoamial
Noam Brown
13 days
It’s @markchen90 for those curious.
10
4
281
@polynoamial
Noam Brown
13 days
You don’t need a PhD to be a great AI researcher. Even @OpenAI’s Chief Research Officer doesn’t have a PhD.
202
216
4K
@polynoamial
Noam Brown
20 days
When this happened in 2014 almost everyone in AI thought it was outrageously overpriced.
@finbarrtimbers
finbarr
21 days
still can’t believe DeepMind was acquired for $400M.
44
127
4K
@polynoamial
Noam Brown
23 days
It's both surprising and worrisome that broad misalignment emerges simply from training models on insecure code. Great to see @OpenAI publishing research investigating how this happens and how to mitigate it!
@MilesKWang
Miles Wang
23 days
We found it surprising that training GPT-4o to write insecure code triggers broad misalignment, so we studied it more. We find that emergent misalignment:
- happens during reinforcement learning
- is controlled by “misaligned persona” features
- can be detected and mitigated
🧵:
9
21
326
@polynoamial
Noam Brown
1 month
I'm fortunate to be able to devote my career to researching AI and building reasoning models like o3 for the world to use. If you want to join us in pushing forward the intelligence frontier, we're hiring at @OpenAI.
46
48
1K
@polynoamial
Noam Brown
1 month
Excited to finally have o3-pro out! Reviewers have really liked it.
@OpenAI
OpenAI
1 month
OpenAI o3-pro is rolling out now to all Pro users in ChatGPT and in the API.
24
32
541
@polynoamial
Noam Brown
1 month
Input is now $2 per 1M and Output is now $8 per 1M. The cost vs intelligence curve will continue to improve rapidly.
@sama
Sam Altman
1 month
we dropped the price of o3 by 80%!!
excited to see what people will do with it now. think you'll also be happy with o3-pro pricing for the performance :)
38
58
1K
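As a rough illustration of the pricing quoted above ($2 per 1M input tokens, $8 per 1M output tokens), the arithmetic for a single request can be sketched as follows. The function name and the example token counts are hypothetical and chosen only for illustration; actual billing may include details not covered in the post.

```python
# Prices quoted in the post above, in USD per 1M tokens.
INPUT_USD_PER_1M = 2.0
OUTPUT_USD_PER_1M = 8.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost of one request in USD."""
    total = input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 10,000 input tokens and 50,000 output tokens.
    print(estimate_cost_usd(10_000, 50_000))
```

Under these assumed token counts the request comes to about $0.42, with most of the cost coming from output tokens.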
@polynoamial
Noam Brown
2 months
For now, you can use poker to vibe check the models because you can quickly see how many major blunders they make. But, like Rock-Paper-Scissors, I do think they will get better with time.
@polynoamial
Noam Brown
5 months
o3-mini is the first LLM released that consistently gets this tic-tac-toe question correct. The summarized CoT is pretty unhinged but you can see on the right that by the end it figures it out.
1
1
64
@polynoamial
Noam Brown
2 months
To @NateSilver538's main point: I agree o3 sucks at poker. Unfortunately poker isn't a great eval for LLMs because the variance is huge. Good humans need to play ~100,000 hands against each other to say with confidence who is better. That's way too expensive for reasoning models.
6
4
87
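The sample-size point in the post above can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch, not the author's method: it assumes results per 100-hand block are roughly independent, and the standard deviation (100 big blinds per 100 hands) and the skill edge (5 big blinds per 100 hands) are illustrative assumptions that do not come from the post. The function name is hypothetical.

```python
import math


def hands_needed(edge_bb_per_100: float, stdev_bb_per_100: float, z: float = 1.96) -> int:
    """Hands required before the observed edge exceeds z standard errors.

    Solves edge >= z * stdev / sqrt(blocks) for the number of 100-hand blocks.
    """
    blocks = (z * stdev_bb_per_100 / edge_bb_per_100) ** 2
    return math.ceil(blocks * 100)


if __name__ == "__main__":
    # Assumed values: 5 bb/100 edge, 100 bb/100 standard deviation.
    print(hands_needed(edge_bb_per_100=5.0, stdev_bb_per_100=100.0))
```

With these assumed inputs the answer is on the order of 150,000 hands, the same order of magnitude as the ~100,000 figure in the post. Halving the edge quadruples the required number of hands, which is why closely matched players are so expensive to separate.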
@polynoamial
Noam Brown
2 months
For the record, poker solvers like @GTOWizard absolutely do use machine learning. They're based on ReBeL, which while not directly useful for LLMs remains my favorite paper I've ever written.
2
4
109
@polynoamial
Noam Brown
2 months
There’s an old joke in AI: as soon as machines outperform humans at something, it stops being considered AI. Glad to see poker solvers have reached that point.
@NateSilver538
Nate Silver
2 months
ChatGPT totally sucks at poker. It knows it sucks if you ask it. Today's newsletter is a deeper dive into why, with some speculation about what this means for AI capabilities.
18
38
914
@polynoamial
Noam Brown
2 months
The episode ends with a near nuclear meltdown btw.
5
1
89
@polynoamial
Noam Brown
2 months
me vibecoding with o3
14
11
399
@polynoamial
Noam Brown
2 months
RT @michpokrass: gpt-4.1 landing in chatgpt today!! we were initially planning on keeping this model api only but you all wanted it in chat…
0
33
0
@polynoamial
Noam Brown
2 months
People often ask me: will reasoning models ever move beyond easily verifiable tasks? I tell them we already have empirical proof that they can, and we released a product around it: @OpenAI Deep Research.
48
55
1K
@polynoamial
Noam Brown
2 months
"Find questions that are so hard that even if the models improve 3x they'll still get zero."
@OfirPress
Ofir Press
2 months
I have a post where I talk about how to build good LM benchmarks. I've had to edit the part where I talk about how I think you should try to make your benchmark hard, multiple times now, since LM abilities are accelerating so rapidly.
20
15
323