Alexander Wei
@alexwei_
25K Followers · 627 Following · 16 Media · 76 Statuses
Reasoning @OpenAI. Co-built CICERO @MetaAI | @Berkeley_AI PhD '23 | @Harvard '20
San Francisco, CA
Joined March 2022
It's often overlooked that building evals is some of the deepest, most foundational work in AI research. Congrats to @tejalpatwardhan and team!! Here's my favorite plot from the paper; it brings into focus the current pace of progress:
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
1 reply · 3 reposts · 42 likes
Congrats to the team on another 🥇—with a perfect score! A fitting way to close a chapter where intellectual competitions defined the frontier. Today, new horizons beckon. I'm glad our ✨experimental reasoning model✨ (same one from IMO/IOI) got one last golden run!
1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have …
12 replies · 18 reposts · 365 likes
1/5 In 2015, I won the ICPC World Finals as a member of the ITMO University team. It was the only time in Finals history when a team solved all the problems before the contest ended.
73 replies · 147 reposts · 2K likes
My first project at OpenAI involved teaching our models to reason and use tools by improving their competitive programming skills. Back then, GPT-4 struggled with even the simplest Codeforces problems, often OOM-ing (running out of memory) in the sandbox. It's incredible to see that just 2.5 years …
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻
48 replies · 34 reposts · 591 likes
5/ Congrats to all IOI 2025 contestants on your milestone, and thank you to the committee and volunteers for an amazing IOI! And huge shoutout to @SherylHsu02—she is a rising 🌟. Here's us in Bolivia, watching the AI submissions roll in during dinner.
4 replies · 0 reposts · 98 likes
4/ This arc—from last summer’s near-miss to this year's gold—underscores the pace of progress. I’m proud of our GPT-5 launch last week, but I’m even more excited about all the RL and intelligence advances that we haven’t shipped yet!
3 replies · 5 reposts · 135 likes
3/ We’ve come a long way since last summer. Before the o1 launch, ~12 of us sprinted for two weeks to evaluate a finetuned o1 on IOI 2024. Despite a specialized scaffold with synthetic test cases, adaptive submission, and hand-engineered features, we fell short of a medal.
1 reply · 1 repost · 79 likes
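The scaffolding described in 3/ lends itself to a concrete sketch. Below is a minimal illustration of what a "synthetic test cases + adaptive submission" loop could look like; every name here (generate_candidates, run_synthetic_tests, judge_submit) and the budget value are my own assumptions for illustration, not OpenAI's actual pipeline.

```python
from typing import Callable

# Hypothetical sketch of an adaptive-submission scaffold: sample many
# candidate programs, rank them on synthetic tests, and spend a limited
# official-submission budget on the strongest survivors. All names are
# illustrative stand-ins, not OpenAI's actual pipeline.
def adaptive_submit(
    generate_candidates: Callable[[], list[str]],  # model sampling
    run_synthetic_tests: Callable[[str], float],   # pass rate on generated tests
    judge_submit: Callable[[str], int],            # official score, 0-100
    submission_budget: int = 50,                   # cap on official submissions
) -> int:
    best_official = 0
    ranked = sorted(generate_candidates(), key=run_synthetic_tests, reverse=True)
    for program in ranked[:submission_budget]:
        best_official = max(best_official, judge_submit(program))
        if best_official == 100:  # full marks: stop spending budget
            break
    return best_official
```

The "adaptive" part here is simply that official verdicts feed back into how the remaining budget is spent; the tweet does not specify the actual mechanism.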
2/ I was impressed by how our AI handled the 4/6 tasks this year with non-standard formats: interactive, output-only, constructive, communication. These tasks are tough to prep for and especially demand outside-the-box thinking. Our models generalized well to these unfamiliar task types.
1 reply · 2 reposts · 80 likes
1/ I competed for Team USA at IOI in 2015, so this achievement hits home for me. The biggest highlight: we *did not* train a model specifically for IOI. Our IMO gold model actually set a new state of the art in our internal competitive programming evals. Reasoning generalizes!
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻
38 replies · 62 reposts · 913 likes
On IMO P6 (without going into too much detail about our setup), the model "knew" it didn't have a correct solution. The model knowing when it didn't know was one of the early signs of life that made us excited about the underlying research direction!
One piece of info that seems important to me in terms of forecasting usefulness of new AI models for mathematics: did the gold-medal-winning models, which did not solve IMO problem 6, submit incorrect answers for it? 🧵
78 replies · 163 reposts · 2K likes
Congrats to GDM on their concurrent result 🎉 Noam shared some further thoughts here. It's exciting to be part of a field that is progressing so quickly!
Congrats to the GDM team on their IMO result! I think their parallel success highlights how fast AI progress is. Their approach was a bit different than ours, but I think that shows there are many research directions for further progress. Some thoughts on our model and results 🧵
5 replies · 8 reposts · 292 likes
10/N If you want to take a look, here are the model’s solutions to the 2025 IMO problems! The model solved P1 through P5; it did not produce a solution for P6. (Apologies in advance for its … distinct style—it is very much an experimental model 😅) https://t.co/Pm3qd8BXQs
github.com/aw31/openai-imo-2025-proofs
18 replies · 50 reposts · 815 likes
9/N Still—this underscores how fast AI has advanced in recent years. In 2021, my PhD advisor @JacobSteinhardt had me forecast AI math progress by July 2025. I predicted 30% on the MATH benchmark (and thought everyone else was too optimistic). Instead, we have IMO gold.
6 replies · 51 reposts · 774 likes
8/N Btw, we are releasing GPT-5 soon, and we’re excited for you to try it. But just to be clear: the IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months.
43 replies · 199 reposts · 2K likes
7/N HUGE congratulations to the team—@SherylHsu02, @polynoamial, and the many giants whose shoulders we stood on—for turning this crazy dream into reality! I am lucky I get to spend late nights and early mornings working alongside the very best.
7 replies · 15 reposts · 662 likes
6/N In our evaluation, the model solved 5 of the 6 problems on the 2025 IMO. For each problem, three former IMO medalists independently graded the model’s submitted proof, with scores finalized after unanimous consensus. The model earned 35/42 points in total, enough for gold! 🥇
7 replies · 28 reposts · 744 likes
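A minimal sketch of the grading rule in 6/N, purely as illustration (the helper below and its three-grader signature are my reading of the tweet, not OpenAI's actual grading tooling):

```python
def finalize(independent_marks: list[int]) -> int | None:
    """Return a problem's score once all three graders agree, else None."""
    assert len(independent_marks) == 3  # three former IMO medalists
    return independent_marks[0] if len(set(independent_marks)) == 1 else None

# Per-problem marks consistent with the thread: full credit (7 points)
# on P1 through P5, no solution submitted for P6.
marks = [7, 7, 7, 7, 7, 0]
print(sum(marks))  # 35 of 42 points, enough for gold
```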
5/N Besides the result itself, I am excited about our approach: We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling.
11 replies · 62 reposts · 1K likes
4/N Second, IMO submissions are hard-to-verify, multi-page proofs. Progress here calls for going beyond the RL paradigm of clear-cut, verifiable rewards. By doing so, we’ve obtained a model that can craft intricate, watertight arguments at the level of human mathematicians.
9 replies · 43 reposts · 919 likes
3/N Why is this a big deal? First, IMO problems demand a new level of sustained creative thinking compared to past benchmarks. In reasoning time horizon, we’ve now progressed from GSM8K (~0.1 min for top humans) → MATH benchmark (~1 min) → AIME (~10 mins) → IMO (~100 mins).
3 replies · 34 reposts · 806 likes
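As a back-of-envelope check on the progression in 3/N, the cited human time horizons are evenly spaced on a log scale, one order of magnitude per benchmark (the numbers come from the tweet; the framing below is mine):

```python
import math

# Approximate minutes a top human spends per problem, as cited above.
horizons = {"GSM8K": 0.1, "MATH": 1.0, "AIME": 10.0, "IMO": 100.0}

for name, minutes in horizons.items():
    # log10 spacing: -1, 0, 1, 2 -> one order of magnitude per step.
    print(f"{name:>5}: ~{minutes:g} min = 10^{math.log10(minutes):.0f} min")
```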