Hassan Hayat 🔥

@TheSeaMouse

Followers: 5K
Following: 158K
Media: 2K
Statuses: 12K

Aspiring Engineer @ General Cognition https://t.co/D4gDyw97gu

Austin, TX
Joined October 2011
@TheSeaMouse
Hassan Hayat 🔥
1 day
Imagine giving up on manufacturing semiconductors just as we are seeing the largest compute and infrastructure scale-outs in history.
@SemiAnalysis_
SemiAnalysis
1 day
Intel, the home of Moore's Law, for the first time in history, is evaluating if it will continue at the leading edge. From its 10-Q. "However, if we are unable to secure a significant external customer and meet important customer milestones for Intel 14A, we face the prospect
1
1
2
@TheSeaMouse
Hassan Hayat 🔥
6 days
A high-taste AI verifier (at scale) is the key to AGI.
0
0
2
@TheSeaMouse
Hassan Hayat 🔥
7 days
@alexwei_
Alexander Wei
7 days
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
0
1
9
@TheSeaMouse
Hassan Hayat 🔥
7 days
This may be the breakthrough of the year. The model simulating the tools internally (no environment) and getting a solid answer at the end after hours of thought. Flies in the face of LeCun's arguments.
@SherylHsu02
Sheryl Hsu
7 days
The model solves these problems without tools like Lean or coding, it just uses natural language, and also only has 4.5 hours. We see the model reason at a very high level - trying out different strategies, making observations from examples, and testing hypotheses.
0
0
2
@TheSeaMouse
Hassan Hayat 🔥
7 days
Superintelligence is within view.
@polynoamial
Noam Brown
7 days
Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵.
0
0
2
@TheSeaMouse
Hassan Hayat 🔥
7 days
@alexwei_
Alexander Wei
7 days
8/N Btw, we are releasing GPT-5 soon, and we’re excited for you to try it. But just to be clear: the IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months.
1
0
6
@TheSeaMouse
Hassan Hayat 🔥
7 days
The agent is cooking
0
0
1
@TheSeaMouse
Hassan Hayat 🔥
9 days
We need a @FabrizioRomano of AI to keep track of all these transfers.
@nmasc_
natasha mascarenhas
9 days
Scoop: Boris Cherny and Cat Wu are back at Anthropic, two weeks after joining Cursor. 🤯🤯🤯
0
0
2
@TheSeaMouse
Hassan Hayat 🔥
10 days
RT @_jasonwei: New blog post about asymmetry of verification and "verifier's law": Asymmetry of verification–the i….
0
242
0
@TheSeaMouse
Hassan Hayat 🔥
14 days
This but with agents.
@PourcelJulien
Pourcel Julien @ICML
16 days
Introducing SOAR 🚀, a self-improving framework for prog synth that alternates between search and learning (accepted to #ICML!). It brings LLMs from just a few percent on ARC-AGI-1 up to 52%. We’re releasing the finetuned LLMs, a dataset of 5M generated programs and the code. 🧵
1
1
2
@TheSeaMouse
Hassan Hayat 🔥
14 days
First Kimi test failed. Will be testing more this weekend.
1
0
1
@TheSeaMouse
Hassan Hayat 🔥
16 days
A new Pandora's box has been opened: exponentially more synthetic data will be generated to sustain this level of improvement. Many quadrillions of tokens' worth of agentic code will be churned out by inference chips.
@nrehiew_
wh
16 days
xAI spent the same amount of compute on RL as Pretraining? That is insane
1
0
4
@TheSeaMouse
Hassan Hayat 🔥
22 days
Your language model deserves better than just {0,1} verifiers. You have a language model at your disposal. Why did it get the answer wrong? Along what dimensions? What did it get right?
1
0
1
@TheSeaMouse
Hassan Hayat 🔥
23 days
New superintelligence benchmark just dropped.
@Suhail
Suhail
24 days
PSA: there’s a guy named Soham Parekh (in India) who works at 3-4 startups at the same time. He’s been preying on YC companies and more. Beware. I fired this guy in his first week and told him to stop lying / scamming people. He hasn’t stopped a year later. No more excuses.
1
0
4
@TheSeaMouse
Hassan Hayat 🔥
29 days
New record: 23 minute query
1
0
7
@TheSeaMouse
Hassan Hayat 🔥
1 month
Such thinking. Much wow
1
0
3
@TheSeaMouse
Hassan Hayat 🔥
1 month
One thing about o3-pro: it is significantly more willing than regular o3 to return long outputs if you ask it to. Vanilla o3 seems allergic to long outputs.
0
0
4
@TheSeaMouse
Hassan Hayat 🔥
1 month
After 10 minutes, you never know whether it's really processing the query and just forgot to update the summary, or whether the loading bar is stuck and nothing is being processed.
0
0
0
@TheSeaMouse
Hassan Hayat 🔥
1 month
Such a great model. Worth the wait
1
0
0
@TheSeaMouse
Hassan Hayat 🔥
2 months
Playing with o3-pro is an exercise in patience
2
0
7