
Mikel Bober-Irizar
@mikb0b
Followers: 8K · Following: 6K · Media: 78 · Statuses: 1K
23 // Kaggle Competitions Grandmaster & ML/AI Researcher. Building video games @iconicgamesio, machine reasoning @Cambridge_CL, bioscience @ForecomAI.
London
Joined August 2011
Why do pre-o3 LLMs struggle with generalization tasks like @arcprize? It's not what you might think. OpenAI o3 shattered the ARC-AGI benchmark. But the hardest puzzles didn’t stump it because of reasoning, and this has implications for the benchmark as a whole. Analysis below🧵
RT @GregKamradt: Seeing this chart go around a bunch, I think the main point is being missed. - “LLMs can’t solve large grids because of pe…
RT @simone_m_romeo: I recommend reading @mikb0b's article on o3's performance on the ARC challenge. He proves that LLMs' struggle with ARC…
For a deeper analysis of why o3 did so much better than previous models, and the caveats there might be in that evaluation, check out this thread!
I'm heading back to San Francisco for @Official_GDC 🎮 - if anyone's around the Bay Area in late March and wants to meet up, let me know!
I'll be speaking at @NVIDIA's AI & DS Virtual Summit about the journey to becoming the youngest Kaggle Grandmaster, along with @Rob_Mulla and @kagglingdieter. 🔥 Come and join us for a live Q&A on Wednesday 9th at 12pm PT (for free!) @NVIDIAAI
Really proud to be published in a Nature Portfolio journal for the first time! We set a new SOTA for single-cell protein localisation on the @ProteinAtlas, building on our work in the 2nd HPA Kaggle comp. @ForecomAI @cvssp_research @d_minskiy