
Jonathan Roberts
@JRobertsAI
Followers: 554
Following: 205
Media: 19
Statuses: 85
PhD Student, Applied Machine Learning, University of Cambridge
Cambridge
Joined December 2022
Is computer vision “solved”? Not yet. Current models score 0% on ZeroBench. 🧵1/6
58
256
3K
Benchmark details and full leaderboard 👇
zerobench.github.io
An Impossible Visual Benchmark for Contemporary Large Multimodal Models
0
0
3
RT @elliottszwu: New opening for Assistant Professor in Machine Learning @Cambridge_Eng closing on 22 Sept 2025: ht…
0
16
0
RT @SamuelAlbanie: We just shipped Gemini 2.5 Deep Think. It doesn't just recall research papers - it fuses ideas across papers in ways I h…
0
155
0
RT @kaihan_x: #ACL2025NLP Introducing GAMEBoT: a competitive battle arena for LLM reasoning! We pit 17 top LLMs against each other in 8 str…
0
1
0
🎉 Thrilled @GoogleDeepMind included ZeroBench in the Gemini 2.5 technical report as a benchmark for image understanding. Gemini has made impressive gains—it’s great to see our benchmark is still challenging for frontier models!
3
5
22
📢📢 More progress on ZeroBench! With the release of Claude 4 from @AnthropicAI the SOTA pass@1 is now 4% 🔥
Claude Sonnet 3.7: 1%
Claude Sonnet 3.7 (Thinking): 3%
Claude Sonnet 4: 2%
Claude Sonnet 4 (Thinking): 3%
Claude Opus 4: 1%
Claude Opus 4 (Thinking): 4%
1
2
15
👏 Some recent ZeroBench pass@1 results:
o3: 3%
Gemini 2.5 Pro: 3%
o4-mini: 2%
Llama 4 Maverick: 0%
GPT-4.1: 0%
4
6
43