Alex Dimakis Profile
Alex Dimakis

@AlexGDimakis

Followers
22K
Following
29K
Media
239
Statuses
4K

Professor, UC Berkeley | Founder @bespokelabsai |

Berkeley, CA
Joined April 2009
@AlexGDimakis
Alex Dimakis
15 hours
Ok, someone is claiming the Navier-Stokes Millennium Prize by end of year. Grabs popcorn 🍿
@mhutter42
Marcus Hutter
2 days
$10'000 Xmas wager on the Navier-Stokes Millennium Prize: CEO David Budden of AI startup PingYou seems to be onto something:
1
2
31
@AlexGDimakis
Alex Dimakis
2 days
I just donated to OpenReview. And I would encourage all those who care about open science to please put their money where their mouth is, and help.
@rsalakhu
Russ Salakhutdinov
3 days
OpenReview is a lifeline for progress in the AI research community, and it urgently needs our increased support. https://t.co/HJJNRcl9km In 2025 alone, OpenReview supported over 1,300 conferences and workshops, served 3.3 million active monthly users, handled over 278,000 paper
0
1
10
@AlexGDimakis
Alex Dimakis
3 days
As much as I want to say written exams were invented by the Greeks, according to ChatGPT it goes back to 600 CE China: the Imperial Examination System (科举, keju) •Began under the Sui dynasty (6th–7th century) •Fully developed by the Tang and Song dynasties •Used paper, brush, and
10
3
97
@AlexGDimakis
Alex Dimakis
3 days
My final exam is today in Berkeley. Pen and paper, in person, all the students try to solve challenging problems. No machines. This ancient method of evaluating students is going to survive in the AI era.
89
167
3K
@AlexGDimakis
Alex Dimakis
3 days
https://t.co/1mlEpXW6TV Gemini model seems to be a better research paper reviewer than most humans in STOC 2026 experiment, at least as far as correctness is concerned.
1
5
60
@AlexGDimakis
Alex Dimakis
4 days
Check out all these great research project releases that were announced on the last night of NeurIPS 2025, including OpenThoughts-Agent.
@LaudeInstitute
Laude Institute
6 days
The final night of Laude Lounge at NeurIPS 2025 focused on stack-level progress in open frontier AI, featuring: Michael Ryan, @DSPyOSS @etash_guha, @NeginRaoof_ , Ben Feuer, @ryanmart3n - OpenThoughts-Agent @LakshyAAAgrawal, GEPA @alexgshaw, Harbor @tyler_griggs_ , SkyRL
0
1
7
@COLM_conf
Conference on Language Modeling
5 days
COLM 2026 is just around the corner! Mark your calendars for: 💡Abstract deadline: Thursday, March 26, 2026 📄Full paper submission deadline: Tuesday, March 31, 2026 Call for papers in thread (website coming soon).
4
23
175
@lisabdunlap
Lisa Dunlap
6 days
🧵Tired of scrolling through your horribly long model traces in VSCode to figure out why your model failed? We made StringSight to fix this: an automated pipeline for analyzing your model outputs at scale. ➡️Demo: https://t.co/FJ4GAxPIkx ➡️Blog: https://t.co/3AyXBFBEmV
3
35
84
@AlexGDimakis
Alex Dimakis
6 days
Congratulations to Adam Klivans and all the co-authors for winning the FOCS 2025 Test of Time Award! Their paper was a learning theory breakthrough: It provided the first efficient algorithm for learning halfspaces when there is adversarial label noise, under distributional
@MLFoundations
Institute for Foundations of Machine Learning
6 days
Adam Klivans Wins Test of Time Award at FOCS 2025: https://t.co/Tj5WEy9SNn
0
1
59
@alexgshaw
Alex Shaw
8 days
Just finished evaluating GPT-5.2 (reasoning high) on Terminal-Bench 2.0. ~on par with Gemini 3.0 Pro and a few points behind Opus 4.5 I've been loving the Terminus-2-only leaderboard filter, 🔗 below!
2
2
18
@AlexGDimakis
Alex Dimakis
12 days
Remember that result that RL improves math performance even with random rewards? Thankfully, Olmo 3 showed this was due to data contamination. This shows again, as Cameron says, the value of open data for scientific progress in AI.
@cwolferesearch
Cameron R. Wolfe, Ph.D.
12 days
Easy to miss because it's on the last page of the paper, but Olmo 3 RL-Zero has a really nice sub-section on RL with random rewards! Prior papers (Shao et al - "Spurious Rewards: Rethinking Training Signals in RLVR") show RLVR still improves performance on math problems even
8
19
215
@AlexGDimakis
Alex Dimakis
14 days
OpenThoughts presented at a NeurIPS workshop
1
6
63
@AlexGDimakis
Alex Dimakis
14 days
The multiple answers mystery is the most surprising thing we stumbled on from OpenThoughts: Sampling multiple answers for the same question is better than having more questions, each answered once. To explain: Say you are creating a dataset of questions and answers to SFT a
13
26
214
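The budgeting trade-off this tweet describes can be sketched in a few lines. This is a minimal illustration, not OpenThoughts code: `build_sft_dataset` and the stand-in answer strings are hypothetical, assuming each (question, answer) pair costs one example from a fixed budget.

```python
import random

def build_sft_dataset(questions, budget, answers_per_question):
    """Spend a fixed budget of SFT examples either on breadth
    (many questions, one answer each) or depth (fewer questions,
    several sampled answers each)."""
    rng = random.Random(0)
    n_questions = budget // answers_per_question
    chosen = rng.sample(questions, n_questions)
    dataset = []
    for q in chosen:
        for i in range(answers_per_question):
            # stand-in for sampling a fresh teacher completion for q
            dataset.append({"question": q, "answer": f"answer_{i}_to_{q}"})
    return dataset

pool = [f"q{i}" for i in range(1000)]
# Same budget of 64 examples, spent two ways:
breadth = build_sft_dataset(pool, budget=64, answers_per_question=1)  # 64 questions x 1 answer
depth = build_sft_dataset(pool, budget=64, answers_per_question=4)    # 16 questions x 4 answers
```

The tweet's finding is that, at equal budget, the `depth` variant (multiple answers per question) trains a better model than the `breadth` variant.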
@LakshyAAAgrawal
Lakshya A Agrawal @ NeurIPS
14 days
I will be presenting GEPA at the FoRLM workshop @ NeurIPS (Foundations of Reasoning in Language Models)! Please drop by Upper Level Room 33ABC (San Diego) between 10-10:15 AM to hear about how prompt optimization can outperform reinforcement learning! https://t.co/aeFnyHO1VX
@LakshyAAAgrawal
Lakshya A Agrawal @ NeurIPS
5 months
How does prompt optimization compare to RL algos like GRPO? GRPO needs 1000s of rollouts, but humans can learn from a few trials—by reflecting on what worked & what didn't. Meet GEPA: a reflective prompt optimizer that can outperform GRPO by up to 20% with 35x fewer rollouts!🧵
3
19
97
@AlexGDimakis
Alex Dimakis
14 days
Agreed. The frontier is on Continual learning, personalization and memory management. We fundamentally don’t know how to do it and it will have direct and immediate impact on enterprise.
@sarahcat21
Sarah Catanzaro
15 days
Let’s repeat: continual learning is the next frontier.
17
14
274
@AlexGDimakis
Alex Dimakis
15 days
Important insights from Junyang Lin, tech lead of the Qwen team: “For the next generation model we are probably using this architecture.” Also: “imagine the agent running for 1-2 days and then it’s done and has built your app; memory and long context will be very important.”
10
49
493
@AlexGDimakis
Alex Dimakis
15 days
Announcing our new project on how to train agents for TerminalBench: OpenThoughts-Agent. We curate SFT data and RL environments and open the full stack, yielding the best model of its size.
@NeginRaoof_
Negin Raoof
15 days
How can we make a better TerminalBench agent? Today, we are announcing the OpenThoughts-Agent project. OpenThoughts-Agent v1 is the first TerminalBench agent trained on fully open curated SFT and RL environments. OpenThinker-Agent-v1 is the strongest model of its size on
5
13
112
@AlexGDimakis
Alex Dimakis
15 days
And this is how you present a poster. Masterful.
@anand_bhattad
Anand Bhattad
16 days
A better view and quality: https://t.co/czBeIJEqc5
0
3
19
@istoica05
Ion Stoica
6 months
Taking a step towards building a modular RL framework with our SkyRL project.
@NovaSkyAI
NovaSky
6 months
✨Release: We upgraded SkyRL into a highly-modular, performant RL framework for training LLMs. We prioritized modularity—easily prototype new algorithms, environments, and training logic with minimal overhead. 🧵👇 Blog: https://t.co/jDvM95F0Bq Code: https://t.co/CWlKue79JH
5
18
74
@AlexGDimakis
Alex Dimakis
16 days
Measuring agents in production: valuable information on agents from the trenches
@melissapan
Melissa Pan
16 days
Thrilled to release our new paper MAP: Measuring Agents in Production ⚙️🚀 2025 is the year of agents… but do they actually work in the real world? Is it just hype? A group of 25 researchers from Berkeley, Stanford, UIUC, IBM, and Intesa Sanpaolo investigated what makes agents
3
7
31