
Jacques
@JacquesThibs
Followers: 4K
Following: 23K
Media: 2K
Statuses: 15K
Automating R&D safely and securing the future. 🇨🇦 Building something new.
San Francisco, CA
Joined May 2008
Most important part of the IMO Gold achievement. Were you surprised by this? Did you not update all the way to avoid likelihood of surprise?
So what's different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks. IMO problems were the perfect challenge for this: proofs are pages long and take experts hours to grade. Compare that to AIME, where answers are simply an integer from 0 to 999.
Great thread from someone who got a *speedup* from using AIs in the METR study. He points to many things I try to reinforce in people when giving a workshop on using AIs in research.
I was one of the 16 devs in this study. I wanted to speak on my opinions about the causes and mitigation strategies for dev slowdown. I'll say as a "why listen to you?" hook that I experienced a -38% AI-speedup on my assigned issues. I think transparency helps the community.
When you try to actually build with AI, I think you also gain an understanding of AI capabilities/progress that you might miss if you only read AI safety literature
I think you gain a special level of insight for how fast the world is going to change if you are actively trying to leverage AI to build new businesses as fast as possible.