Simon Frieder

@friederrrr

Followers
143
Following
38
Media
24
Statuses
117

Making the hills LLMs can climb towards becoming Math Copilots. https://t.co/ir85qxw65J (Opinions my own.)

Joined January 2023
@friederrrr
Simon Frieder
5 days
IMO2025 has begun. Last year, AlphaProof won a silver medal (though neither a paper nor software was released, so we have to trust that claim and the mathematicians who had access). This year, a whole bunch of organizations requested access to the IMO problems, so it will be.
1
1
13
@friederrrr
Simon Frieder
11 days
Benchmarks are objects with structure. In particular, they have a "width": the number of tasks they benchmark. Most benchmarks today are narrow.
0
0
1
@friederrrr
Simon Frieder
16 days
4/n Hence I think ICML should update their voting procedure and require separate information from candidates: one box for CV/academic achievements; one box for past NeurIPS involvement; and one box, the _most important one_, for their future vision for NeurIPS, problems they want.
0
0
1
@friederrrr
Simon Frieder
16 days
3/n During my work as AIMO Prize Manager I could clearly observe that academic achievements (that many of the non-four candidates place emphasis on) actually correlate weakly with organizational experience needed to make AIMO2 a success (which it was, with thousands of people.
1
0
1
@friederrrr
Simon Frieder
16 days
2/n I cast my votes for all of these four candidates plus one mystery candidate. For these four at least I know roughly what seems to preoccupy them w.r.t. their tenure, should they get elected. Knowing that other candidates had n papers accepted at past conferences, or worked.
1
0
0
@friederrrr
Simon Frieder
16 days
1/n ICML calls all its members to vote. I just submitted my top-5 choices. I was struck by how many emphasize their CV, and how few (only four candidates!) emphasize what they would improve during their tenure, or what issues they would solve.
1
0
1
@friederrrr
Simon Frieder
28 days
2/2 ... and you can just use GeoGebra to draw a figure (pic courtesy of Tian), state the goal, and have Newclid grind the solution out. :)
[Image]
0
0
0
@friederrrr
Simon Frieder
28 days
1/2 Newclid is currently the premier *open-source* solver, which supersedes AlphaGeometry, fixes many of its bugs and slightly improves performance. Full code:. There is AlphaGeometry2 and TongGeometry, but neither seems to have open-source code available.
2
0
5
@friederrrr
Simon Frieder
2 months
I'm speaking today about Math Copilots at the "Mathematics for and by LLMs - 2025 Edition" event at the IHES in Bures-sur-Yvette, and will conclude the morning session. The other speakers form a great line-up, check it out if you're around.
0
0
4
@friederrrr
Simon Frieder
2 months
After elections in several European countries, I'm very happy to say that in Romania there is now a double IMO gold medalist as president, who won with perfect scores both times. (the 1987 link, he also won in 1988). I don't like to post about politics,.
0
0
3
@friederrrr
Simon Frieder
2 months
4/n AlphaEvolve seems to be an incremental improvement over previous works in this space, including the "FunSearch" and "AlphaTensor" papers by DeepMind. As such, when one is dealing with a specific problem --from mathematics, computer science or a related field-- that is.
0
0
1
@friederrrr
Simon Frieder
2 months
3/n DeepMind, even though they make sure all their releases are interesting scientifically, has a slightly spotty history of releasing full public code. For example, AlphaFold2 was released, but without training scripts. AlphaGeometry turned out to contain bugs. In both cases,.
1
0
2
@friederrrr
Simon Frieder
2 months
2/n This means that, as a tool, it's not something that is etched in stone, and it may change at any point in time, at the whims of DeepMind. Which makes me reluctant to include it in my own tech stack when doing mathematics; I prefer that my workflows not change too often. (Yes,.
1
0
1
@friederrrr
Simon Frieder
2 months
AlphaEvolve has just been released - well, a white paper of sorts has been released. I was interviewed by Nature, and some preliminary opinions are here: (I recommend reading it to see what other researchers say). 1/n My own full deep dive into.
1
2
10
@friederrrr
Simon Frieder
2 months
Climbers' (models') and hills' (benchmarks') popularity evolves on different timescales. Climbers always seem to be getting the attention of the moment (and there are a lot of interesting LLMs released these days, Qwen3 being the most recent one). But hills seem to stand the.
1
0
4
@friederrrr
Simon Frieder
3 months
The anatomy of a successful AIMO2 model is: some SFT first, then clever HW optimizations to get the model to run within the allocated time on the 4xL4 that Kaggle offered for AIMO2, and lastly some light time management to assess when your model has answered a question correctly.
1
0
11
@friederrrr
Simon Frieder
3 months
The nature of mathematics is changing, as mathematics is slowly being transformed from a sole practice of the human mind to something more akin to coding, where you need a physical machine to do math. For some domains of applied math (e.g. numerics of ODEs/PDEs) that was true.
0
1
3
@friederrrr
Simon Frieder
3 months
Numina won AIMO1. NemoSkills tops AIMO2. N... AIMO3?
1
1
13