
Fabian Schaipp
@FSchaipp
1K Followers · 1K Following · 70 Media · 419 Statuses
working on optimization for machine learning. currently postdoc @inria_paris. sbatch and apero.
Paris, France
Joined July 2020
Learning rate schedules seem mysterious? Turns out that their behaviour can be described with a bound from *convex, nonsmooth* optimization. Short thread on our latest paper.
The sudden loss drop when annealing the learning rate at the end of a WSD (warmup-stable-decay) schedule can be explained without relying on non-convexity or even smoothness. A new paper shows that it can be precisely predicted by theory in the convex, non-smooth setting! 1/2
5
26
131
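For context, a minimal sketch of the WSD (warmup-stable-decay) schedule the thread refers to: linear warmup, a constant plateau, then an anneal to zero at the end of training, which is where the sudden loss drop appears. The function name, default fractions, and linear decay shape are illustrative assumptions, not taken from the paper.

```python
def wsd_lr(step, total_steps, peak_lr=1e-3, warmup_frac=0.05, decay_frac=0.2):
    """Warmup-stable-decay schedule (illustrative sketch).

    Linear warmup to peak_lr, constant "stable" phase, then a linear
    anneal to zero over the final decay_frac of training.
    """
    warmup_steps = int(warmup_frac * total_steps)
    decay_start = int((1 - decay_frac) * total_steps)
    if step < warmup_steps:
        # linear warmup phase
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # stable phase: constant learning rate
        return peak_lr
    # final anneal: learning rate decays linearly to zero;
    # this is the phase where the sudden loss drop occurs
    return peak_lr * (total_steps - step) / (total_steps - decay_start)
```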
RT @mblondel_ml: We uploaded V3 of our draft book "The Elements of Differentiable Programming". Lots of typo fixes, clarity improvements, n…
0
75
0
on a more serious note: thanks to @fpedregosa and colleagues for this benchmark. happy to see MoMo works reasonably well out of the box on problems we never tested it on.
0
0
5
stand up for a clean references.bib! if you want all papers from @NeurIPSConf, @icmlconf and @iclr_conf in one single bib file, this is for you. Just updated with ICLR 2025 proceedings.
Want all NeurIPS/ICML/ICLR papers in one single .bib file? Here you go! Short blog: Bib files:
1
1
9
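One way to work with such a combined bib file is to filter it down to the entries you actually cite. A minimal sketch assuming the v1 bibtexparser API; the filename ml_proceedings.bib and the booktitle match string are hypothetical and depend on how the file was generated.

```python
import bibtexparser

# Load the combined NeurIPS/ICML/ICLR bib file (filename is hypothetical).
with open("ml_proceedings.bib") as f:
    db = bibtexparser.load(f)

# Keep only ICLR 2025 entries; the exact booktitle string in the real
# file may differ, so the substring check is an assumption.
iclr25 = [
    e for e in db.entries
    if e.get("year") == "2025" and "ICLR" in e.get("booktitle", "")
]
print(f"{len(iclr25)} ICLR 2025 entries found")
```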