Volkan Cevher
@CevherLIONS
Followers
3K
Following
5K
Media
102
Statuses
1K
Associate Professor of Electrical Engineering, EPFL. Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.
Lausanne, Switzerland
Joined January 2013
@caglarml and I are excited to share our lecture slides for EE-628 Training Large Language Models course: https://t.co/QGRbwg9MKL If you have any feedback, please reach out to us. I am also at #ICLR25.
epfl.ch
Outline: The 2025 course consists of the following topics: Lecture 1 – Architectures; Lecture 2 – Optimization and Hyperparameter Transfer; Lecture 3 – Data Mixtures; Lecture 4 – Fine Tuning; Lecture 5...
1
8
81
An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
39
136
1K
I am an AC for ICLR 2026. One of the papers in my batch was just withdrawn. The authors wrote a brief response, explaining why the reviewers failed at their job. I agree with most of their comments. The authors gave up. They are fed up. Just like many of us. I understand. We
32
202
1K
@_sdbuchanan @thinkymachines Interesting! If inexactness of the msign solver is a problem, a coarser approximation (+stopping criteria) should be possible by deriving an error-corrected variant of ADMM as an instance of iPPPA from https://t.co/73n1u1N8Ri. Tagging @leloykun since it should also apply to PDHG
1
1
5
It's official: the #PGMODays2025 are coming on November 18-19 at Paris-Saclay! Two days where mathematical optimization, AI, and data science meet to find solutions to concrete but very difficult problems! 👇👇👇 It's at #EDFLab
1
3
6
Check out @yuanhezhang6's thread on our recent work exploring a new pipeline to model step-level reasoning, a “Goldilocks principle” that balances free-form CoT and formal systems! 👇🏾
Introducing DAG-MATH, a new formatted CoT used for evaluating mathematical reasoning ability of LLMs Work w. Ilja Kuzborskij, @jasondeanlee, @CL_Theory, @Fanghui_SgrA Paper: https://t.co/zj1g3KcURl Code: https://t.co/fucdn4ptpc See more details👇
0
1
5
We at TMLR are proud to announce that selected papers will now be eligible for an opportunity to present at the joint NeurIPS/ICML/ICLR Journal-to-Conference (J2C) Track: https://t.co/CyjZtqbnBS
medium.com
Great news! We’re excited to announce that selected papers published in the Transactions on Machine Learning Research (TMLR) will now be…
14
78
460
🚀 We’re organizing an @IJCV special issue on “Post-Training in LLMs for Computer Vision”! With guest editors @Grigoris_c @vidal_rene Dacheng Tao (NTU) and Philip Torr (Oxford) 📘 Learn more & submit your work: https://t.co/8ZQnRiFfFt 📅 Submission deadline: July 15, 2026
0
2
5
To establish power law behavior we need statistical tests. This paper is a nice overview of statistical methods for testing power laws: "Power-Law Distributions in Empirical Data" by A. Clauset, C. R. Shalizi, & M. E. J. Newman, SIAM Review, 51(4), 661–703 https://t.co/jlYNEXrekT 4/5
2
6
82
Check out our 📚 Paper: https://t.co/W2Jl094sUK 🌐 Blogpost: https://t.co/B8Tb1G7diF 𝕏 Thread: https://t.co/ZuvCOfyzSR Finally, huge thanks to @leylacandogan, @abad_rocamora, @polpuigdemont & @CevherLIONS, this wouldn’t have been possible without their support and help!
🚀 Curious about how Linear Transformers perform on bi-directional tasks and how to adapt them? Meet LION ( https://t.co/66yKkHD4R7) 🦁, our new framework for bi-directional sequence modeling that supports: + Full LInear AttentiON (LION) + Bi-directional RNN + Chunkwise Parallel
0
1
1
Highly recommended!
Our new Rhine-AI lab is officially open at the University of Basel! We're currently recruiting for multiple PhD positions. If you're interested, you can register your interest on our new website, application links will be available soon: https://t.co/Nq5vazMxll
0
0
2
- Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning, @LucaViano4 @TFreihaut @GeistMatthieu @CevherLIONS - On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning @TFreihaut
0
1
4
Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip Wolfe. It seems she was the one to come up with the idea of the linear minimization oracle, with Wolfe contributing the convergence proof and presentation.
1
1
13
Actually it’s even older! Spectral stochastic gradient descent from 2015!
1
1
25
I will give the presentation today at 4pm in the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here are the poster and long-version slides ( https://t.co/CKFqWmMbvu) if you’re interested.
(1/n) 🚀 Thrilled to share our LoRA-One work ( https://t.co/3MrW0e0vii) as an #ICML25 oral presentation, w. Fanghui @Fanghui_SgrA (Warwick) and Yudong (Madison). Oral @ West Ballroom B, 4pm on July 17th. Poster @ West Exhibition Hall B2-B3 #W 905, 4:30PM on July 15th
0
2
12
Excited to give a tutorial with @leenaCvankadara on Training Neural Networks at Any Scale (TRAINS) @icmlconf at 13:30 (West Ballroom A). Our slides can be found here: https://t.co/FxWnkYXOrs Please join us.
3
11
80
Join our ML Theory group next week as they welcome @tonysilveti on July 3rd for a presentation on "Training neural networks at any scale" Thanks to @itsmaddox_j @aniervs and @ThangChu77 for organizing this session 👏 Learn more: https://t.co/WUFgriLKYZ
4
13
49
🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: @CevherLIONS @orussakovsky @vidal_rene Open to your questions, the more ambitious the better. In @CVPR : Room 107 A at 12 🎸.
0
2
6
We purposely made it great at optimal transport, as you may have guessed!
Just tested this model on a few challenging math questions and I found it very helpful. Magistral keeps doubting its answers ("wait, but...") & trying to improve them, which makes it great at exploring & exploiting knowledge from its training data (and it's fast). Congrats Mistral!
2
10
126
If you cite Muon, I think you should definitely cite SSD ( https://t.co/h7sbAka0Wq) by @CevherLIONS et al. (sorry I can't find the handle of other authors) -- which proposed spectral descent.
1
17
151
Finally, we have expert sample complexity bounds in multi-agent imitation learning! https://t.co/DZ6JYau5L3 Joint work with @TFreihaut, @CevherLIONS, Matthieu and @gio_ramponi
2
4
31