Volkan Cevher

@CevherLIONS

Followers
3K
Following
5K
Media
102
Statuses
1K

Associate Professor of Electrical Engineering, EPFL. Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.

Lausanne, Switzerland
Joined January 2013
@CevherLIONS
Volkan Cevher
7 months
@caglarml and I are excited to share our lecture slides for EE-628 Training Large Language Models course: https://t.co/QGRbwg9MKL If you have any feedback, please reach out to us. I am also at #ICLR25.
epfl.ch
Outline: The 2025 course consists of the following topics: Lecture 1 – Architectures; Lecture 2 – Optimization and Hyperparameter Transfer; Lecture 3 – Data Mixtures; Lecture 4 – Fine Tuning; Lecture 5...
1
8
81
@micahgoldblum
Micah Goldblum
4 days
An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
39
136
1K
@peter_richtarik
Peter Richtarik
4 days
I am an AC for ICLR 2026. One of the papers in my batch was just withdrawn. The authors wrote a brief response, explaining why the reviewers failed at their job. I agree with most of their comments. The authors gave up. They are fed up. Just like many of us. I understand. We
32
202
1K
@tmpethick
Thomas Pethick
14 days
@_sdbuchanan @thinkymachines Interesting! If inexactness of the msign solver is a problem, a coarser approximation (+ stopping criteria) should be possible by deriving an error-corrected variant of ADMM as an instance of iPPPA from https://t.co/73n1u1N8Ri. Tagging @leloykun since it should also apply to PDHG
1
1
5
@sdyevre
Une fille du lab ⚡️
22 days
It's official: #PGMODays2025 is coming on November 18-19 at Paris-Saclay! Two days where mathematical optimization, AI, and data science meet to find solutions to concrete but very difficult problems! 👇👇👇 It's at #EDFLab
1
3
6
@Fanghui_SgrA
Fanghui Liu
24 days
Check out @yuanhezhang6's thread on our recent work exploring a new pipeline to model step-level reasoning, a “Goldilocks principle” that balances free-form CoT and formal systems! 👇🏾
@yuanhezhang6
Yuanhe Zhang
24 days
Introducing DAG-MATH, a new formatted CoT for evaluating the mathematical reasoning ability of LLMs. Work w. Ilja Kuzborskij, @jasondeanlee, @CL_Theory, @Fanghui_SgrA Paper: https://t.co/zj1g3KcURl Code: https://t.co/fucdn4ptpc See more details👇
0
1
5
@hugo_larochelle
Hugo Larochelle
27 days
We at TMLR are proud to announce that selected papers will now be eligible for an opportunity to present at the joint NeurIPS/ICML/ICLR Journal-to-Conference (J2C) Track: https://t.co/CyjZtqbnBS
medium.com
Great news! We’re excited to announce that selected papers published in the Transactions on Machine Learning Research (TMLR) will now be…
14
78
460
@Fanghui_SgrA
Fanghui Liu
25 days
🚀 We’re organizing an @IJCV special issue on “Post-Training in LLMs for Computer Vision”! With guest editors @Grigoris_c, @vidal_rene, Dacheng Tao (NTU), and Philip Torr (Oxford) 📘 Learn more & submit your work: https://t.co/8ZQnRiFfFt 📅 Submission deadline: July 15, 2026
0
2
5
@docmilanfar
Peyman Milanfar
1 month
To establish power law behavior we need statistical tests. This paper is a nice overview of statistical methods for testing power laws: "Power-Law Distributions in Empirical Data" by A. Clauset, C. R. Shalizi, & M. E. J. Newman, SIAM Review, 51(4), 661–703 https://t.co/jlYNEXrekT 4/5
2
6
82
@rshia_afz
Arshia Afzal
2 months
Check out our 📚 Paper: https://t.co/W2Jl094sUK 🌐 Blogpost: https://t.co/B8Tb1G7diF 𝕏 Thread: https://t.co/ZuvCOfyzSR Finally, huge thanks to @leylacandogan, @abad_rocamora, @polpuigdemont & @CevherLIONS, this wouldn’t have been possible without their support and help!
@CevherLIONS
Volkan Cevher
9 months
🚀 Curious about how Linear Transformers perform on bi-directional tasks and how to adapt them? Meet LION ( https://t.co/66yKkHD4R7) 🦁, our new framework for bi-directional sequence modeling that supports: + Full LInear AttentiON (LION) + Bi-directional RNN + Chunkwise Parallel
0
1
1
@CevherLIONS
Volkan Cevher
2 months
Highly recommended!
@ilijabogunovic
Ilija Bogunovic
2 months
Our new Rhine-AI lab is officially open at the University of Basel! We're currently recruiting for multiple PhD positions. If you're interested, you can register your interest on our new website, application links will be available soon: https://t.co/Nq5vazMxll
0
0
2
@gio_ramponi
Giorgia Ramponi
2 months
- Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning, @LucaViano4 @TFreihaut @GeistMatthieu @CevherLIONS - On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning @TFreihaut
0
1
4
@tonysilveti
Tony S.F.
3 months
Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip Wolfe. It seems she was the one to come up with the idea of the linear minimization oracle, with Wolfe contributing the convergence proof and presentation.
1
1
13
@_arohan_
rohan anil
4 months
Actually it's even older! Spectral stochastic gradient descent from 2015!
1
1
25
@Fanghui_SgrA
Fanghui Liu
4 months
I will give the presentation today at 4pm in the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here are the poster and long-version slides (https://t.co/CKFqWmMbvu) if you’re interested.
@yuanhezhang6
Yuanhe Zhang
4 months
(1/n) 🚀 Thrilled to share our LoRA-One work (https://t.co/3MrW0e0vii) as an #ICML25 𝐨𝐫𝐚𝐥 𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧, w. Fanghui @Fanghui_SgrA (Warwick) and Yudong (Madison). Oral @ West Ballroom B, 4pm on July 17th. Poster @ West Exhibition Hall B2-B3 #W 905, 4:30PM on July 15th
0
2
12
@CevherLIONS
Volkan Cevher
4 months
Excited to give a tutorial with @leenaCvankadara on Training Neural Networks at Any Scale (TRAINS) @icmlconf at 13:30 (West Ballroom A). Our slides can be found here: https://t.co/FxWnkYXOrs Please join us.
3
11
80
@Cohere_Labs
Cohere Labs
5 months
Join our ML Theory group next week as they welcome @tonysilveti on July 3rd for a presentation on "Training neural networks at any scale" Thanks to @itsmaddox_j @aniervs and @ThangChu77 for organizing this session 👏 Learn more: https://t.co/WUFgriLKYZ
4
13
49
@Grigoris_c
Grigoris Chrysos
5 months
🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: @CevherLIONS @orussakovsky @vidal_rene Open to your questions, the more ambitious the better. In @CVPR : Room 107 A at 12 🎸.
0
2
6
@arthurmensch
Arthur Mensch
5 months
We purposely made it great at optimal transport as you may have guessed !
@LenaicChizat
Lénaïc Chizat
5 months
Just tested this model on a few challenging math questions and I found it very helpful. Magistral keeps doubting its answers ("wait, but...") & trying to improve them, which makes it great at exploring & exploiting knowledge from its train data (and it's fast). Congrats Mistral !
2
10
126
@YouJiacheng
You Jiacheng
5 months
If you cite Muon, I think you should definitely cite SSD (https://t.co/h7sbAka0Wq) by @CevherLIONS et al. (sorry I can't find the handles of the other authors) -- which proposed spectral descent.
1
17
151
@LucaViano4
Luca Viano
6 months
Finally, we have expert sample complexity bounds in multi agent imitation learning! https://t.co/DZ6JYau5L3 Joint work with @TFreihaut, @CevherLIONS, Matthieu and @gio_ramponi
2
4
31