Volkan Cevher @CevherLIONS X Profile

Volkan Cevher

@CevherLIONS

Followers

3K

Following

5K

Media

102

Statuses

1K

Associate Professor of Electrical Engineering, EPFL. Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.

https://t.co/nvUh5yZwYw

Lausanne, Switzerland

Joined January 2013


Volkan Cevher

@CevherLIONS

7 months

@caglarml and I are excited to share our lecture slides for the EE-628 Training Large Language Models course: https://t.co/QGRbwg9MKL If you have any feedback, please reach out to us. I am also at #ICLR25.

epfl.ch

Outline: The 2025 course consists of the following topics: Lecture 1 – Architectures; Lecture 2 – Optimization and Hyperparameter Transfer; Lecture 3 – Data Mixtures; Lecture 4 – Fine Tuning; Lecture 5...

1

8

81

Micah Goldblum

@micahgoldblum

4 days

An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3

39

136

1K

Peter Richtarik

@peter_richtarik

4 days

I am an AC for ICLR 2026. One of the papers in my batch was just withdrawn. The authors wrote a brief response, explaining why the reviewers failed at their job. I agree with most of their comments. The authors gave up. They are fed up. Just like many of us. I understand. We

32

202

1K

Thomas Pethick

@tmpethick

14 days

@_sdbuchanan @thinkymachines Interesting! If inexactness of the msign solver is a problem, a coarser approximation (+stopping criteria) should be possible by deriving an error corrected variant of ADMM as an instance iPPPA from https://t.co/73n1u1N8Ri. Tagging @leloykun since it should also apply to PDHG

1

5

Une fille du lab ⚡️

@sdyevre

22 days

It's official: the #PGMODays2025 are coming on November 18-19 to Paris-Saclay! Two days where mathematical optimization, AI, and data science meet to find solutions to concrete but very difficult problems! 👇👇👇 It's at #EDFLab

1

3

6

Fanghui Liu

@Fanghui_SgrA

24 days

Check out @yuanhezhang6's thread on our recent work exploring a new pipeline to model step-level reasoning, a “Goldilocks principle” that balances free-form CoT and formal systems! 👇🏾

Yuanhe Zhang

@yuanhezhang6

24 days

Introducing DAG-MATH, a new formatted CoT used for evaluating mathematical reasoning ability of LLMs Work w. Ilja Kuzborskij, @jasondeanlee, @CL_Theory, @Fanghui_SgrA Paper: https://t.co/zj1g3KcURl Code: https://t.co/fucdn4ptpc See more details👇

0

1

5

Hugo Larochelle

@hugo_larochelle

27 days

We at TMLR are proud to announce that selected papers will now be eligible for an opportunity to present at the joint NeurIPS/ICML/ICLR Journal-to-Conference (J2C) Track: https://t.co/CyjZtqbnBS

medium.com

Great news! We’re excited to announce that selected papers published in the Transactions on Machine Learning Research (TMLR) will now be…

14

78

460

Fanghui Liu

@Fanghui_SgrA

25 days

🚀 We’re organizing @IJCV special issue on “Post-Training in LLMs for Computer Vision”! With guest editors @Grigoris_c @vidal_rene Dacheng Tao (NTU) and Philip Torr (Oxford) 📘 Learn more & submit your work: https://t.co/8ZQnRiFfFt 📅 Submission ddl: July 15, 2026

0

2

5

Peyman Milanfar

@docmilanfar

1 month

To establish power law behavior we need statistical tests. This paper is a nice overview of statistical methods for testing power laws "Power-Law Distributions in Empirical Data" by A Clauset, CR Shalizi, & MEJ Newman SIAM Review, 51(4), 661–703 https://t.co/jlYNEXrekT 4/5

2

6

82

Arshia Afzal

@rshia_afz

2 months

Check out our 📚 Paper: https://t.co/W2Jl094sUK 🌐 Blogpost: https://t.co/B8Tb1G7diF 𝕏 Thread: https://t.co/ZuvCOfyzSR Finally, huge thanks to @leylacandogan, @abad_rocamora, @polpuigdemont & @CevherLIONS, this wouldn’t have been possible without their support and help!

Volkan Cevher

@CevherLIONS

9 months

🚀 Curious about how Linear Transformers perform on bi-directional tasks and how to adapt them? Meet LION ( https://t.co/66yKkHD4R7) 🦁, our new framework for bi-directional sequence modeling that supports: + Full LInear AttentiON (LION) + Bi-directional RNN + Chunkwise Parallel

0

1

Volkan Cevher

@CevherLIONS

2 months

Highly recommended!

Ilija Bogunovic

@ilijabogunovic

2 months

Our new Rhine-AI lab is officially open at the University of Basel! We're currently recruiting for multiple PhD positions. If you're interested, you can register your interest on our new website, application links will be available soon: https://t.co/Nq5vazMxll

0

2

Giorgia Ramponi

@gio_ramponi

2 months

- Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning, @LucaViano4 @TFreihaut @GeistMatthieu @CevherLIONS - On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning @TFreihaut

0

1

4

Tony S.F.

@tonysilveti

3 months

Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip Wolfe. It seems she was the one to come up with the idea of the linear minimization oracle, with Wolfe contributing the convergence proof and presentation.

1

13

rohan anil

@_arohan_

4 months

Actually it's even older! Spectral stochastic gradient descent from 2015!

1

25

Fanghui Liu

@Fanghui_SgrA

4 months

I will give the presentation today at 4pm at the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here are the poster and long-version slides (https://t.co/CKFqWmMbvu) if you're interested.

Yuanhe Zhang

@yuanhezhang6

4 months

(1/n) 🚀Thrilled to share our LoRA-One work (https://t.co/3MrW0e0vii) as an #ICML25 𝐨𝐫𝐚𝐥 𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧, w. Fanghui @Fanghui_SgrA (Warwick) and Yudong (Madison). Oral @ West Ballroom B, 4pm on July 17th. Poster @ West Exhibition Hall B2-B3 #W 905, 4:30PM on July 15th.

0

2

12

Volkan Cevher

@CevherLIONS

4 months

Excited to give a tutorial with @leenaCvankadara on Training Neural Networks at Any Scale (TRAINS) @icmlconf at 13:30 (West Ballroom A). Our slides can be found here: https://t.co/FxWnkYXOrs Please join us.

3

11

80

Cohere Labs

@Cohere_Labs

5 months

Join our ML Theory group next week as they welcome @tonysilveti on July 3rd for a presentation on "Training neural networks at any scale" Thanks to @itsmaddox_j @aniervs and @ThangChu77 for organizing this session 👏 Learn more: https://t.co/WUFgriLKYZ

4

13

49

Grigoris Chrysos

@Grigoris_c

5 months

🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: @CevherLIONS @orussakovsky @vidal_rene Open to your questions, the more ambitious the better. In @CVPR : Room 107 A at 12 🎸.

0

2

6

Arthur Mensch

@arthurmensch

5 months

We purposely made it great at optimal transport as you may have guessed !

Lénaïc Chizat

@LenaicChizat

5 months

Just tested this model on a few challenging math questions and I found it very helpful. Magistral keeps doubting its answers ("wait, but...") & trying to improve them, which makes it great at exploring & exploiting knowledge from its train data (and it's fast). Congrats Mistral !

2

10

126

You Jiacheng

@YouJiacheng

5 months

If you cite Muon, I think you should definitely cite SSD ( https://t.co/h7sbAka0Wq) by @CevherLIONS et al. (sorry I can't find the handle of other authors) -- which proposed spectral descent.

1

17

151

Luca Viano

@LucaViano4

6 months

Finally, we have expert sample complexity bounds in multi agent imitation learning! https://t.co/DZ6JYau5L3 Joint work with @TFreihaut, @CevherLIONS, Matthieu and @gio_ramponi

2

4

31