
Volkan Cevher
@CevherLIONS
Followers: 3K
Following: 5K
Media: 102
Statuses: 1K
Associate Professor of Electrical Engineering, EPFL. Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.
Lausanne, Switzerland
Joined January 2013
@caglarml and I are excited to share our lecture slides for the EE-628 Training Large Language Models course. If you have any feedback, please reach out to us. I am also at #ICLR25.
epfl.ch
Outline: The 2025 course consists of the following topics: Lecture 1 – Architectures; Lecture 2 – Optimization and Hyperparameter Transfer; Lecture 3 – Data Mixtures; Lecture 4 – Fine Tuning; Lecture 5...
RT @tonysilveti: Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip W….
RT @Fanghui_SgrA: I will give the presentation today at 4pm at the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here is the pos….
Excited to give a tutorial with @leenaCvankadara on Training Neural Networks at Any Scale (TRAINS) @icmlconf at 13:30 (West Ballroom A). Our slides can be found here: . Please join us.
RT @Cohere_Labs: Join our ML Theory group next week as they welcome @tonysilveti on July 3rd for a presentation on "Training neural network….
RT @Grigoris_c: 🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: @CevherLIONS @orussakovsky @….
RT @YouJiacheng: If you cite Muon, I think you should definitely cite SSD ( by @CevherLIONS et al. (sorry I can't f….
RT @LucaViano4: Finally, we have expert sample complexity bounds in multi-agent imitation learning! Joint work wi….
RT @MAstronomers: The Sun photographed for more than a year from the same spot at the same time ♾.
RT @tonysilveti: A short and sweet proof of convergence of steepest descent w.r.t. an arbitrary norm in the nonconvex (but smooth) setting.….
RT @bremen79: I have an opening for a post-doc position: I am looking for smart people with a strong CV in optimization and/or online learn….
RT @LucaViano4: 1/n If you are developing a new IL algorithm that alternates between reward and SAC updates, read this new trick named SOAR….
arxiv.org
This paper introduces the SOAR framework for imitation learning. SOAR is an algorithmic template that learns a policy from expert demonstrations with a primal-dual style algorithm that alternates...
RT @caglarml: @CevherLIONS and I have spent a lot of time preparing these course materials on the "foundations of training LLMs." Now, we a….
RT @tmpethick: We'll be presenting our work at #ICLR2025 today on "Efficient Interpolation between Extragradient and Proximal Methods for W….
I would especially like to acknowledge the incredible help of the many talented people who made this happen: @WanyunXie, Yongtao Wu, Leyla Candogan, Mete Erdogan, Francesco Tonin, Frank Wu, Pol Puigdemont, Ioannis Mavrothalassitis, Zhenyu Zhu.
RT @bremen79: FYI if you suspect that a reviewer used an LLM to generate an ICML review, you can report them at this link: .