
Louis Béthune (@LouisBAlgue)
126 Followers · 200 Following · 9 Media · 64 Statuses
Please constrain the Lipschitz constant of your networks.
Toulouse, France · Joined July 2020
We propose new scaling laws that predict the optimal data mixture for pretraining LLMs, native multimodal models, and large vision encoders! Only small-scale experiments need to be run, and we can then extrapolate to large-scale ones. These laws allow 1/n 🧵
6 replies · 49 reposts · 268 likes
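For readers curious what the small-scale-to-large-scale extrapolation looks like mechanically, here is a minimal sketch of the generic scaling-law workflow: fit a saturating power law on cheap small runs, then evaluate it at a large budget. The functional form, the synthetic measurements, and all constants below are illustrative assumptions, not the paper's actual law.

```python
# Minimal sketch: fit L(C) = E + A * C**(-alpha) on small-scale runs,
# then extrapolate to a larger compute budget. Synthetic data; the
# functional form is a common scaling-law choice, not the paper's.
import numpy as np
from scipy.optimize import curve_fit

def power_law(C, E, A, alpha):
    # Irreducible loss E plus a power-law term that shrinks with compute.
    return E + A * C ** (-alpha)

compute = np.array([1e17, 3e17, 1e18, 3e18, 1e19])   # small-scale budgets
loss = np.array([3.90, 3.54, 3.25, 3.07, 2.93])      # measured losses (toy)

params, _ = curve_fit(power_law, compute, loss, p0=[2.5, 1.5e5, 0.3], maxfev=10_000)
E, A, alpha = params

# Extrapolate two orders of magnitude beyond the fitted range.
print(f"Predicted loss at 1e21 FLOPs: {power_law(1e21, E, A, alpha):.3f}")
```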
@AmitisShidani1 @samira_abnar @harshays_ @alaa_nouby @AggieInCA and Scaling Laws for Forgetting and Fine-Tuning (E-2708) with @LouisBAlgue, David Grangier, Eleonora Gualdoni, Marco Cuturi, and @PierreAblin 🔗 https://t.co/c8xqFTf3ZE
1 reply · 1 repost · 3 likes
This paper maps hardware-cost sweet spots for training efficient small-scale language models. The data shows the A100-40GB beats the H100 for cost-effective training of small language models 🎯 Original problem: training small-scale LLMs (under 2B parameters) faces unclear computational
4 replies · 3 reposts · 18 likes
Apple releases AIMv2: Multimodal Autoregressive Pre-training of Large Vision Encoders
4 replies · 85 reposts · 552 likes
🍏 Apple ML research in Paris has multiple open internship positions!🍎 We are looking for Ph.D. students interested in generative modeling, optimization, large-scale learning or uncertainty quantification, with applications to challenging scientific problems. Details below 👇
4 replies · 79 reposts · 581 likes
👋👨‍🍳🍵 After a year of cooking up a secret project, I'm thrilled to officially reveal: The 𝐋𝐄𝐍𝐒 𝐏𝐫𝐨𝐣𝐞𝐜𝐭. By combining modern tools of Explainable AI, how much of a ResNet50 can we explain? 🧶
11 replies · 67 reposts · 270 likes
📢 *PhD opening* at @inria_grenoble! Edouard Pauwels, @vaiter, and I are looking for a student to work with us on learning theory for bilevel optimization, in particular the implicit bias of bilevel optimization. If interested, please reach out!
1 reply · 36 reposts · 68 likes
If you're interested in a student researcher position at Google DeepMind in 2024, please apply here https://t.co/2qbncpDPW3 before December 15. My team will be looking for a student to work on LLM finetuning, on site in Paris.
7 replies · 52 reposts · 269 likes
Mathieu Serrurier, Franck Mamalet, @Napoolar, @ThibautBoissin, and I will be there to present it in panel #1508 on Tuesday afternoon. Come see us to chat! 👋
0 replies · 0 reposts · 3 likes
Furthermore, the saliency maps are less noisy than those of conventional models and, importantly, more aligned with humans!
1 reply · 1 repost · 3 likes
Our method, dubbed OTNN, trains a classifier with 1-Lipschitz neural networks and a loss inspired by optimal transport. We show that the classifier's gradient behaves like a Monge map, which is super useful for generating counterfactual examples!
1 reply · 0 reposts · 4 likes
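To make the Monge-map remark concrete: if the gradient of a 1-Lipschitz classifier points along an optimal-transport direction between the two class distributions, a counterfactual can be sketched as gradient ascent on the logit. The snippet below is an illustration under that reading, not the OTNN implementation; it uses PyTorch's spectral normalization as a rough stand-in for a properly constrained 1-Lipschitz architecture.

```python
# Illustrative sketch (not the paper's code): counterfactual examples by
# following the gradient of a Lipschitz-constrained binary classifier.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

# Spectral normalization bounds each linear layer's Lipschitz constant by ~1,
# a cheap stand-in for the stricter 1-Lipschitz layers used in the paper.
model = nn.Sequential(
    spectral_norm(nn.Linear(2, 64)), nn.ReLU(),
    spectral_norm(nn.Linear(64, 64)), nn.ReLU(),
    spectral_norm(nn.Linear(64, 1)),
)

def counterfactual(x, steps=50, lr=0.1):
    """Move x toward the opposite class by ascending the logit gradient.

    If the gradient behaves like a Monge map, this path approximates an
    optimal-transport displacement between the two class distributions.
    """
    x = x.clone().requires_grad_(True)
    direction = -torch.sign(model(x)).detach()  # push toward the other class
    for _ in range(steps):
        grad, = torch.autograd.grad(model(x).sum(), x)
        x = (x + lr * direction * grad).detach().requires_grad_(True)
    return x.detach()

x0 = torch.randn(1, 2)
print("logit before:", model(x0).item(), "| after:", model(counterfactual(x0)).item())
```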
Interested in results at the intersection between explainability🔍 and optimal transport 🚚? Come check out "On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective" on Tuesday at 5:15pm, panel #1508.
2 replies · 7 reposts · 20 likes
Complexity theory is underrated in deep learning, by the way.
0 replies · 0 reposts · 1 like
LLMs need self-modifying code. Scaling up the context won't be enough. Continual learning + machine unlearning are necessary ingredients for true read/write operations.
1 reply · 0 reposts · 3 likes
👋 Explain big vision models with 𝐂𝐑𝐀𝐅𝐓 🪄🐰 A method that 𝙖𝙪𝙩𝙤𝙢𝙖𝙩𝙞𝙘𝙖𝙡𝙡𝙮 extracts the most important concepts from your favorite pre-trained vision model. E.g., we automatically discover the most important concepts a ResNet50 uses for rabbits: eyes, ears, fur. 🧶
3 replies · 76 reposts · 307 likes
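As a rough sketch of the underlying idea, concept-extraction methods in this family factorize the non-negative activations of a vision backbone so that each factor becomes a candidate "concept". The snippet below is a loose illustration using torchvision and scikit-learn, not the official CRAFT implementation; the random tensors stand in for real image crops from one class.

```python
# Loose sketch of activation factorization for concept discovery (the idea
# behind methods like CRAFT), not the official implementation.
import torch
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.decomposition import NMF

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()
backbone = torch.nn.Sequential(*list(model.children())[:-2])  # keep spatial map

# `images` stands in for a batch of preprocessed crops from one class
# (e.g. rabbits); random tensors keep the sketch self-contained.
images = torch.randn(16, 3, 224, 224)

with torch.no_grad():
    feats = backbone(images)                        # (16, 2048, 7, 7)
acts = feats.permute(0, 2, 3, 1).reshape(-1, 2048)  # one row per spatial patch
acts = torch.relu(acts).numpy()                     # NMF needs non-negative input

# Each NMF component is a candidate "concept"; U says how strongly each patch
# expresses each concept, which is what gets ranked and visualized.
nmf = NMF(n_components=10, init="nndsvda", max_iter=400)
U = nmf.fit_transform(acts)      # (patches, concepts)
concepts = nmf.components_       # (concepts, 2048)
print(U.shape, concepts.shape)
```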
New work on kernel regression on distributions https://t.co/4oJA2PU1JQ, where we prove a faster rate of convergence! Applications to forecasting the distributional variability of the 2016 US presidential election. @ANITI_Toulouse @LouisBAlgue @FBachoc
[arxiv.org] The distribution regression problem encompasses many important statistics and machine learning tasks, and arises in a large range of applications. Among various existing approaches to tackle this...
0 replies · 1 repost · 6 likes
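For readers unfamiliar with distribution regression, a standard recipe is to embed each input distribution via an empirical kernel mean embedding and run kernel ridge regression on top. The sketch below illustrates that generic pipeline on toy data; it is an assumption-laden stand-in, not necessarily the estimator analyzed in the paper.

```python
# Hedged sketch of distribution regression: each input is a *bag of samples*
# from a distribution, the kernel between two bags is the inner product of
# their empirical mean embeddings, and the regressor is kernel ridge.
import numpy as np

def gaussian_gram(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def bag_kernel(bag_a, bag_b, sigma=1.0):
    """Inner product of empirical mean embeddings of two sample bags."""
    return gaussian_gram(bag_a, bag_b, sigma).mean()

rng = np.random.default_rng(0)
# Toy task: each bag is drawn from N(mu, 1); the regression label is mu.
mus = rng.uniform(-2, 2, size=30)
bags = [rng.normal(mu, 1.0, size=(50, 1)) for mu in mus]

K = np.array([[bag_kernel(a, b) for b in bags] for a in bags])
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(bags)), mus)  # ridge solve

test_bag = rng.normal(1.5, 1.0, size=(50, 1))
k_test = np.array([bag_kernel(test_bag, b) for b in bags])
print("predicted mean:", k_test @ alpha)  # should land near 1.5
```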
We are looking for a research engineer to work on domain adaptation and transfer learning at @Polytechnique, near Paris. Come do research with us and build open-source Python software and benchmarks. Contact me by email if interested. Please RT (free users need to help each other).
2 replies · 50 reposts · 69 likes
I am at #ICML2023 to present my latest work. Do humans still perform better than diffusion models on the one-shot drawing task? Attend my oral presentation today to find out! More details below: https://t.co/8C24jY16K3
Our article "Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?" (https://t.co/0qj3itzNrT) has been accepted at #icml2023 and selected as an oral 🎉🎊! 🤖 = 👨🏻‍🎨 ?? (1/5) 🧵
0 replies · 6 reposts · 14 likes
We're launching Keras Core, a new library that brings the Keras API to JAX and PyTorch in addition to TensorFlow. It enables you to write cross-framework deep learning components and to benefit from the best that each framework has to offer. Read more: https://t.co/xmmxBfSZgh
121 replies · 782 reposts · 4K likes
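To see what "cross-framework" means in practice: the same Keras model definition runs on whichever backend the KERAS_BACKEND environment variable selects. Below is a minimal sketch with the keras_core package; the toy layer sizes are just for illustration.

```python
# Minimal sketch: one Keras model definition, backend chosen by env var.
# Set it before importing keras_core; "jax", "torch", and "tensorflow" work.
import os
os.environ["KERAS_BACKEND"] = "jax"

import keras_core as keras

model = keras.Sequential([
    keras.layers.Input(shape=(784,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()  # identical code compiles against JAX, PyTorch, or TensorFlow
```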
I'm glad to share that our paper "COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP" (https://t.co/0VgZWrdfP4) was accepted at Findings of #ACL2023! ❤️🦜 #ACL2023NLP #NLProc #XAI 1/6🧵
3 replies · 17 reposts · 136 likes