Quentin Bertrand
@Qu3ntinB
Followers
1K
Following
999
Media
16
Statuses
1K
Researcher at @Inria, affiliated with @Mila_Quebec. Previously, postdoctoral researcher at @Mila_Quebec w/ @SimonLacosteJ and @gauthier_gidel.
Joined January 2021
We need innovative technical and societal solutions to mitigate AI risks. I believe liability insurance for AI developers could be an excellent market-based incentive to drive safety standards and accountability, and is an option worth considering. https://t.co/SXD1pRSSz1
ft.com
Turing Prize winner urges governments to require tech groups to cover catastrophic outcomes and fund safety research
5
5
30
We just shipped a major Mercury refresh. ⚡ Best-in-class quality at up to 10× lower latency. Still the only commercial diffusion LLM in the world. Try the new model.
Mercury is refreshed – with across-the-board improvements in coding, instruction following, math, and knowledge recall. Start building responsive, in-the-flow AI solutions! Read more: https://t.co/QyTVaHAIue
2
8
64
This Stanford professor just raised a $50M Seed and has built a 10x faster and 10x cheaper AI coding model with the performance of Gemini Flash / Haiku. Inception Labs’ Mercury model can implement games like Connect 4 from scratch in ~2s. The speed feels magical, like going
17
28
307
Being part of Mila means joining the world’s largest community of academic researchers in deep learning. Submit your supervision request now for the MSc or PhD for Fall 2026. https://t.co/r01eLcXtZw
0
4
8
Very grateful to @schmidtsciences for being awarded an #AI2050 senior fellowship. And honored to be part of this select 2025 cohort of 7 senior fellows. This award will support our work on a deeper scientific basis for understanding and improving how artificial intelligence
We're excited to welcome 28 new AI2050 Fellows! This 4th cohort of researchers is pursuing projects that include building AI scientists, designing trustworthy models, and improving biological and medical research, among other areas. https://t.co/8oY7xdhxvF
8
4
60
Why and how does gradient/matrix orthogonalization work in Muon for training #LLMs? We introduce an isotropic curvature model to explain it. Take-aways: 1. Orthogonalization is a good idea, "on the right track". 2. But it might not be optimal. [1/n]
3
14
127
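The orthogonalization step discussed above can be illustrated with plain NumPy. This is a minimal sketch via the SVD; Muon itself approximates this map with a Newton-Schulz iteration, and the function name here is my own:

```python
import numpy as np

def orthogonalize(grad: np.ndarray) -> np.ndarray:
    """Map a gradient matrix to its nearest semi-orthogonal matrix.

    Writing grad = U S V^T, this returns U V^T, i.e. it sets every
    singular value to 1 so all update directions get equal magnitude.
    """
    u, _, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ vt

g = np.random.randn(8, 4)
o = orthogonalize(g)
# The result has orthonormal columns: O^T O = I
assert np.allclose(o.T @ o, np.eye(4), atol=1e-8)
```

Flattening the singular-value spectrum is what makes the update "isotropic", which is exactly the structure the curvature model in the thread analyzes.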
Come do a PhD with me 😀! Promise of fun science and great coffee ☕
30
70
731
🌀New paper on the generation phases of Flow Matching https://t.co/tzG2kPVGsE Are FM & diffusion models nothing else than denoisers trained at every noise level? In theory yes, *if trained optimally*. But in practice, do all noise levels matter equally?
6
101
643
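The "denoiser trained at every noise level" view in the tweet above can be made concrete with the standard flow-matching training pair. This is a minimal sketch under the usual linear-interpolation path; the function name is my own:

```python
import numpy as np

def fm_training_pair(x1: np.ndarray, rng: np.random.Generator):
    """Sample one flow-matching regression example for a data point x1.

    t is drawn uniformly, so the vector field is trained at *every*
    noise level; whether all levels matter equally for generation is
    the question the paper asks.
    """
    x0 = rng.standard_normal(x1.shape)  # noise sample
    t = rng.uniform()                   # noise level in (0, 1)
    xt = (1 - t) * x0 + t * x1          # point on the interpolation path
    target = x1 - x0                    # velocity the model regresses on
    return t, xt, target

rng = np.random.default_rng(0)
t, xt, target = fm_training_pair(np.zeros(3), rng)
```

With `x1 = 0` the identities are easy to check by hand: `xt = (1 - t) * x0` and `target = -x0`, so `xt == -(1 - t) * target`.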
🚨🚨 The common myth that value alignment happens at the preference optimization (RLHF) stage is incorrect and has misled years of research 💣. Mehar did a meticulous job showing that LLMs acquire values during SFT, not during preference optimization. Your SFT is probably the
🚨How do LLMs acquire human values?🤔 We often point to preference optimization. However, in our new work, we trace how and when model values shift during post-training and uncover surprising dynamics. We ask: How do data, algorithms, and their interaction shape model values?🧵
2
3
20
Pushed a big update to LM-class (v2025.2) -- this second version makes a much more mature resource Many refinements of lecture slides + significant improvements to the assignments Many thanks to @ch272h @HuaYilun and @shankarpad8 for their work on the assignments
1
5
21
Full stack devs, SWEs, MLEs, forward deployed engineers, research engineers, applied scientists: we are hiring! Join us and tackle cutting-edge challenges including physical AI, time series, material sciences, cybersecurity and many more. Positions available in Paris, London,
jobs.lever.co
Job openings at Mistral AI
31
92
1K
I contemplated whether I should post this, because it seems kind of obvious. But it's often taken for granted, so we might underestimate the impact: e.g. these days, diffusion papers don't usually show samples without guidance anymore (figures from GLIDE https://t.co/2wOdFfRHCK)
Generative modelling used to be about capturing the training data distribution. Interestingly, this stopped being the case when we started actually using them🤔 We tweak temps, use classifier-free guidance and post-train to get a distribution better than the training data.
3
15
153
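The guidance knob mentioned above (classifier-free guidance) is a one-line extrapolation between the unconditional and conditional denoiser predictions. A minimal sketch; variable names are my own:

```python
def cfg(eps_cond: float, eps_uncond: float, w: float) -> float:
    """Classifier-free guidance combination of two denoiser outputs.

    w = 1 recovers the plain conditional model; w > 1 pushes samples
    toward the condition, trading distribution coverage for fidelity.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

assert cfg(2.0, 1.0, 1.0) == 2.0  # w = 1: conditional prediction unchanged
assert cfg(2.0, 1.0, 3.0) == 4.0  # w = 3: conditional direction amplified
```

This is precisely the sense in which sampling no longer targets the training distribution: for w ≠ 1 the combined prediction corresponds to a sharpened distribution rather than the one the model was fit to.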
🚨In our NeurIPS paper, we bring encoder-decoders back.. for diffusion language models! ⚡️Encoder-decoders make diffusion sampling fast: a small (fast) decoder denoises tokens progressively and a large (slower) encoder represents clean context.
8
36
242
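The sampling asymmetry the tweet describes can be sketched abstractly: the large encoder runs once over clean context, while only the small decoder runs at every denoising step. A hypothetical sketch of the call pattern; the real model's interfaces will differ:

```python
def sample(encoder, decoder, context, noisy, steps: int):
    """Encoder-decoder diffusion sampling loop (schematic).

    The expensive encoder is amortized over all denoising steps,
    while the cheap decoder is called `steps` times.
    """
    memory = encoder(context)           # slow: once per generation
    for _ in range(steps):
        noisy = decoder(noisy, memory)  # fast: every denoising step
    return noisy

# Toy stand-ins to show the call pattern:
enc = lambda ctx: len(ctx)   # "representation" of clean context
dec = lambda x, m: x + m     # one "denoising" refinement
assert sample(enc, dec, [1, 2, 3], 0, 4) == 12  # 0 + 3 * 4
```

The speedup comes from the loop body: per-step cost is dominated by the small decoder, not the large encoder.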
Recordings for the talk on "Tiny Recursive Models" are up. https://t.co/6iCRHcX4VL
I will give a presentation on "Tiny Recursion Models" tomorrow at 1pm in the Mila Agora (6650 Saint-Urbain, Montreal). It's open to everyone, feel free to come by!
14
86
651
Introducing Generalised Flow Maps 🎉 A stable, few-step generative model on Riemannian manifolds 🪩 📚 Read it at: https://t.co/iCTHedwCxf 💾 Code: https://t.co/MeukcthFN2
@msalbergo @nmboffi @mmbronstein @bose_joey
3
22
112
Preprint on using ChatGPT to resolve a 42-year-old open problem (point convergence of Nesterov’s accelerated gradient method) is out. Mathematical results are complete, though still need to expand the discussion of historical context & prior work. (1/2) https://t.co/Dmd9huMjXS
arxiv.org
The Nesterov accelerated gradient method, introduced in 1983, has been a cornerstone of optimization theory and practice. Yet the question of its point convergence had remained open. In this work,...
12
68
469
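For context, the 1983 scheme whose point convergence the preprint settles is the two-sequence momentum update below. A minimal sketch in plain Python; the step size and test function are my own:

```python
def nesterov(grad, x0: float, lr: float, steps: int) -> float:
    """Nesterov's accelerated gradient method (1983 formulation)."""
    x, y, t = x0, x0, 1.0
    for _ in range(steps):
        x_next = y - lr * grad(y)                     # gradient step at y
        t_next = (1 + (1 + 4 * t * t) ** 0.5) / 2
        y = x_next + (t - 1) / t_next * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Minimize f(x) = (x - 3)^2; the iterates x_k approach the minimizer 3.
x_star = nesterov(lambda x: 2 * (x - 3), x0=0.0, lr=0.1, steps=200)
assert abs(x_star - 3.0) < 1e-6
```

Function values are long known to converge at the accelerated O(1/k²) rate; the open question was whether the iterates x_k themselves converge to a single point.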
I am potentially recruiting a postdoctoral fellow through this program. If interested, name me as a mentor, and ping me to let me know that you are applying! The process includes some sort of interview, so I can try to squeeze a few of these in advance (it will help a lot!)
.@Cornell is recruiting for multiple postdoctoral positions in AI as part of two programs: Empire AI Fellows and Foundational AI Fellows. Positions are available in NYC and Ithaca. Deadline for full consideration is Nov 20, 2025! https://t.co/Cp5710BauU
1
7
18
I will post the recordings after, don't worry.
9
2
116
I will give a presentation on "Tiny Recursion Models" tomorrow at 1pm in the Mila Agora (6650 Saint-Urbain, Montreal). It's open to everyone, feel free to come by!
20
18
279