Sander Dieleman Profile
Sander Dieleman

@sedielem

Followers: 64K · Following: 12K · Media: 106 · Statuses: 2K

Research Scientist at Google DeepMind (WaveNet, Imagen, Veo). I tweet about deep learning (research + software), music, generative models (personal account).

London, England
Joined December 2014
@sedielem
Sander Dieleman
7 months
New blog post: let's talk about latents! https://t.co/Ddh7tXH642
Link card: sander.ai · Latent representations for generative models.
31
201
1K
@sedielem
Sander Dieleman
9 hours
@PatrickPyn35903 @thjashin @ruqi_zhang Here's a thread by the lead author!
@PatrickPyn35903
Patrick Pynadath
1 day
Continuous diffusion dominates images but fails on discrete data, despite learning continuous gradients that should enable coordinated updates. "CANDI: Hybrid Discrete-Continuous Diffusion Models" explains why, and how hybrid diffusion fixes it! (1/8)
0
0
6
@sedielem
Sander Dieleman
9 hours
The rehabilitation of continuous diffusion for discrete data continues! Check out CANDI by @PatrickPyn35903 @thjashin @ruqi_zhang. Their insightful analysis explains why continuous methods have fallen behind, and why self-conditioning is so important. https://t.co/Bqn8Zd7hRz
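For context, self-conditioning here means feeding the model's previous clean-data estimate back in as an extra input at each sampling step (in the sense popularised by Chen et al.'s Analog Bits). A minimal sketch of the sampling-time mechanics, with an illustrative `denoise` callable; this is not the CANDI implementation:

```python
import numpy as np

def sample_with_self_conditioning(denoise, x_T, alphas):
    """Toy reverse-diffusion loop with self-conditioning.

    denoise(x_t, t, x0_prev) -> estimate of the clean data x0, where
    x0_prev is the model's own previous estimate (zeros on the first
    step). alphas[t] is the cumulative signal level at step t, with
    alphas[0] = 1.0 (fully clean), as in a standard DDIM-style loop.
    """
    x_t = x_T
    x0_hat = np.zeros_like(x_T)                  # no estimate yet on step one
    for t in reversed(range(1, len(alphas))):
        x0_hat = denoise(x_t, t, x0_hat)         # self-conditioning: reuse x0_hat
        # deterministic DDIM-style update towards the next noise level
        eps = (x_t - np.sqrt(alphas[t]) * x0_hat) / np.sqrt(1 - alphas[t])
        x_t = np.sqrt(alphas[t - 1]) * x0_hat + np.sqrt(1 - alphas[t - 1]) * eps
    return x0_hat
```

At training time the usual trick is to compute a first prediction without conditioning and feed it back in (with a stop-gradient) for a fraction of training steps, so the model learns to use its own estimates.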
@sedielem
Sander Dieleman
19 days
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
1
11
60
@sedielem
Sander Dieleman
21 hours
Some prefer math and rigour, personally I like intuitive explanations. This monograph has plenty of both! I love how much time is spent linking different perspectives (variational, score-based, flow-based) together. Chapter 6 in particular is really great. Amazing effort! 👏
@JCJesseLai
Chieh-Hsin (Jesse) Lai
1 day
Tired of going back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》 with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core
4
20
276
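For readers skimming past the thread, the bridge between those perspectives can be stated in two standard results (notation mine, not the monograph's). The probability flow ODE shows every score-based diffusion has an equivalent deterministic flow,

$$\frac{\mathrm{d}x}{\mathrm{d}t} = f(x,t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x),$$

and Tweedie's formula ties the score to the variational (denoising) view: for $x_t = \alpha_t x_0 + \sigma_t \varepsilon$,

$$\nabla_{x_t} \log p_t(x_t) = \frac{\alpha_t\, \mathbb{E}[x_0 \mid x_t] - x_t}{\sigma_t^2},$$

so a learned denoiser, a learned score, and a learned flow velocity are interchangeable parameterisations of the same object.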
@GoogleDeepMind
Google DeepMind
15 days
Veo is getting a major upgrade. 🚀 We’re rolling out Veo 3.1, our updated video generation model, alongside improved creative controls for filmmakers, storytellers, and developers - many of them with audio. 🧵
123
427
2K
@FEijkelboom
Floor Eijkelboom
15 days
We asked the same question: how can we combine the strengths of continuous and discrete approaches? Similar to CDCD, in our work, Purrception, we extend Variational FM to model VQ latents through continuous-discrete transport for image generation :D 👉 https://t.co/KIog9mLNWb
@sedielem
Sander Dieleman
19 days
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
1
12
70
@sedielem
Sander Dieleman
16 days
In my blog post on latents for generative modelling, I pointed out that representation learning and reconstruction are two separate tasks (§6.3), which autoencoders try to solve simultaneously. Separating them makes sense. It opens up a lot of possibilities, as this work shows!
@sainingxie
Saining Xie
16 days
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
9
23
350
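The separation argued for in §6.3 can be sketched directly: keep a pretrained representation encoder frozen and learn only a decoder for reconstruction. A toy numpy sketch with a stand-in frozen encoder; illustrative only, not the RAE paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_x, D_z = 256, 32, 8
W_enc = rng.normal(size=(D_x, D_z))   # frozen: never updated below
X = rng.normal(size=(N, D_x))         # toy "images"

def frozen_encoder(x):
    """Stand-in for a pretrained representation model (e.g. a frozen
    self-supervised encoder); here just a fixed random projection.
    Crucially, it is NOT trained for reconstruction."""
    return x @ W_enc

# Representation learning is already done (frozen encoder); only the
# reconstruction task is learned: fit a decoder z -> x. With a linear
# decoder this reduces to least squares.
Z = frozen_encoder(X)
W_dec, *_ = np.linalg.lstsq(Z, X, rcond=None)
X_hat = Z @ W_dec
print("reconstruction MSE:", np.mean((X - X_hat) ** 2))
```

The point of the separation is that the decoder can be swapped or retrained for reconstruction quality without touching the representation the generative model operates on.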
@sedielem
Sander Dieleman
2 months
New survey on diffusion language models: https://t.co/SHicf69gxV (via @NicolasPerezNi1). Covers pre/post-training, inference and multimodality, with very nice illustrations. I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023🥲
9
74
426
@PaulVicol
Paul Vicol
1 month
🔥Veo 3 has emergent zero-shot learning and reasoning capabilities! This multitalented model can do a huge range of interesting tasks. It understands physical properties, can manipulate objects, and can even reason. Check out more examples in this thread!
@tkipf
Thomas Kipf
1 month
Veo is a more general reasoner than you might think. Check out this super cool paper on "Video models are zero-shot learners and reasoners" from my colleagues at @GoogleDeepMind.
4
23
166
@nainar92
Naina Raisinghani
1 month
5 billion nano 🍌 = 5 regular sized 🍌! Also TIL: A group of bananas is called a hand.
@joshwoodward
Josh Woodward
1 month
🍌 @GeminiApp just passed 5 billion images in less than a month. What a ride, still going! Latest trend: retro selfies of you holding a baby version of you. Can't make this stuff up!
0
3
10
@sedielem
Sander Dieleman
2 months
The effective context length of Transformers with local (sliding window) attention layers is usually much shorter than the theoretical maximum. This blog post explains why. Back in 2017 the visualisations in https://t.co/JPLa3pyaON really changed my perspective on this for CNNs!
Link card: arxiv.org · We study characteristics of receptive fields of units in deep convolutional networks. The receptive field size is a crucial issue in many visual tasks, as the output must respond to large enough...
@Guangxuan_Xiao
Guangxuan Xiao
2 months
It's a common belief that L SWA layers (size W) yield an L×W receptive field. My post shows why the effective range is limited to O(W), regardless of depth. The reasons are information dilution and the exponential barrier from residual connections:
2
34
225
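To see why depth doesn't buy the theoretical L×W reach, one can idealise each sliding-window-attention layer with a residual connection as a half-identity, half-window-average mixing step, then track how far a single position's influence travels through the stack. A rough simulation under that assumption (a caricature, not the blog post's analysis):

```python
import numpy as np

L, W, N = 12, 8, 512          # layers, window size, sequence length

def swa_layer(p):
    """One idealised sliding-window-attention layer with a residual
    connection: output = 0.5 * identity + 0.5 * uniform mix over the window."""
    mixed = np.convolve(p, np.ones(W) / W, mode="same")
    return 0.5 * p + 0.5 * mixed

# Propagate a single position's influence through the stack.
p = np.zeros(N)
p[N // 2] = 1.0
for _ in range(L):
    p = swa_layer(p)

# The theoretical reach grows linearly with depth, but the mass arriving
# far from the source decays roughly exponentially with distance: the
# residual "identity path" dominates, so the effective range stays O(W).
for d in (0, W, 2 * W, 4 * W):
    print(f"influence at distance {d}: {p[N // 2 + d]:.2e}")
```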
@sedielem
Sander Dieleman
2 months
Really great deep dive on sources of nondeterminism in LLM inference. Before reading, I also believed atomicAdd was to blame for all of it, but it seems like that's mostly a red herring nowadays!
@thinkymachines
Thinking Machines
2 months
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to
0
1
35
@MLinPL
ML in PL
2 months
We’re thrilled to welcome Sander Dieleman, Research Scientist at Google DeepMind, to ML in PL Conference 2025! Sander Dieleman is a Research Scientist at Google DeepMind in London, UK, where he has worked on the development of AlphaGo, WaveNet, Imagen 4, Veo 3, and more. He
1
4
32
@nikoskolot
Nikos Kolotouros
2 months
Our team at GDM is hiring! Consider applying if you’re excited to work on state-of-the-art media generation! https://t.co/zzPJfae49Q
job-boards.greenhouse.io
2
11
138
@FrancoisRozet
François Rozet
2 months
Does a smaller latent space lead to worse generation in latent diffusion models? Not necessarily! We show that LDMs are extremely robust to a wide range of compression rates (10-1000x) in the context of physics emulation. We got lost in latent space. Join us 👇
14
88
461
@joshwoodward
Josh Woodward
2 months
Our TPUs right now
69
67
1K
@avdnoord
Aäron van den Oord
2 months
You know the team has built a good model when ... people go to a website that randomly serves it some of the time 5 million times
@arena
lmarena.ai
2 months
🚨🍌Breaking News: Gemini-2.5-Flash-Image-Preview (“nano-banana”) by @GoogleDeepMind now ranks #1 in Image Edit Arena. In just two weeks: 🟡“nano-banana” has driven over 5 million community votes in the Arena 🟡Record-breaking 2.5M+ votes cast for this model alone 🟡It has
9
14
204
@sedielem
Sander Dieleman
2 months
🤏🍌
@GoogleDeepMind
Google DeepMind
2 months
Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model. 🤯 From photorealistic masterpieces to mind-bending fantasy worlds, you can now natively produce, edit and refine visuals with new levels of reasoning,
2
2
57
@demishassabis
Demis Hassabis
2 months
strange object spotted under the microscope over the weekend in the lab...
358
242
4K