Sander Dieleman
@sedielem
Followers: 64K
Following: 12K
Media: 106
Statuses: 2K
Research Scientist at Google DeepMind (WaveNet, Imagen, Veo). I tweet about deep learning (research + software), music, generative models (personal account).
London, England
Joined December 2014
New blog post: let's talk about latents! https://t.co/Ddh7tXH642
sander.ai
Latent representations for generative models.
31
201
1K
The rehabilitation of continuous diffusion for discrete data continues! Check out CANDI by @PatrickPyn35903, @thjashin and @ruqi_zhang. Their insightful analysis explains why continuous methods have fallen behind, and why self-conditioning is so important. https://t.co/Bqn8Zd7hRz
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
1
11
60
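For readers unfamiliar with self-conditioning, here is a minimal sketch of the general technique as it appears in the continuous-diffusion literature: half the time, the denoiser first makes a prediction of the clean data and then receives that (detached) estimate back as an extra input. The toy denoiser, shapes, and corruption process below are placeholders, not the CANDI implementation.

```python
# Minimal sketch of self-conditioning in a continuous diffusion training step.
# Hypothetical shapes, corruption process and denoiser; not the CANDI implementation.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Input: noisy sample, previous x0 estimate (self-conditioning), and noise level.
        self.net = nn.Sequential(nn.Linear(2 * dim + 1, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x_t, x0_prev, t):
        return self.net(torch.cat([x_t, x0_prev, t], dim=-1))

def training_step(model, x0, opt):
    b, d = x0.shape
    t = torch.rand(b, 1)                      # random noise level in [0, 1]
    noise = torch.randn_like(x0)
    x_t = (1 - t) * x0 + t * noise            # simple linear interpolation corruption

    # Self-conditioning: half the time, first predict x0 with a zero placeholder,
    # then feed that (gradient-free) estimate back in as an extra input.
    x0_est = torch.zeros_like(x0)
    if torch.rand(()) < 0.5:
        with torch.no_grad():
            x0_est = model(x_t, torch.zeros_like(x0), t)

    pred = model(x_t, x0_est, t)
    loss = ((pred - x0) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

model = Denoiser(dim=16)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = training_step(model, torch.randn(8, 16), opt)
```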
Some prefer math and rigour; personally, I like intuitive explanations. This monograph has plenty of both! I love how much time is spent linking different perspectives (variational, score-based, flow-based) together. Chapter 6 in particular is really great. Amazing effort! 👏
Tired of going back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core
4
20
276
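As a pointer to the kind of bridge between perspectives being praised here: one standard identity linking the score-based (SDE) and flow-based views is the probability flow ODE from the score-SDE literature (Song et al., 2021). It is reproduced below from that literature as a reference point, not quoted from the monograph.

```latex
% Forward corruption process (score-based SDE view):
%   dx = f(x, t)\,dt + g(t)\,dw
% Its probability flow ODE, a deterministic flow with the same marginals p_t(x):
\frac{dx}{dt} = f(x, t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x)
% Replacing the unknown score \nabla_x \log p_t(x) with a learned approximation
% s_\theta(x, t) turns sampling into solving an ODE, i.e. a flow-based sampler.
```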
Veo is getting a major upgrade. 🚀 We’re rolling out Veo 3.1, our updated video generation model, alongside improved creative controls for filmmakers, storytellers, and developers - many of them with audio. 🧵
123
427
2K
We asked the same question: how can we combine the strengths of continuous and discrete approaches? Similar to CDCD, in our work, Purrception, we extend Variational FM to model VQ latents through continuous-discrete transport for image generation :D 👉 https://t.co/KIog9mLNWb
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
1
12
70
In my blog post on latents for generative modelling, I pointed out that representation learning and reconstruction are two separate tasks (§6.3), which autoencoders try to solve simultaneously. Separating them makes sense. It opens up a lot of possibilities, as this work shows!
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
9
23
350
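A minimal sketch of the separation being discussed: representation learning handled by a frozen, pretrained encoder, and reconstruction handled by a decoder trained on its own. The modules and shapes below are placeholders to illustrate the split, not the actual RAE recipe.

```python
# Separating representation learning from reconstruction: a frozen, pretrained
# encoder provides features; only the decoder is trained to map them back to pixels.
# Placeholder modules and shapes, not the RAE paper's setup.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256))  # stand-in for a pretrained model
encoder.requires_grad_(False)                                        # representation learning is already done
encoder.eval()

decoder = nn.Sequential(nn.Linear(256, 3 * 32 * 32), nn.Unflatten(1, (3, 32, 32)))
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

x = torch.rand(8, 3, 32, 32)           # dummy image batch
with torch.no_grad():
    z = encoder(x)                     # fixed latents: no gradient flows into the encoder
x_hat = decoder(z)
loss = ((x_hat - x) ** 2).mean()       # reconstruction is the decoder's job alone
opt.zero_grad(); loss.backward(); opt.step()
```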
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
arxiv.org
Diffusion language models, especially masked discrete diffusion models, have achieved great success recently. While there are some theoretical and primary empirical results showing the advantages...
New survey on diffusion language models: https://t.co/SHicf69gxV (via @NicolasPerezNi1). Covers pre/post-training, inference and multimodality, with very nice illustrations. I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023🥲
9
74
426
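For context on what "masked discrete diffusion" refers to: a toy sketch of the absorbing-state corruption and masked-token prediction loss these models typically use. The tiny model, sizes, and loss weighting below are simplifications for illustration, not any specific paper's setup.

```python
# Toy sketch of masked (absorbing-state) discrete diffusion training: replace tokens
# with [MASK] at a random rate t, then predict the originals at masked positions.
# Placeholder model and sizes; real models use transformers and reweight the loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, mask_id, seq_len = 100, 100, 16     # token ids 0..99, id 100 reserved for [MASK]
model = nn.Sequential(nn.Embedding(vocab + 1, 64), nn.Flatten(),
                      nn.Linear(64 * seq_len, vocab * seq_len))

tokens = torch.randint(0, vocab, (8, seq_len))
t = torch.rand(8, 1)                                  # per-example masking rate
masked = torch.rand(8, seq_len) < t                   # which positions get absorbed
x_t = torch.where(masked, torch.full_like(tokens, mask_id), tokens)

logits = model(x_t).view(8, seq_len, vocab)
loss_per_tok = F.cross_entropy(logits.transpose(1, 2), tokens, reduction="none")
loss = (loss_per_tok * masked).sum() / masked.sum().clamp(min=1)   # only masked positions count
loss.backward()
```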
🔥Veo 3 has emergent zero-shot learning and reasoning capabilities! This multitalented model can do a huge range of interesting tasks. It understands physical properties, can manipulate objects, and can even reason. Check out more examples in this thread!
Veo is a more general reasoner than you might think. Check out this super cool paper on "Video models are zero-shot learners and reasoners" from my colleagues at @GoogleDeepMind.
4
23
166
5 billion nano 🍌 = 5 regular sized 🍌! Also TIL: A group of bananas is called a hand.
🍌 @GeminiApp just passed 5 billion images in less than a month. What a ride, still going! Latest trend: retro selfies of you holding a baby version of you. Can't make this stuff up!
0
3
10
The effective context length of Transformers with local (sliding window) attention layers is usually much shorter than the theoretical maximum. This blog post explains why. Back in 2017 the visualisations in https://t.co/JPLa3pyaON really changed my perspective on this for CNNs!
arxiv.org
We study characteristics of receptive fields of units in deep convolutional networks. The receptive field size is a crucial issue in many visual tasks, as the output must respond to large enough...
It's a common belief that L SWA layers (size W) yield an L×W receptive field. My post shows why the effective range is limited to O(W), regardless of depth. The reasons are information dilution and the exponential barrier from residual connections:
2
34
225
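A back-of-the-envelope illustration of the dilution argument, under the simplifying assumption that each attention hop spreads a distant token's contribution over roughly W positions while nearby tokens also arrive undiluted via the residual path. The numbers are hypothetical and this is not the blog post's actual analysis.

```python
# Rough sketch (hypothetical numbers): a token d positions away needs at least
# ceil(d / W) attention hops to reach the current position, and if each hop spreads
# weight over ~W tokens, its direct contribution shrinks roughly like (1/W)**hops.
import math

W = 1024                       # sliding-window size
for d in [512, 1024, 4096, 16384, 65536]:
    hops = math.ceil(d / W)    # minimum number of attention layers the signal must traverse
    print(f"distance {d:>6}: >= {hops:>2} hops, dilution ~ {W ** -hops:.1e}")
```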
Really great deep dive on sources of nondeterminism in LLM inference. Before reading, I also believed atomicAdd was to blame for all of it, but it seems like that's mostly a red herring nowadays!
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to
0
1
35
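To make the suspicion concrete: floating-point addition is not associative, so changing reduction order (which atomic adds can do) changes the result in the last bits. A minimal, self-contained demonstration of that mechanism, which per the post is mostly a red herring for LLM inference nondeterminism:

```python
# Summing the same values in a different order usually gives a (slightly) different
# float result. This is the mechanism behind the atomicAdd suspicion.
import random

vals = [random.uniform(-1, 1) * 10 ** random.randint(-8, 8) for _ in range(100_000)]

s_forward = sum(vals)
s_shuffled = sum(random.sample(vals, len(vals)))   # same numbers, different order

print(s_forward, s_shuffled, s_forward == s_shuffled)   # usually not exactly equal
```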
We’re thrilled to welcome Sander Dieleman, Research Scientist at Google DeepMind, to ML in PL Conference 2025! Sander Dieleman is a Research Scientist at Google DeepMind in London, UK, where he has worked on the development of AlphaGo, WaveNet, Imagen 4, Veo 3, and more. He
1
4
32
Our team at GDM is hiring! Consider applying if you’re excited to work on state-of-the-art media generation! https://t.co/zzPJfae49Q
job-boards.greenhouse.io
2
11
138
Does a smaller latent space lead to worse generation in latent diffusion models? Not necessarily! We show that LDMs are extremely robust to a wide range of compression rates (10-1000x) in the context of physics emulation. We got lost in latent space. Join us 👇
14
88
461
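For a sense of scale, a rough sketch of what a 10-1000x compression rate means, assuming it is measured as input elements divided by latent elements; the shapes below are hypothetical examples, not the paper's.

```python
# Compression rate as the ratio of input elements to latent elements (hypothetical shapes).
from math import prod

def compression_rate(input_shape, latent_shape):
    return prod(input_shape) / prod(latent_shape)

x_shape = (3, 256, 256)                          # e.g. a 3-channel 256x256 field
for z_shape in [(4, 64, 64), (8, 16, 16), (4, 8, 8)]:
    print(z_shape, f"{compression_rate(x_shape, z_shape):.0f}x")
# -> 12x, 96x and 768x, i.e. within the 10-1000x range mentioned above
```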
You know the team has built a good model when ... people go to a website that randomly serves it some of the time 5 million times
🚨🍌Breaking News: Gemini-2.5-Flash-Image-Preview (“nano-banana”) by @GoogleDeepMind now ranks #1 in Image Edit Arena. In just two weeks: 🟡“nano-banana” has driven over 5 million community votes in the Arena 🟡Record-breaking 2.5M+ votes cast for this model alone 🟡It has
9
14
204
🚨🍌Breaking News: Gemini-2.5-Flash-Image-Preview (“nano-banana”) by @GoogleDeepMind now ranks #1 in Image Edit Arena. In just two weeks: 🟡“nano-banana” has driven over 5 million community votes in the Arena 🟡Record-breaking 2.5M+ votes cast for this model alone 🟡It has
Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model. 🤯 From photorealistic masterpieces to mind-bending fantasy worlds, you can now natively produce, edit and refine visuals with new levels of reasoning,
36
158
1K
strange object spotted under the microscope over the weekend in the lab...
358
242
4K