Subham Sahoo
@ssahoo_
Followers
2K
Following
883
Media
30
Statuses
432
Pioneering Diffusion LLMs. @cornell PhD. Previously: @GoogleAI; @IITKgp.
New York, USA
Joined June 2010
🚨 “The Diffusion Duality” is out! @ICML2025 ⚡️ Few-step generation in discrete diffusion language models by exploiting the underlying Gaussian diffusion. 🦾Beats AR on 3/7 zero-shot likelihood benchmarks. 📄 Paper: https://t.co/0RKsd8NJfB 💻 Code: https://t.co/oYE9hDYrGI 🧠
16
102
542
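A minimal sketch of the duality the paper exploits, as I read the abstract: Gaussian-diffusing a one-hot token vector and taking an argmax induces a uniform-state discrete diffusion over the vocabulary. Everything below (`vocab_size`, `noise_std`, the simulation itself) is illustrative, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_to_discrete(token_id, vocab_size, noise_std, n_samples=10_000):
    """Diffuse a one-hot token with Gaussian noise, then argmax back to a token.

    As noise_std grows, the argmax token drifts from the original token toward
    a uniform distribution over the vocabulary -- the discrete (uniform-state)
    diffusion hiding inside the Gaussian one.
    """
    x0 = np.zeros(vocab_size)
    x0[token_id] = 1.0
    # Gaussian forward process: x_t = x_0 + sigma * eps
    eps = rng.normal(size=(n_samples, vocab_size))
    xt = x0 + noise_std * eps
    return np.argmax(xt, axis=-1)

for sigma in [0.1, 0.5, 2.0]:
    samples = gaussian_to_discrete(token_id=3, vocab_size=8, noise_std=sigma)
    p_keep = (samples == 3).mean()
    print(f"sigma={sigma:4.1f}  P(token unchanged) ~= {p_keep:.2f}")
```

At small sigma the argmax almost always recovers the original token; at large sigma it approaches uniform over the 8-token vocabulary, which is exactly the corruption process a uniform-state discrete diffusion defines directly.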
Please fill out your availability for the reading group
As we get started with our discrete diffusion reading group, we’d like to schedule a recurring one-hour meeting time that works for everyone. Form: https://t.co/B4PiXvKbkj > Please fill out your availability in the Google form, and be sure to select your local timezone when
1
0
13
The term AGI gives me the same ick that “AI” did back in 2015. If it takes hundreds of billions of tokens just to get a respectable score on grade school math (GSM8K), that says everything about where we actually are.
0
1
15
We’re building a space that connects researchers, students, and practitioners working on discrete diffusion. Join the Discord — collaborate, learn, and share! Whether you’re 💼hiring or showcasing your work, this is the place 👇 Discord:
discord.com
The Discrete Diffusion Reading Group is growing — 400+ members strong! We’ve launched a Discord for discussions, research ideas, help, and job opportunities. Join the conversation 👇 💬 https://t.co/qw6h26OGU5 📧 https://t.co/kV9efqB43W
0
8
107
Overwhelmed by the number of Diffusion LLM papers? 🌊 Same here 😭 So I’m starting a Discrete Diffusion Reading Group (@diffusion_llms) with my favorite disciples @jdeschena and @zhihanyang_ ✨ We’ll cover everything—from theory to empirics, from language to molecules. Join
20
40
316
Drowning in the sea of Discrete Diffusion papers? 🌊 We got you. Join our Reading Group! From theory → empirics, and language → molecules — we’ll decode the chaos together 💫 Join the cult—uh, I mean community 😇 👉 Google Group: https://t.co/kV9efqBBTu (1 / 2)
1
7
23
🔥 Rethinking Reasoning (with Diffusion LLMs) This work changes how you think about reasoning in LLMs. 🤯 Turns out: you don’t need the full chain-of-thought — only a small subset of CoT tokens actually matter for the final answer. ❌ Autoregressive LLMs can’t exploit this
10
36
230
✨Masked Diffusion Language Models✨ are great for reasoning, but not just for the reasons you think! Fast parallel decoding? 🤔 Any-order decoding? 🤨 Plot twist: MDLMs offer A LOT MORE for inference and post-training! 🎢🧵
4
35
162
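To make “fast parallel decoding / any-order decoding” concrete, here is a hedged sketch of a generic confidence-based parallel sampler for a masked diffusion LM (in the MaskGIT style), not MDLM’s exact sampler; `model` and `MASK_ID` are hypothetical stand-ins.

```python
import torch

MASK_ID = 0          # hypothetical mask-token id
SEQ_LEN = 16
NUM_STEPS = 4        # decode 16 tokens in 4 parallel rounds instead of 16 AR steps

@torch.no_grad()
def parallel_masked_decode(model, vocab_size):
    """Generic confidence-based parallel decoder for a masked diffusion LM.

    Each round, the model scores every masked position at once; we commit the
    most confident predictions and keep the rest masked -- any order, several
    tokens per forward pass.
    """
    x = torch.full((1, SEQ_LEN), MASK_ID)
    per_step = SEQ_LEN // NUM_STEPS
    for _ in range(NUM_STEPS):
        logits = model(x)                          # (1, SEQ_LEN, vocab_size)
        probs = logits.softmax(-1)
        conf, pred = probs.max(-1)                 # best token + its confidence
        conf = conf.masked_fill(x != MASK_ID, -1)  # never revisit committed tokens
        commit = conf.topk(per_step, dim=-1).indices
        x.scatter_(1, commit, pred.gather(1, commit))
    return x
```

The point of the sketch: the decode order is chosen by the model’s confidence at run time rather than fixed left-to-right, which is the property the post-training and inference tricks in the thread build on.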
Impressive work by @jdeschena! They propose replacing the encoder-only denoising transformer with an encoder-decoder architecture, which leads to faster training and inference for MDLM.
📢 « Partition Generative Modeling (PGM): Masked Modeling without Masks » is out! 🚯 Masked diffusion models waste FLOPs processing countless mask tokens that carry no real information. ⚡We show how partitioning can replace masking, boosting throughput by >5.3x on text and up
1
4
52
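A back-of-the-envelope view of where PGM’s savings could come from, with made-up numbers: attention cost scales roughly with the square of the positions processed, and a masked model pays for every mask token even though masks carry no information, while a partitioned encoder-decoder only pays for tokens that exist. This counts position pairs only and is not PGM’s actual architecture.

```python
# Back-of-the-envelope position count (attention cost ~ quadratic in positions).
L = 1024            # sequence length
observed = 256      # tokens actually carrying information this step

masked_cost = L * L                              # encoder sees all L, masks included
# Hypothetical partition split: encoder reads the observed tokens,
# decoder cross-attends to them while generating the rest.
partition_cost = observed**2 + (L - observed) * observed

print(f"masked   : {masked_cost:>9,} position pairs")
print(f"partition: {partition_cost:>9,} position pairs")
print(f"speedup  : {masked_cost / partition_cost:.1f}x")
```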
Funnily enough, after we released MDLM last year, @srush_nlp came up with the exact same idea!
(1/5) Beyond Next-Token Prediction, introducing Next Semantic Scale Prediction! Our @NeurIPSConf 2025 paper HDLM is out! Check out the new language modeling paradigm: Next Semantic Scale Prediction via Hierarchical Diffusion Language Models. It largely generalizes
1
0
18
✨ Masked Generative Models (MGMs) are powerful and can generate tokens in parallel. They’ve driven impressive results across text and images and are increasingly competitive with autoregressive (AR) models. Thrilled to share our latest work to accelerate MGMs (1/12) 🧵
2
12
34
We’re dropping “The Diffusion Duality, Chapter 2” soon! So, stay tuned 🤗
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
0
7
81
🎓 Officially a doctor now 😊!!! As a first-gen college kid, this moment means the world to me. Grateful beyond words to all my mentors who’ve guided me along the way — from @GMartius who first introduced me to research back in 2017, to @volokuleshov who sparked my love for
83
57
2K
Happening tomorrow at 2:30 pm ET / 11:30 am PT
📢 Excited to defend my PhD thesis: "Foundations of Diffusion Language Models" 🎓✨ 📅 October 3 | 11:30 am PT / 2:30 pm ET 🔗Zoom: https://t.co/PgHvs4s5UT Topics covered: 1⃣ MDLM 2⃣The Diffusion Duality 3⃣Esoteric Language Models
2
0
21
🍷Imagine you are the boss of Google DeepMind. To train the best diffusion language model in the world within 1 year, using 800 TPU pods, which model size would you go for? 🐿️ We built Quokka to help you decide: the first-ever large-scale scaling law for DLMs. Interesting facts: 1.
6
58
287
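For anyone new to scaling laws: fitting one amounts to regressing loss against model size on a power law and extrapolating. A generic sketch on synthetic data, not Quokka’s methodology or numbers.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(N, a, b, c):
    """Classic scaling-law form: loss = a * N^(-b) + irreducible term c."""
    return a * N**(-b) + c

# Synthetic (model_size, loss) points standing in for real training runs.
N = np.array([1e7, 1e8, 1e9, 1e10])
loss = power_law(N, a=8.0, b=0.08, c=1.7) + np.random.default_rng(0).normal(0, 0.01, 4)

(a, b, c), _ = curve_fit(power_law, N, loss, p0=[5.0, 0.1, 1.0])
print(f"fit: loss ~= {a:.2f} * N^-{b:.3f} + {c:.2f}")
print(f"predicted loss at N=1e11: {power_law(1e11, a, b, c):.3f}")
```

Once fitted, the curve answers exactly the question the tweet poses: given a fixed compute budget, which model size lands lowest on the predicted-loss frontier.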
Eternally grateful to my committee members: @jwthickstun (Chair), @Jimantha, and Bart Selman
0
0
1
Esoteric Language Models: the first paper to propose KV caching for masked diffusion models without compromising parallel generation https://t.co/UKh6GsazIb
🚨 [New paper alert] Esoteric Language Models (Eso-LMs) First Diffusion LM to support KV caching w/o compromising parallel generation. 🔥 Sets new SOTA on the sampling speed–quality Pareto frontier 🔥 🚀 65× faster than MDLM ⚡ 4× faster than Block Diffusion 📜 Paper:
1
0
0
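Why KV caching is awkward for masked diffusion, and the rough shape of a fix as I understand the abstract: in a vanilla masked diffusion LM every position can change at every step, so cached keys and values go stale; but if already-decoded tokens are frozen and never attend to still-masked positions, their keys and values can be computed once and reused. A toy illustration with hypothetical names (`attn_layer.kv`, `attn_layer.q`), not Eso-LMs’ actual attention scheme.

```python
import torch

def cached_step(kv_cache, clean_tokens, masked_positions, attn_layer):
    """One denoising step with a KV cache over already-decoded tokens.

    clean_tokens never change again, so their keys/values are computed once
    and appended to kv_cache; only the masked positions pay fresh compute.
    (Toy sketch -- attn_layer and the ordering constraint are hypothetical.)
    """
    if clean_tokens.numel() > 0:
        k_new, v_new = attn_layer.kv(clean_tokens)   # compute once per token
        kv_cache.append((k_new, v_new))
    q = attn_layer.q(masked_positions)               # queries only for masks
    k = torch.cat([k for k, _ in kv_cache], dim=1)
    v = torch.cat([v for _, v in kv_cache], dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5, -1)
    return attn @ v                                  # denoised mask representations
```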