Subham Sahoo

@ssahoo_

Followers
2K
Following
883
Media
30
Statuses
432

Pioneering Diffusion LLMs. @cornell PhD. Previously: @GoogleAI; @IITKgp.

New York, USA
Joined June 2010
@ssahoo_
Subham Sahoo
5 months
🚨 “The Diffusion Duality” is out! @ICML2025 ⚡️ Few-step generation in discrete diffusion language models by exploiting the underlying Gaussian diffusion. 🦾Beats AR on 3/7 zero-shot likelihood benchmarks. 📄 Paper: https://t.co/0RKsd8NJfB 💻 Code: https://t.co/oYE9hDYrGI 🧠
16
102
542
@ssahoo_
Subham Sahoo
15 hours
Please fill out your availability for the reading group
@diffusion_llms
Discrete Diffusion Reading Group
15 hours
As we get started with our discrete diffusion reading group, we’d like to schedule a recurring one-hour meeting time that works for everyone. Form: https://t.co/B4PiXvKbkj > Please fill out your availability in the Google form, and be sure to select your local timezone when
1
0
13
@ssahoo_
Subham Sahoo
3 days
The term AGI gives me the same ick that “AI” did back in 2015. If it takes hundreds of billions of tokens just to get a respectable score on grade school math (GSM8K), that says everything about where we actually are.
0
1
15
@ssahoo_
Subham Sahoo
4 days
We’re building a space that connects researchers, students, and practitioners working on discrete diffusion. Join the Discord — collaborate, learn, and share! Whether you’re 💼hiring or showcasing your work, this is the place 👇 Discord:
discord.com
@diffusion_llms
Discrete Diffusion Reading Group
4 days
The Discrete Diffusion Reading Group is growing — 400+ members strong! We’ve launched a Discord for discussions, research ideas, help, and job opportunities. Join the conversation 👇 💬 https://t.co/qw6h26OGU5 📧 https://t.co/kV9efqB43W
0
8
107
@ssahoo_
Subham Sahoo
7 days
Overwhelmed by the number of Diffusion LLM papers? 🌊 Same here 😭 So I’m starting a Discrete Diffusion Reading Group (@diffusion_llms) with my favorite disciples @jdeschena and @zhihanyang_ ✨ We’ll cover everything—from theory to empirics, from language to molecules. Join
20
40
316
@diffusion_llms
Discrete Diffusion Reading Group
7 days
Drowning in the sea of Discrete Diffusion papers? 🌊 We got you. Join our Reading Group! From theory → empirics, and language → molecules — we’ll decode the chaos together 💫 Join the cult—uh, I mean community 😇 👉 Google Group:  https://t.co/kV9efqBBTu (1 / 2)
1
7
23
@ssahoo_
Subham Sahoo
10 days
🔥 Rethinking Reasoning (with Diffusion LLMs) This work changes how you think about reasoning in LLMs. 🤯 Turns out: you don’t need the full chain-of-thought — only a small subset of CoT tokens actually matter for the final answer. ❌ Autoregressive LLMs can’t exploit this
10
36
230
@zachary_horvitz
Zachary Horvitz
10 days
✨Masked Diffusion Language Models✨ are great for reasoning, but not just for the reasons you think! Fast parallel decoding? 🤔 Any-order decoding? 🤨 Plot twist: MDLMs offer A LOT MORE for inference and post-training! 🎢🧵
4
35
162
@ssahoo_
Subham Sahoo
17 days
Happy Diwali — from mine to yours ✨
0
0
11
@ssahoo_
Subham Sahoo
20 days
How do you even compute such probabilities?
@elonmusk
Elon Musk
20 days
My estimate of the probability of Grok 5 achieving AGI is now at 10% and rising
1
0
12
@ssahoo_
Subham Sahoo
21 days
Impressive work by @jdeschena ! They propose replacing the encoder-only denoising transformer with an encoder-decoder architecture, which leads to faster training and inference in MDLM.
@jdeschena
Justin Deschenaux
21 days
📢 « Partition Generative Modeling (PGM): Masked Modeling without Masks » is out! 🚯 Masked diffusion models waste FLOPs processing countless mask tokens that carry no real information. ⚡We show how partitioning can replace masking, boosting throughput by >5.3x on text and up
1
4
52
@ssahoo_
Subham Sahoo
24 days
Funny enough, after we released MDLM last year, @srush_nlp came up with the exact same idea!
@zhuci19
Cai Zhou @EMNLP2025
24 days
(1/5) Beyond Next-Token Prediction, introducing Next Semantic Scale Prediction! Our @NeurIPSConf NeurIPS 2025 paper HDLM is out! Check out the new language modeling paradigm: Next Semantic Scale Prediction via Hierarchical Diffusion Language Models. It largely generalizes
1
0
18
@jdeschena
Justin Deschenaux
24 days
✨ Masked Generative Models (MGMs) are powerful and can generate tokens in parallel. They’ve driven impressive results across text and images and are increasingly competitive with autoregressive (AR) models. Thrilled to share our latest work to accelerate MGMs (1/12) 🧵
2
12
34
@ssahoo_
Subham Sahoo
27 days
We’re dropping “The Diffusion Duality, Chapter 2” soon! So, stay tuned 🤗
@sedielem
Sander Dieleman
27 days
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: https://t.co/KPO56vDygp CADD: https://t.co/CNOIWcUIMo CCDD:
0
7
81
@ssahoo_
Subham Sahoo
1 month
🎓 Officially a doctor now 😊!!! As a first-gen college kid, this moment means the world to me. Grateful beyond words to all my mentors who’ve guided me along the way — from @GMartius who first introduced me to research back in 2017, to @volokuleshov who sparked my love for
83
57
2K
@ssahoo_
Subham Sahoo
1 month
Happening tomorrow at 2:30pm ET / 11:30 am PT
@ssahoo_
Subham Sahoo
1 month
📢 Excited to defend my PhD thesis: "Foundations of Diffusion Language Models" 🎓✨ 📅 October 3 | 11:30 am PT / 2:30 pm ET 🔗Zoom: https://t.co/PgHvs4s5UT Topics covered: 1⃣ MDLM 2⃣ The Diffusion Duality 3⃣ Esoteric Language Models
2
0
21
@NiJinjie
Jinjie Ni
1 month
🍷Imagine you are the boss of Google DeepMind. To train the best diffusion language model in the world within 1 year, using 800 TPU pods, which model size would you go for? 🐿️ We built Quokka to help you decide: the first-ever large-scale scaling law for DLMs. Interesting facts: 1.
6
58
287
@ssahoo_
Subham Sahoo
1 month
Eternally grateful to my committee members: @jwthickstun (Chair), @Jimantha, Bart Selman
0
0
1
@ssahoo_
Subham Sahoo
1 month
Esoteric Language Models: First paper to propose KV-caching for masked diffusion models without compromising parallel generation https://t.co/UKh6GsazIb
@ssahoo_
Subham Sahoo
5 months
🚨 [New paper alert] Esoteric Language Models (Eso-LMs) First Diffusion LM to support KV caching w/o compromising parallel generation. 🔥 Sets new SOTA on the sampling speed–quality Pareto frontier 🔥 🚀 65× faster than MDLM ⚡ 4× faster than Block Diffusion 📜 Paper:
1
0
0