
Justin Deschenaux
@jdeschena
Followers: 422 · Following: 4K · Media: 35 · Statuses: 367
PhD student @EPFL advised by @caglarml. Working on diffusion language models ⚡️
Switzerland · Joined May 2013
🌟 Excited to share our latest work on making diffusion language models (DLMs) faster than autoregressive (AR) models! ⚡ It’s been great to work on this with @caglarml 😎 Lately, DLMs are gaining traction as a promising alternative to autoregressive sequence modeling 👀 1/14 🧵
RT @iScienceLuvr: Inverse Scaling in Test-Time Compute. "We identify five distinct failure modes when models reason for longer: 1) Claude m….
RT @XiuyingWei966: If you’re interested in long-context efficiency, don’t miss our recent paper RAT—a joint effort with @anunay_yadav, Razv….
RT @caglarml: Many people still talk about coming up with alternatives to self-attention, but acknowledging the strengths of both self-atte….
RT @jdeschena: 🔥 NEW PAPER: "The Diffusion Duality". Uniform-state diffusion models for text generation emerge from an underlying continuo….
RT @OHilliges: Sadly, I am no longer a professor at ETH (@eth_en) due to very severe #longCovid and #MECFS.
ethrat.ch
At its meeting on 9/10 July 2025, the ETH Board took note of the resignation of Vanessa Wood as Vice President of ETH Zurich. She will leave the Executive Board at the end of December 2025 and...
RT @ssahoo_: Attending ICML ✈️Tues-Fri to present "The Diffusion Duality".🗓️Wed, July 16 @ 4:30pm.📍East Exhibition Hall A-B (E-3003). DM if….
RT @SkanderMoalla: 🚀 Big time! We can finally do LLM RL fine-tuning with rewards and leverage offline/off-policy data!. ❌ You want rewards,….
RT @XiuyingWei966: Curious about making Transformers faster on long sequences without compromising accuracy? ⚡️🧠 Meet RAT—an intermediate d….
RT @johnowhitaker: I did another video, on the paper 'The Diffusion Duality', continuing the series of me trying to understand diffusion ap….
RT @sedielem: This work uncovers a profound connection between continuous and discrete (non-absorbing) diffusion models, allowing transfer….
RT @_akhaliq: The Diffusion Duality. unlock few-step generation in discrete diffusion language models via the underlying Gaussian diffusion….
RT @SkyLi0n: Check out our recent paper on the "duality" between discrete and Gaussian diffusion. We show how you can exploit that relation….
@DrYangSong 🔗 CODE & MODELS: 📜 Paper: 📘 Blog: 💻 Code: It was amazing to work on this with @ssahoo_, @SkyLi0n, @Guanghan__Wang, @justintchiu, @volokuleshov. Onto the next 🚀 9/9 🧵
github.com
[ICML 2025] The Diffusion Duality (s-sahoo/duo)
@DrYangSong 🎯 SUMMARY
• Duo bridges continuous and discrete diffusion.
• Enables faster training with curriculum learning.
• Enables faster sampling by adapting consistency models to discrete spaces.
8/9
@DrYangSong Using argmax at the final step cuts this to 8 steps, with only a slight drop in unigram entropy 🤯 7/9.
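To make the trick concrete, here is a minimal sampling-loop sketch: intermediate steps sample from the predicted categorical, and only the final step decodes greedily. The `denoiser(tokens, t)` callable returning per-position logits is an assumed interface for illustration, not the DUO codebase API.

```python
import torch

@torch.no_grad()
def few_step_sample(denoiser, tokens, timesteps):
    """tokens: (batch, seq_len) token ids; timesteps ordered high -> low noise."""
    for i, t in enumerate(timesteps):
        logits = denoiser(tokens, t)                  # (batch, seq_len, vocab); assumed interface
        if i == len(timesteps) - 1:
            tokens = logits.argmax(dim=-1)            # greedy final step, as in the tweet
        else:
            probs = logits.softmax(dim=-1)
            tokens = torch.distributions.Categorical(probs=probs).sample()
    return tokens
```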
@DrYangSong Discrete Consistency Distillation slashes sampling steps by orders of magnitude: after distillation, we can sample in just 16 steps and match the original generative perplexity—without lowering unigram entropy. 6/9
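The tweet states the result without the objective, so here is a hedged sketch of what one consistency-distillation update could look like under standard assumptions (a frozen EMA target network and KL matching between adjacent points of a teacher trajectory). The names `student`, `ema_student`, and the trajectory variables are hypothetical, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student, ema_student, tokens_t, t, tokens_s, s):
    # Consistency target: the frozen EMA student's prediction at the adjacent,
    # less-noisy point (tokens_s, s) of the same teacher trajectory.
    with torch.no_grad():
        target = ema_student(tokens_s, s).softmax(dim=-1)
    # Train the student at the noisier point (tokens_t, t) to agree with that
    # target, so at inference it can jump over many solver steps at once.
    log_probs = student(tokens_t, t).log_softmax(dim=-1)
    return F.kl_div(log_probs, target, reduction="batchmean")
```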
Importantly, the Diffusion Duality lets us adapt Consistency Models (@DrYangSong) to discrete diffusion! 🔄✨ Even though there is no PF-ODE for discrete spaces, we can use the Gaussian PF-ODE to generate distillation trajectories, then transfer them to the discrete domain! 5/9
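A minimal sketch of that transfer, under simplifying assumptions: integrate a VP-style Gaussian probability-flow ODE with Euler steps, then read off a discrete trajectory by taking an argmax over the vocabulary axis at every step. The `score_model` callable and constant `beta` schedule are illustrative placeholders.

```python
import torch

@torch.no_grad()
def transfer_trajectory(score_model, x, timesteps, beta=1.0):
    """x: (batch, seq_len, vocab) Gaussian state at timesteps[0] (high noise)."""
    discrete_states = [x.argmax(dim=-1)]
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        # VP-style probability-flow drift: dx/dt = -0.5 * beta * (x + score(x, t)).
        drift = -0.5 * beta * (x + score_model(x, t_cur))
        x = x + drift * (t_next - t_cur)              # Euler step of the Gaussian PF-ODE
        discrete_states.append(x.argmax(dim=-1))      # duality: Gaussian state -> tokens
    return discrete_states                            # token-space trajectory for distillation
```

The paper's exact mapping and solver will differ; the point is only that the trajectory is generated where an ODE exists (continuous space) and consumed where it does not (token space).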