Aaron Lou
@aaron_lou
Followers: 3K
Following: 3K
Media: 19
Statuses: 242
Leading Strategic Explorations @OpenAI, prev @Stanford | Inventor of modern diffusion LMs
Joined August 2013
Exciting updates in the last month: 1. I got married!! 💕👩❤️👨💍 2. I joined OpenAI (currently feeling the AGI) 3. Our discrete diffusion paper recently won best paper at #ICML2024. I'll be giving a presentation at 10:30am in Hall A1, with the poster session shortly afterward
40
8
943
I turned @karpathy's baby GPT into a character-level text diffusion model, using @aaron_lou et al.'s score entropy-based training objective.
17
54
978
Amazing work by Radical Numerics!
Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) with a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to
0
1
5
Our general-purpose reasoning models solved all 12 problems at the 2025 International Collegiate Programming Contest (ICPC) World Finals, the world’s top university programming competition. That score was enough for a 1st-place ranking among the human teams.
1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have
147
397
3K
Life update: I started Radical Numerics with Stefano Massaroli, Armin Thomas, Eric Nguyen, and a fantastic team of engineers and researchers. We are building the engine for recursive self‑improvement (RSI): AI that designs and refines AI, accelerating discovery across science and
8
23
234
✨ Excited to share a few life updates! 🎤 My TED Talk is now live! I shared the origin story of Evo, titled: "How AI could generate new life forms" TED talk: https://t.co/dh7iWcPaBu ✍️ I wrote a blog post about what it’s *really* like to deliver a TED talk blog:
ted.com
If DNA is just a string of letters, could AI learn to read it … or even write it? Bioengineering researcher Eric Nguyen reveals how AI has upended the rules of biology, potentially creating a future...
19
28
175
🚀Excited to announce Dream 7B (Diffusion reasoning model): the most powerful open diffusion large language model to date.
49
206
1K
"Large Language Diffusion Models" introduces LLaDA-8B, a large language diffusion model that was pretrained on 2.3 trillion tokens using 0.13 million H800 GPU hours, followed by SFT on 4.5 million pairs. LLaDA-8B surpasses Llama-2 7B on nearly all 15 standard zero/few-shot learning
37
281
2K
It's tough times for science. 🥺 But we have to keep innovating to fight another day, and today I'm so proud to share @pengzhangzhi1's new, groundbreaking sampling algorithm for generative language models, Path Planning (P2). 🌟 📜: https://t.co/fWjT9NkgX3 💻: In the appendix!!
3
32
187
1
1
38
Evo has been published in @Science! A true privilege to work with such an amazing team! So many exciting new experimental results in this emerging field of Generative Genomics, including AI generated and *validated* CRISPR-Cas systems and transposons
A new Science study presents “Evo”—a machine learning model capable of decoding and designing DNA, RNA, and protein sequences, from molecular to genome scale, with unparalleled accuracy. Evo’s ability to predict, generate, and engineer entire genomic sequences could change the
8
55
259
🧬Evo, the first foundation model trained at scale on DNA, is a Rosetta Stone for biology. DNA, RNA, and proteins are the fundamental molecules of life—and cracking the code of their complex language is an ongoing grand challenge. 🔬Today in @ScienceMagazine, the labs of Arc
8
110
445
📢Announcing EDLM, our brand-new Energy-based Language Model embedded within a Diffusion framework! Key results: 1. We (for the first time?) almost match AR perplexity. 2. Significantly improved generation quality. 3. Considerable sampling speedup without quality drop. 🧵1/n
6
58
294
Excited to share our paper “Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding”. My amazing coauthors: @zhaoyl18thu, @ChenyuW64562111, @gabo_scalia, @gokcen, @suragnair, @tbyanc, @ShuiwangJi, Aviv Regev, @svlevine, Masatoshi Uehara
3
11
55
Very exciting work from @jdeschena on fast discrete diffusion models! I'm especially excited since I had lost hope for such distillation techniques.
🌟 Excited to share our latest work on making diffusion language models (DLMs) faster than autoregressive (AR) models! ⚡ It’s been great to work on this with @caglarml 😎 Lately, DLMs are gaining traction as a promising alternative to autoregressive sequence modeling 👀 1/14 🧵
1
0
7
Glad to have worked with @jiaqihan99 on this project. Very promising results for temporal generative models!
📣Check out our #NeurIPS24 paper Geometric Trajectory Diffusion Models (GeoTDM), a new diffusion-based generative model that captures the temporal evolution of ubiquitous geometric systems!! Paper: https://t.co/wy8lQGxaNe Code: https://t.co/6FW1UiRCZV 🧵1/8
0
0
11
Discrete Diffusion Models (1.1B) 1. show competitive results on zero-shot benchmarks. 2. are 1.5x faster with comparable performance. 3. do not suffer from the _reversal curse_, thanks to bidirectional attention.
4
23
157