
Stefano Ermon
@StefanoErmon
Followers
17K
Following
936
Media
22
Statuses
442
Associate Professor of #computerscience @Stanford #AI #ML
Joined February 2013
Excited to share that I’ve been working on scaling up diffusion language models at Inception. A new generation of LLMs with unprecedented capabilities is coming!
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.
37
81
691
Super proud of my student Aditya who successfully defended his #PhD dissertation today! He has done awesome work on unsupervised learning with generative models. Congrats, Dr. @adityagrover_ 👏🎊🎉
36
23
589
If all training images for a GAN/VAE/PixelCNN have 2 objects, will they only generate images with 2 objects? If trained on (🔵,💙,🔴), will they also generate ❤️? Find out in @shengjia_zhao's blog post on generalization and bias for generative models. 👉
1
129
501
Thrilled to share that our paper "Comparing Distributions by Measuring Differences that Affect Decision Making" won the #ICLR2022 Outstanding Paper Award 🎉 Congratulations to my awesome students @shengjia_zhao @a7b2_3 @electronickale Aidan @baaadas 👏
13
41
435
Diffusion models are state-of-the-art for continuous data generation (images, videos, etc.). Can they also beat autoregressive models on text generation? Check out our ICML paper tomorrow to find out how. Congrats to my students @aaron_lou @chenlin_meng on the best paper award!
11
29
304
Very excited about this work: diffusion models finally bridging the gap with autoregressive models on language!
Announcing Score Entropy Discrete Diffusion (SEDD) w/ @chenlin_meng @StefanoErmon. SEDD challenges the autoregressive language paradigm, beating GPT-2 on perplexity and quality! Arxiv: Code: Blog: 🧵1/n
4
25
297
A paper blatantly plagiarized our CTM paper (see some of their verbatim copy&paste below). Feeling bad for my junior collaborators @gimdong58085414 and @JCJesseLai who worked so hard on this.
We sadly found out our CTM paper (ICLR24) was plagiarized by TCD! It's unbelievable 😢: they not only stole our idea of trajectory consistency but also committed "verbatim plagiarism," literally copying our proofs word for word! Please help me spread this.
8
24
201
Really excited about this work! We figured out how to apply DPO (Direct Preference Optimization) to diffusion models and got big improvements in image quality.
Excited to announce DPO has gone multi-modal! New paper out on RLHF for text-to-image diffusion models! We obtain large-scale state-of-the-art results with 70% win rates against Stable Diffusion XL on human evals! Deep dive below 🧵
2
21
190
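The two posts above build on DPO. As background, here is a minimal sketch of the standard DPO pairwise objective that Diffusion-DPO adapts; the function name and the `beta` value are illustrative, and in the diffusion setting the exact log-likelihoods below are intractable and are replaced by denoising-loss surrogates.

```python
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO pairwise preference loss (sketch).

    logp_w / logp_l:         log-likelihoods of the preferred / dispreferred
                             sample under the model being trained
    ref_logp_w / ref_logp_l: the same log-likelihoods under a frozen
                             reference model
    beta:                    implicit KL-regularization strength (illustrative)
    """
    # Implicit reward = beta * (log-likelihood ratio against the reference)
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Maximize the probability that the human-preferred sample wins
    return -F.logsigmoid(margin).mean()
```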
New way to learn deep energy models on high-dimensional data combining score matching with random projections. Same theoretical guarantees but much faster in practice! @YSongStanford's talk is today at UAI at 11:40. Blog post:
1
33
150
Check out our new research blog 🎙🎙First post on “Hierarchical Generative Models” by Jiaming Song @baaadas and Shengjia Zhao @shengjia_zhao Comments welcome!
0
44
135
They’re here. 🔥 Inception’s diffusion LLMs: lightning fast, state-of-the-art, and now public. Go build the future → #GenAI #dLLMs #diffusion
We are launching our API in open beta! Visit the Inception Platform to create your account and get started using the first commercial-scale diffusion large language models (dLLMs).
1
15
136
5 papers accepted at #ICML2019 👏 Congrats to my students and coauthors 👍 @adityagrover_ @kristyechoi @yulantao1996 @shengjia_zhao @baaadas @volkuleshov. Graph Generative Modeling: Neural Joint Source-Channel Coding:
0
5
124
Very excited about this work! A principled way to handle boundaries (e.g., enforce pixel values to be in [0, 255]) in diffusion models that is elegant, scalable, and improves performance in practice.
Presenting Reflected Diffusion Models w/ @StefanoErmon! Diffusion models should reverse an SDE, but common hacks break this. We provide a fix through a general framework. Arxiv: Github: Blog: 🧵(1/n)
2
19
115
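For intuition on the boundary issue in the thread above: a toy sketch (not the paper's algorithm) contrasting the common thresholding hack with a reflection step that keeps an Euler-Maruyama update inside [0, 1]; the score function is assumed given.

```python
import numpy as np

def reflect_into_unit_interval(x):
    """Fold points back into [0, 1] by mirroring at the boundaries,
    the way a reflected Brownian path stays inside the domain."""
    x = np.mod(x, 2.0)                    # reflected trajectories have period 2
    return np.where(x > 1.0, 2.0 - x, x)

def em_step_reflected(x, score, t, dt, rng):
    """One Euler-Maruyama step of a toy reverse SDE, then reflect.
    The usual hack is np.clip(x_new, 0.0, 1.0), which piles probability
    mass exactly on the boundary and breaks the reverse SDE."""
    x_new = x + score(x, t) * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
    return reflect_into_unit_interval(x_new)
```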
I am truly honored. Looking forward to seeing you all in Stockholm this summer. @IJCAI_ECAI_18
A great pleasure to announce: congratulations to Stefano Ermon, @ermonste, Stanford University, the winner of the 2018 IJCAI Computers and Thought Award.
5
4
89
Nice unification of various guidance methods for diffusion models, and an extensive benchmark!
💡 From prediction to generation: Training-Free Guidance for Diffusion (NeurIPS Spotlight). How do we use any off-the-shelf predictor and unconditional diffusion sampler for conditional generation without any training? Our framework, TFG, boosts performance across 7 models,
0
6
89
New blog post by Aditya Grover @adityagrover_ on our #AISTATS paper “Uncertainty Autoencoders: Learning Compressed Representations via Variational Information Maximization” 🎙🎙 Paper 👉 Check out our poster today 15:40-18:40 @ Foyer
1
28
86
New blog post on our #AAAI2018 paper on approximate inference using Rademacher complexity. Surprising connections between inference and learning theory! 👉
1
29
85
How Machine Learning helps with economic growth, food security, and access to infrastructure in developing countries 🌍 Joint work with @DavidBLobell @MarshallBBurke #ArtificialIntelligence #MachineLearning.
Could machine learning put impoverished communities back on the map? By discerning patterns in satellite imagery, researchers hope to help national leaders and international agencies assist poverty-stricken regions.
0
19
81
Thrilled to be at @DeepIndaba in Nairobi! Today I will present our work on AI for sustainable development 🌍 with @atlasai_co & @RockefellerFdn 📡 14:30-16:00 @ EBR 100 #DLIndaba2019 #SautiYetu
1
14
73
New blog post on our #AAAI2018 FlowGAN paper by @adityagrover_ and Manik Dhar: surprising results comparing GANs vs. mixtures of Gaussians! (btw Manik is applying to PhD programs - check out his folder if you are hiring)
0
28
69
We figured out how to use parallel-in-time solvers (Picard) to accelerate arbitrary computation graphs. You trade parallel compute (use more GPUs) to reduce wallclock time, adding concurrency to otherwise sequential computations. Big speedups on text-to-3D generative models!
📢Text-to-3D generation via score distillation (DreamFusion, ProlificDreamer, etc.) produces high-quality 3D assets, but can take up to 10 hours to run. We present an acceleration method for all existing approaches based on score distillation that achieves up to a 4.7x speedup.
0
8
62
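A rough sketch of the parallel-in-time idea in the tweet above (illustrative, not the paper's implementation): a sequential chain x_{k+1} = f(x_k, k) is the fixed point of a map that updates all steps simultaneously, so each Picard sweep is embarrassingly parallel, and the number of sweeps needed in practice is often far below the chain length K.

```python
def picard_rollout(f, x0, K, sweeps):
    """Approximate the sequential chain x_{k+1} = f(x_k, k) by refining a
    guess for the WHOLE trajectory at once (Picard / Jacobi iteration).

    Every f(xs[k], k) inside one sweep is independent of the others, so a
    sweep can be dispatched across GPUs; after at most K sweeps the result
    matches the sequential rollout exactly, and contractive maps converge
    much sooner. More parallel compute, less wallclock time.
    """
    xs = [x0] * (K + 1)                        # crude initial trajectory guess
    for _ in range(sweeps):
        xs = [x0] + [f(xs[k], k) for k in range(K)]
    return xs
```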
At @Stanford's 128th commencement🎓Congratulations to my Ph.D. student @nealjean1 👏 So proud of you!
0
0
63
Amazing work by my student @MichaelPoli6 on the cover of Science! Fantastic collaboration.
An absolute privilege to see our work on Evo🧬 highlighted on the cover of the latest issue of Science. Thank you to all the friends and collaborators at Stanford (@StanfordAILab) and the Arc Institute (@arcinstitute) @exnx @BrianHie @pdhsu @HazyResearch @StefanoErmon and more.
3
2
63
Great work led by my student @linqi_zhou!
📢 Diffusion models (DMs) generate samples from a noise distribution, but for tasks such as image-to-image translation, one side is no longer noise. We present Denoising Diffusion Bridge Models, a simple and scalable extension to DMs suitable for distribution translation problems.
3
6
63
Come to our #AAAI19 Oral Wed 11:30am @ Coral 2 🎙️ Tile2Vec: Unsupervised Representation Learning for Spatially Distributed Data 🗺️🗺️ Without any labeled data, performance comparable to supervised CNNs trained on 50k+ labels for a land cover classification task
0
16
57
New blog post on our #AISTATS2018 paper on variational rejection sampling with @adityagrover_. New differentiable rejection sampling step to tighten the ELBO as much as desired. Bonus: ELBO gradients with unnormalized proposals. 👉
2
17
53
Check out our #AAAI19 spotlight by @shengjia_zhao, 11:30am @ Kahili 🎙️ InfoVAE: Balancing Learning and Inference in Variational Autoencoders 🎙️ Poster today 6:30pm 🎙️ Tutorial on simple implementation & more informative features in VAEs
0
10
54
@sedielem Another fun one from @Andrea__M : diffusion models as stochastic localization algorithms!
1
4
47
Enroll in this course for a rigorous introduction to the fundamentals of Generative AI.
Unlock the power of generative models in our upcoming course! Deep Generative Models taught by @StefanoErmon starts Jan. 27 and runs for 10 weeks. ➡️ Enroll now:
1
2
45
Are you using ML predictions to guide downstream decision-making? Check out Shengjia's paper to learn how existing notions of calibration affect performance, and how to pick the right one.
🚨[New Paper] Should you trust calibrated predictions for high-stakes decisions? Check out to see what calibration actually means for decision-making, and a new decision-tailored calibration notion. With @mikekimbackward, Roshni, @tengyuma, @StefanoErmon
0
5
44
We analyzed what LLMs know about the world, uncovering more problematic biases.
How much do LLMs like ChatGPT and Gemini know about where you live? How biased are they? Do they think Europeans are more intelligent than Africans? What about attractiveness or morality? Here is the answer. 🚨 LLMs are Geographically Biased 🚨 1/6
1
5
37
Congratulations to my students Jiaming Song @baaadas and Shengjia Zhao @shengjia_zhao for winning the 2018 @Qualcomm Innovation Fellowship 👏👏👏
1
2
38
Excited by @cartesia_ai's release! Text2speech is just the beginning; their new architectures are going to revolutionize this space!
Today, we’re excited to release the first step in our mission to build real time multimodal intelligence for every device: Sonic, a blazing fast (🚀 135ms model latency), lifelike generative voice model and API. Read and try Sonic
1
2
36
Congratulations on the launch @pika_labs! Amazing work!
Introducing Pika 1.0, the idea-to-video platform that brings your creativity to life. Create and edit your videos with AI. Rolling out to new users on web and discord, starting today. Sign up at
0
0
37
Come to our #NeurIPS2018 spotlight today 4:05PM @ Room 220 E. @shengjia_zhao will present our work "Bias and Generalization in Deep Generative Models". Poster today 5PM @ Room 210 #6. Check out the paper 👉
0
5
36
@NandoDF We've been exploring IL techniques for this:
1) Modify MLE training (BC) to encourage samples to stay close to the data.
2) Add a "backspace" action: the policy/LLM backtracks if the sample is scored low by the reward/discriminator.
Of course much more to do!
0
4
33
Excited to see Atlas AI pushing the envelope of AI-driven economic estimates from satellite images! We are getting one step closer to a digital twin of the world's economy.
Proud to share @atlasai_co's newly announced $5.5 million collaboration with e-GUIDE, aimed at accelerating #EconomicDevelopment and promoting #ClimateResilient #infrastructure investment across sub-Saharan Africa.
0
5
33
Excited about this work!
1) Replaces MLE training (= behavior cloning) to encourage generated samples to stay close to demonstrations.
2) Adds a "backspace" action: the policy/LLM backtracks if the sample is scored low by the (implicitly learned) reward!
Introducing *SequenceMatch*, training LLMs with an imitation learning loss. Avoids compounding error in generation by:
1. Training against *different divergences* like χ² with more support OOD.
2. Adding a *backspace* action: the model can correct errors!
1/7
0
8
31
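To make the backspace action in the two posts above concrete, here is a toy decoding loop; `BACKSPACE` and `sample_next` are hypothetical stand-ins for illustration, not the SequenceMatch API.

```python
BACKSPACE = "<bksp>"  # hypothetical extra action appended to the vocabulary

def generate_with_backspace(sample_next, seq, max_steps=100):
    """Toy decoding loop with an error-correcting backspace action.

    sample_next(seq) -> next token, possibly BACKSPACE. A policy trained
    to emit BACKSPACE on low-reward continuations can undo an early
    mistake instead of compounding it for the rest of the sequence.
    """
    for _ in range(max_steps):
        tok = sample_next(seq)
        if tok == BACKSPACE:
            seq = seq[:-1] if seq else seq    # pop the previous token
        else:
            seq = seq + [tok]
    return seq
```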
@emiel_hoogeboom Nice work! You might be interested in Consistency Trajectory Models (CTM, accepted at ICLR-24); I believe it's the same idea, and it gets even better distillation results on ImageNet64 (1.79 FID in 2 steps)!
2
3
32
Congratulations!!!
We've raised a $64M Series A led by @kleinerperkins to build the platform for real-time voice AI. We'll use this funding to expand our team, and to build the next generation of models, infrastructure, and products for voice, starting with Sonic 2.0, available today. Link below
0
1
26
Congratulations!!!
Thrilled to share that my PhD dissertation won the ACM SIGKDD Dissertation Award for "outstanding work in data science and machine learning". Thanks to everyone involved, especially my advisor @StefanoErmon & @StanfordAILab!.
1
0
23
Thanks for testing the models! Excited to hear the results are so strong!
Our full results for the @InceptionAILabs Mercury model. Key takeaways:
🧮 Seems to do extremely well on mathematics and coding.
⏩ Is extremely fast, with low latency on most evals.
Again, a great job by the @InceptionAILabs team!
2
2
27
Please stop by our 📢#NIPS2017 Poster #199 tonight on InfoGAIL 📢 w/ Yunzhu Li and @baaadas. #selfdriving from raw visual inputs and disentangled representations of behaviors. Learned with implicit generative models and #DeepLearning RL. 🏎️… 🏎️
0
2
23
New blog post by Pratyusha Kalluri on our #AISTATS paper “Learning Controllable Fair Representations” Paper 👉 Check out our poster today 15:50-18:50 @ Ryugu by Jiaming Song @baaadas @adityagrover_ @shengjia_zhao
0
4
23
Going beyond national-level statistics with machine learning! Inspiring blog post by @MarshallBBurke @ApoorvaRed @atlasai_co.
0
3
17
Great article by my student @radical_ai_ on how AI is shifting power, published in Nature today.
0
2
15
Delighted to be speaking today at the ML for Developing World workshop at #NIPS2017. Come check out our latest work at 3:30 in Seaside7.
0
1
10
@berty38 Hoeffding-like bounds that hold even when you adaptively decide when to stop (with a negligible doubly logarithmic slack):.
0
2
5
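The reply above presumably refers to iterated-logarithm-type confidence sequences; as a hedged sketch, an anytime-valid radius for the mean of bounded samples has roughly the shape below (the constant is illustrative, not from the linked work).

```python
import math

def lil_radius(n, delta, c=1.7):
    """Confidence radius of iterated-logarithm type for [0, 1]-valued samples:
    roughly sqrt((log log n + log(1/delta)) / n). Compared with a fixed-n
    Hoeffding bound, validity at EVERY adaptively chosen stopping time costs
    only the doubly logarithmic term. The constant c is illustrative."""
    n = max(n, 3)  # log log n requires n > e
    return math.sqrt(c * (math.log(math.log(n)) + math.log(1.0 / delta)) / n)

# Usage: with running mean m after n samples, the interval [m - r, m + r],
# r = lil_radius(n, 0.05), stays valid even if you decide to stop adaptively.
```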
@zuess05 @InceptionAILabs We’re thrilled to hear you’re enjoying it. If you have any other feedback or thoughts, feel free to share.
2
0
3
@roydanroy @ceobillionaire @YSongStanford We derived it from a different perspective in the paper. It's equivalent to Hutchinson when you integrate a term analytically. See discussion in the paper for details.
1
0
3
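For context on the exchange above: Hutchinson's estimator approximates a trace, here the trace of the score network's Jacobian that sliced score matching needs, using random probe vectors. A minimal sketch assuming a Jacobian-vector product `jvp` is available; the names are illustrative.

```python
import numpy as np

def hutchinson_trace(jvp, dim, n_probes=100, rng=None):
    """Estimate tr(A) via E[v^T A v] without ever forming A.

    jvp(v) -> A @ v, e.g. a Jacobian-vector product of the score network.
    Rademacher probes (entries +/-1) give an unbiased estimate because
    E[v v^T] = I; averaging over probes reduces the variance.
    """
    rng = rng or np.random.default_rng(0)
    estimates = []
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        estimates.append(v @ jvp(v))
    return float(np.mean(estimates))
```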