
Michael Churchill (@ChurchillMic)
Fusion energy computational research engineer/physicist, head of digital engineering @PPPLab #fusion #AI
Princeton, NJ · Joined March 2013
Followers: 748 · Following: 16K · Media: 151 · Statuses: 6K
Excited to announce a new track for accelerating generative AI: pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation https://t.co/6ro55E1XGP Distill 20B flow models using just an L2 loss via imitation learning, for SOTA diversity and teacher-aligned quality.
2 · 13 · 77
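A minimal sketch of what distilling a flow model with a plain L2 imitation loss could look like; the teacher rollout, the single-step student, and all names below are illustrative assumptions, not the pi-Flow objective from the linked paper.

```python
# Illustrative sketch (not the authors' code): distill a many-step teacher flow model
# into a one/few-step student by matching the teacher's trajectory endpoint with an
# L2 (MSE) loss. Teacher and student are any modules with forward(x, t) -> velocity.
import torch
import torch.nn as nn

def imitation_distill_loss(teacher: nn.Module, student: nn.Module,
                           noise: torch.Tensor, n_teacher_steps: int = 32) -> torch.Tensor:
    with torch.no_grad():
        x = noise
        for i in range(n_teacher_steps):                      # cheap Euler rollout of the teacher ODE
            t = torch.full((x.shape[0],), i / n_teacher_steps, device=x.device)
            x = x + teacher(x, t) / n_teacher_steps
        teacher_endpoint = x                                  # target the student must imitate
    t0 = torch.zeros(noise.shape[0], device=noise.device)
    student_endpoint = noise + student(noise, t0)             # one big student step from the same noise
    return torch.mean((student_endpoint - teacher_endpoint) ** 2)   # just an L2 loss
```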
Introducing General Intuition and our $133.7M Seed from Khosla Ventures, General Catalyst, and Raine. We build foundation models and general agents for environments that require deep spatial and temporal reasoning.
91 · 89 · 1K
No more reliance on VAE or DINO! Similar to the motivation of RAE by @sainingxie's team, we propose EPG: SSL pre-training + end-to-end FT = SOTA FID on IN256/512! Works nicely for both DM and CM https://t.co/a2q4hKcBwB (1/n)🧵 Next: why is training DMs on raw pixels difficult?
10 · 41 · 257
Excited to start OpenAI for Physics w/ @ALupsasca @kevinweil @aleks_madry and @merettm! I sat with @ALupsasca when GPT-5 reproduced his latest research paper, and we both felt parallels to watching AlphaGo play move 37. It's nearly impossible to be a world class chess player
After GPT-5 Pro launched, I gave it that same problem. To my utter shock, it rediscovered the result in <30min! See for yourself: https://t.co/IpLuaGlJ03 It’s not flawless (it needs priming on the flat-space case before tackling the full problem) but the leap is incredible.
32 · 89 · 1K
We’re announcing a research collaboration with @CFS_energy, one of the world’s leading nuclear fusion companies. Together, we’re helping speed up the development of clean, safe, limitless fusion power with AI. ⚛️
73 · 508 · 3K
We trained OmniVLA, a robotic foundation model for navigation conditioned on language, goal poses, and images. Initialized with OpenVLA, it leverages Internet-scale knowledge for strong OOD performance. Great collaboration with @CatGlossop, @shahdhruv_, and @svlevine.
6 · 63 · 326
T2I models excel at realism, but true creativity means generating what doesn't exist yet. How do you prompt for something you can't describe? 🎨 We introduce VLM-Guided Adaptive Negative Prompting: an inference-time method that promotes creative image generation. 1/6
4 · 43 · 159
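One plausible reading of "VLM-guided adaptive negative prompting", sketched under the assumption that a VLM names the familiar concepts in each sample and those names are pushed into the negative prompt; the paper's actual procedure may differ, and both callables below are placeholders.

```python
# Hedged sketch: iterate generate -> VLM critique -> grow the negative prompt, so later
# samples are steered away from concepts the model keeps falling back on.
# `generate_image` and `vlm_list_concepts` are placeholder callables, not a real API.
from typing import Any, Callable, List, Tuple

def adaptive_negative_prompting(prompt: str,
                                generate_image: Callable[[str, str], Any],
                                vlm_list_concepts: Callable[[Any], List[str]],
                                rounds: int = 3) -> Tuple[Any, List[str]]:
    negative_terms: List[str] = []
    image = None
    for _ in range(rounds):
        image = generate_image(prompt, ", ".join(negative_terms))  # text-to-image call
        for concept in vlm_list_concepts(image):                   # VLM names dominant, familiar concepts
            if concept not in negative_terms:
                negative_terms.append(concept)                     # suppress them next round
    return image, negative_terms
```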
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests,
544 · 3K · 22K
Excited about our new results on flow matching with formal constraints! See LinkedIn post for more details: https://t.co/y3miWBnBLO
2 · 22 · 146
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
55 · 325 · 2K
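A rough sketch of the RAE idea as stated in the tweet, under the assumption that a frozen pretrained representation encoder replaces the VAE encoder and only a decoder back to pixels is trained; module names are stand-ins, not the authors' code.

```python
# Rough sketch, assuming "Representation Autoencoder" means: freeze a pretrained
# representation model as the encoder, train only a decoder back to pixels, and let
# the diffusion transformer operate in the encoder's latent space instead of a VAE's.
import torch
import torch.nn as nn

class RepresentationAutoencoder(nn.Module):
    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder.eval()                 # frozen pretrained representation encoder
        for p in self.encoder.parameters():
            p.requires_grad_(False)
        self.decoder = decoder                        # only the decoder is trained

    def encode(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            return self.encoder(images)               # latents = frozen representations

    def reconstruction_loss(self, images: torch.Tensor) -> torch.Tensor:
        recon = self.decoder(self.encode(images))     # decode latents back to pixels
        return torch.mean((recon - images) ** 2)      # simple pixel MSE for illustration
```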
We cut the cost of training a diffusion model from months of rent to one night out. TREAD matches ImageNet performance of a DiT with 97% fewer A100 hours! No extra components. No extra losses. Training‑time only. Inference remains unchanged. Accepted at ICCV2025🌺
14 · 83 · 814
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
633 · 3K · 23K
Introducing Geometry-aware Policy Imitation (GPI)! GPI constructs an energy landscape over the state space using demonstrations. A policy acts in the environment by following the gradient of the landscape. This enables fast multimodal policies with very fast inference (<1 ms)!
🎉 Excited to share Geometry-aware Policy Imitation (GPI): A simple, efficient, and interpretable approach for imitation learning. Delivers multimodal skills, stronger performance, 20–100× faster inference (<1 ms), and orders-of-magnitude less memory. https://t.co/YEUaiYwuQd
6 · 46 · 404
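A minimal sketch of the mechanism the tweet describes: build an energy landscape over states from demonstrations and act by descending its gradient. The soft-min distance energy and step rule are illustrative assumptions, not the paper's exact construction.

```python
# Sketch, assuming the energy is a smooth minimum distance to demonstration states;
# acting is then a single autograd gradient step, which is why inference can be so fast.
import torch

def demo_energy(state: torch.Tensor, demo_states: torch.Tensor, temp: float = 0.1) -> torch.Tensor:
    sq_dists = torch.sum((demo_states - state) ** 2, dim=-1)     # distance to every demo state
    return -temp * torch.logsumexp(-sq_dists / temp, dim=0)      # soft minimum = energy landscape

def gpi_action(state: torch.Tensor, demo_states: torch.Tensor, step_size: float = 0.05) -> torch.Tensor:
    s = state.clone().requires_grad_(True)
    (grad,) = torch.autograd.grad(demo_energy(s, demo_states), s)
    return -step_size * grad                                     # follow the landscape downhill
```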
Multimodal gets a twist
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models "We introduce UML: Unpaired Multimodal Learner, a modality-agnostic training paradigm in which a single model alternately processes inputs from different modalities while sharing parameters across
0 · 0 · 0
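A sketch of the quoted idea, assuming "unpaired multimodal learning" here means one shared trunk that alternately consumes batches from different modalities through small modality-specific input layers; sizes and the toy loop are illustrative.

```python
# Sketch, assuming the core of UML is parameter sharing: small per-modality input
# projections feed one shared trunk, and unpaired batches from each modality take
# turns updating the shared weights. Dimensions and the toy data are made up.
import torch
import torch.nn as nn

class UnpairedMultimodalLearner(nn.Module):
    def __init__(self, input_dims: dict, hidden: int = 512, n_classes: int = 10):
        super().__init__()
        self.proj = nn.ModuleDict({m: nn.Linear(d, hidden) for m, d in input_dims.items()})
        self.trunk = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())   # shared across modalities
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        return self.head(self.trunk(self.proj[modality](x)))

model = UnpairedMultimodalLearner({"image": 768, "text": 384})
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for modality, dim in [("image", 768), ("text", 384)]:           # alternate unpaired batches
    x, y = torch.randn(8, dim), torch.randint(0, 10, (8,))
    loss = nn.functional.cross_entropy(model(x, modality), y)
    opt.zero_grad(); loss.backward(); opt.step()
```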
I forgot about this tweet but read this top tier paper and get ultra agi pilled.
A few questions for those who are following AlphaEvolve and FunSearch:
* is anyone reproducing it?
* very relevant to diverse data generation in verifiable domains?
* one step away from a new paradigm beyond current thinking: “solve this problem under x constraint”?
1. Makes use of
5 · 19 · 258
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M parameters neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper:
arxiv.org: Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language Models (LLMs) on...
134 · 634 · 4K
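A hedged sketch of "recursive reasoning with a tiny network" in the spirit of the TRM/HRM description above: one small module is applied repeatedly to refine a latent answer. The GRU-style update and sizes are illustrative assumptions.

```python
# Illustrative only: a tiny shared cell refines a latent "scratchpad" over many
# recursion steps, instead of a large feed-forward model answering in one pass.
import torch
import torch.nn as nn

class TinyRecursiveReasoner(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)        # the same tiny module is reused every step
        self.readout = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, n_steps: int = 16) -> torch.Tensor:
        z = torch.zeros_like(x)                 # latent scratchpad
        for _ in range(n_steps):                # recursion: identical parameters, refined state
            z = self.cell(x, z)
        return self.readout(z)                  # answer read out after recursion

out = TinyRecursiveReasoner()(torch.randn(4, 128))   # (4, 128) refined outputs
```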
I am pleased to announce our new paper, which provides an extremely sample-efficient way to create an agent that can perform well in multi-agent, partially-observed, symbolic environments. The key idea is to use LLM-powered code synthesis to learn a code world model (in the form
16 · 98 · 778
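A sketch of the idea described in the tweet, assuming the "code world model" is an executable transition function proposed by an LLM and kept only if it reproduces observed transitions; `llm_generate_code` is a placeholder, not a real API.

```python
# Hedged sketch: ask an LLM for candidate step(state, action) -> next_state functions,
# execute them, and keep the candidate that best reproduces logged transitions.
from typing import Any, Callable, Dict, List, Optional, Tuple

Transition = Tuple[Dict[str, Any], str, Dict[str, Any]]   # (state, action, next_state)

def learn_code_world_model(llm_generate_code: Callable[[str], str],
                           transitions: List[Transition],
                           n_candidates: int = 8) -> Optional[Callable]:
    best_fn, best_score = None, -1
    for _ in range(n_candidates):
        source = llm_generate_code("Write step(state, action) -> next_state for this environment.")
        scope: Dict[str, Any] = {}
        try:
            exec(source, scope)                            # candidate world model as real code
            step = scope["step"]
            score = sum(step(s, a) == s_next for s, a, s_next in transitions)
        except Exception:
            continue                                       # broken candidates are simply discarded
        if score > best_score:
            best_fn, best_score = step, score
    return best_fn                                         # most faithful executable model found
```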
Consistency models, CTMs, shortcut models, align your flow, mean flow... What's the connection, and how should you learn them in practice? We show they're all different sides of the same coin connected by one central object: the flow map. https://t.co/QBp1kELVhF 🧵(1/n)
5 · 68 · 338
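For reference, a compact statement of the "flow map" object the thread points to (notation is mine, not necessarily the paper's): the map that carries a sample along the probability-flow ODE from one time to another, with the listed methods recovered as different ways of learning it.

```latex
% Notation is an assumption, not the paper's. If v_t is the velocity field of the
% probability-flow ODE dx_t/dt = v_t(x_t), the flow map X_{s,t} carries a sample from
% time s to time t along that ODE:
\[
  X_{s,t}(x) \;=\; x + \int_s^t v_u\bigl(X_{s,u}(x)\bigr)\,du,
  \qquad X_{s,s}(x) = x .
\]
% Few-step samplers then differ in how they learn X_{s,t}: consistency-style models
% target the jump to the endpoint X_{t,0}, mean-flow-style models target the average
% velocity (X_{s,t}(x) - x)/(t - s), and shortcut models learn X_{s,t} for a discrete
% set of step sizes t - s.
```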