Yutong (Kelly) He Profile
Yutong (Kelly) He

@electronickale

Followers
2K
Following
1K
Media
23
Statuses
173

PhD student @mldcmu, I’m so delusional that doing generative modeling is my job

Pittsburgh, PA
Joined March 2021
@electronickale
Yutong (Kelly) He
11 days
Diffusion/Flow-based models can sample in 1-2 steps now 👍 But likelihood? Still requires 100-1000 NFEs (even for these fast models) 😭 We fix this! Introducing F2D2: simultaneous fast sampling AND fast likelihood via joint flow map distillation. https://t.co/FFfqWnLIwu 1/🧵
9
72
420
@ThomasTCKZhang
Thomas Zhang
4 days
🤖🤖Very excited to finally share our new work “Action Chunking and Exploratory Data Collection Yield Exponential Improvements in Behavior Cloning for Continuous Control” Everyone in robotics does action-chunking, but why does it actually work?🤔🤔And, what can theory tell us
5
63
387
@_albertgu
Albert Gu
8 days
quite belated, but we finally uploaded "ARC-AGI Without Pretraining" to arXiv (link in reply). Very impressive project by @LiaoIsaac91893 when he was just a first-year PhD! He drove this entire project from beginning to end while I ate 🍿. At NeurIPS last week, Isaac was
@arcprize
ARC Prize
12 days
ARC Prize 2025 Winners Interviews Paper Award 3rd Place @LiaoIsaac91893 shares the story behind CompressARC - an MDL-based, single puzzle-trained neural code golf system that achieves ~20–34% on ARC-AGI-1 and ~4% on ARC-AGI-2 without any pretraining or external data.
6
14
198
@electronickale
Yutong (Kelly) He
11 days
This was a fun project with @KeelyAi04 (amazing undergrad applying to grad schools) @_albertgu @rsalakhu @zicokolter @nmboffi @max_simchowitz. We hope this unlocks new possibilities for flow-based models! Paper: https://t.co/FFfqWnLIwu Code:
github.com
Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models - Keely-Ai/F2D2
0
0
20
@electronickale
Yutong (Kelly) He
11 days
And we don't sacrifice sample quality! F2D2 variants maintain competitive FID while adding accurate likelihood. Even better: fast likelihood unlocks new tricks. With maximum likelihood self-guidance, we enable a 2-step MeanFlow to outperform 1024 step flow matching in FID 🤯
1
0
13
@electronickale
Yutong (Kelly) He
11 days
We tested F2D2 on CIFAR-10, ImageNet 64×64, and 2D synthetic data. Without F2D2, previous models fail to obtain valid likelihood estimations (negative BPD 💀) with few steps. With F2D2, we get calibrated likelihood close to flow matching with 100-1000 steps, using only 1-8 steps.
1
0
12
@electronickale
Yutong (Kelly) He
11 days
Best part? It's plug-and-play with any existing flow map model (Shortcut, MeanFlow, etc.). Just add a divergence head to the existing model. That's it. Shared backbone, one head for sampling, one head for likelihood. Train from scratch or finetune from pre-trained models, your call!
1
0
15
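The "divergence head" idea above can be sketched as a shared trunk with two output heads: one predicting the flow-map endpoint (for sampling), one predicting the cumulative divergence (for likelihood). Everything below — shapes, names, random weights — is an illustrative assumption, not F2D2's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared backbone plus two heads; sizes are placeholders for illustration.
W_trunk  = rng.normal(size=(3, 16)) * 0.1   # input: 2-d state x concatenated with time t
W_sample = rng.normal(size=(16, 2)) * 0.1   # head 1: predicted flow-map endpoint
W_div    = rng.normal(size=(16, 1)) * 0.1   # head 2: predicted cumulative divergence

def two_head_flow_map(x, t):
    # Shared features feed both heads, so likelihood comes almost for free.
    h = np.tanh(np.concatenate([x, [t]]) @ W_trunk)
    endpoint = h @ W_sample            # used for sampling
    cumdiv = (h @ W_div).item()        # used for likelihood correction
    return endpoint, cumdiv

endpoint, cumdiv = two_head_flow_map(np.zeros(2), 0.5)
```

The point of the shared trunk is that a pre-trained flow map can be finetuned by attaching only the small divergence head, which matches the "train from scratch or finetune" claim in the tweet.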
@electronickale
Yutong (Kelly) He
11 days
So we built F2D2 using flow maps, which skip slow ODE integration by learning to predict endpoints directly. F2D2 extends vanilla flow maps to the coupled system above: one model, jointly (self-)distilled to predict both sampling trajectory and cumulative divergence in parallel.
1
0
14
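A flow map replaces step-by-step ODE integration with a learned endpoint map f(x_t, t, s) ≈ x_s; F2D2 pairs it with a cumulative-divergence map so likelihood also needs no extra integration. A minimal sketch for the toy field v(x) = -x, where both maps happen to be known in closed form (the toy field and function names are assumptions for illustration, not the paper's learned model):

```python
import numpy as np

DIM = 2

def flow_map(x_t, t, s):
    # Exact endpoint map for the toy field v(x) = -x: x(s) = x(t) * exp(-(s - t)).
    return x_t * np.exp(-(s - t))

def cum_div(x_t, t, s):
    # Cumulative divergence of v(x) = -x over [t, s]: integral of -DIM du.
    return -DIM * (s - t)

# One-step "sampling": jump straight from t=0 to s=1, no ODE solver.
x0 = np.ones(DIM)
x1 = flow_map(x0, 0.0, 1.0)

# One-step likelihood correction: delta log p = -(cumulative divergence).
delta_logp = -cum_div(x0, 0.0, 1.0)

# Reference: fine-grained Euler integration of the same coupled system.
x_ref, logp_ref = np.ones(DIM), 0.0
dt = 1e-4
for _ in range(10_000):
    logp_ref += DIM * dt        # -div v = DIM for this field
    x_ref = x_ref - x_ref * dt
```

The one-step jump and the 10,000-step solver agree, which is exactly the property a distilled flow map is trained to have for a learned velocity field.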
@electronickale
Yutong (Kelly) He
11 days
Turns out the solution was right in front of us: When computing likelihood in CNFs, you already get a coupled system of ODEs. Sampling and log-likelihood trajectory evolve together depending on the same velocity field. So why not just distill both together?
1
1
16
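The coupled system referenced here is the instantaneous change-of-variables formula from continuous normalizing flows: the sample follows dx/dt = v(x, t) while the log-density accumulates -div v along the same trajectory. A minimal sketch with a toy linear field (the `velocity` and `divergence` functions are illustrative assumptions, not the paper's model):

```python
import numpy as np

def velocity(x, t):
    # Toy linear velocity field v(x, t) = -x, chosen so the divergence is analytic.
    return -x

def divergence(x, t):
    # Divergence of v(x) = -x is -dim(x).
    return -float(x.shape[-1])

def integrate_coupled(x0, n_steps=100):
    """Euler-integrate the coupled CNF system:
       dx/dt = v(x, t),   d log p / dt = -div v(x, t).
    Both quantities evolve together off the same velocity field."""
    x = x0.copy()
    delta_logp = 0.0
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        delta_logp += -divergence(x, t) * dt
        x = x + velocity(x, t) * dt
    return x, delta_logp

x1, dlogp = integrate_coupled(np.zeros(2))
```

Because sampling and log-likelihood share one trajectory, distilling only the x-part (as standard flow maps do) throws away the likelihood half — which is the motivation for distilling both jointly.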
@electronickale
Yutong (Kelly) He
11 days
Why should you care about likelihood? If you're doing RL finetuning (PPO/GRPO) → you need it If you're hypothesis testing → you need it If you're doing cool applications like drug discovery → you really need it But right now for diffusion/flow it's 100x slower than sampling 🥲
2
0
20
@LiaoIsaac91893
Isaac Liao
15 days
Paper version + video interview for ARC-AGI Without Pretraining are now available! 📄Paper: https://t.co/XTOQHh4fzC 🎥Video interview:
1
6
34
@electronickale
Yutong (Kelly) He
21 days
To scale this class up, we need compute resources so that all students can train their models. I'll be at #NeurIPS 12/1-12/6, and if you're interested in sponsoring compute, I'd love to connect! Please DM me or grab me there 🙏🙏🙏! Course website:
2
2
27
@electronickale
Yutong (Kelly) He
21 days
Beyond the class itself, this is also an experiment in AI-native education and an attempt to solve the "holy grail" problem. I'm documenting the entire process and will share everything we learn publicly. I hope this can contribute to a clearer roadmap for teaching in the age of AI
1
0
14
@electronickale
Yutong (Kelly) He
21 days
This is CMU's first course dedicated entirely to diffusion & flow matching, designed for 20 students but 139 signed up! We're scaling it to fit more people in-person and open-sourcing everything: slides, homework, and lecture recordings, so anyone in the world can learn with us!
1
0
18
@electronickale
Yutong (Kelly) He
21 days
In this class, students will build complete image generation systems from scratch via cumulative homework. They can choose their specialization (fidelity/speed/controllability) and tackle it with their own creativity. No exams and open everything: AI tools, open-source code, etc
1
1
9
@electronickale
Yutong (Kelly) He
21 days
I've always wanted to teach diffusion & flow matching, a math-heavy and often intimidating topic. LLMs can do math now, so traditional classes ❌ My take: "What I cannot create, I do not understand." Learning by building is robust even with AI. The key is what we ask them to build
2
0
8
@electronickale
Yutong (Kelly) He
21 days
This idea started at a group meeting where my advisor @zicokolter posed what he called the "holy grail" of education today: how do we ensure students are actually learning when AI can do everything for them? He and @rsalakhu encouraged me to find out by teaching a class @mldcmu
1
0
13
@electronickale
Yutong (Kelly) He
21 days
I'm teaching a diffusion & flow matching class at CMU in Spring 2026 where students can use ChatGPT, Cursor, or any AI tool they want. No exams. Just build with open internet. 139 students signed up for 20 spots. Here's what's happening: 🧵 https://t.co/t74V81OGiZ
25
58
380
@electronickale
Yutong (Kelly) He
22 days
Well, really didn’t expect this to age so well within a day but here we are
@electronickale
Yutong (Kelly) He
24 days
Doing ICLR and TMLR rebuttals at the same time is such a crazy experience. For ICLR, I only got 2/7 reviewers to look at my rebuttal. For TMLR, I got months-long discussions, and my AC even went out of their way to consult additional experts just to make sure my derivations were correct
2
0
15