Nate Gillman

@GillmanLab

Followers: 805 · Following: 295 · Media: 26 · Statuses: 131

ML researcher, interning @Google, PhD-ing @BrownUniversity. I train deep generative models

Joined August 2021
@GillmanLab
Nate Gillman
7 months
Ever wish you could turn your video generator into a controllable physics simulator? We're thrilled to introduce Force Prompting! Animate any image with physical forces and get fine-grained control, without needing any physics simulator or 3D assets at inference. 🧵(1/n)
8
70
317
@chenwangcw
Chen Wang @ NeurIPS 2025
8 days
Excited to share our #NeurIPS2025 paper: PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation. We propose a novel framework to improve the controllability and physics plausibility of video models. Project Page: https://t.co/MENLbVHsjo (1/n)
5
33
181
@Afinetheorem
Kevin A. Bryan
9 days
A fun theorem (critical for why much of machine learning works!): higher-dimensional surfaces have relatively more saddle points than local minima, so "roll the ball downhill" gradient descent works better with *bigger* models. Surprising if you haven't thought about this! 1/3
16
43
639
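The claim above is easy to check numerically. A minimal sketch (my own toy illustration, not from the thread): model the Hessian at a critical point as a random symmetric matrix and estimate how often every eigenvalue is positive, i.e. how often the critical point is a local minimum rather than a saddle.

# Toy illustration of "saddles dominate in high dimensions" (my own sketch,
# not from the thread): sample random symmetric "Hessians" and count how
# often all eigenvalues are positive.
import numpy as np

rng = np.random.default_rng(0)

def fraction_of_minima(dim, trials=20000):
    """Fraction of random symmetric matrices whose eigenvalues are all positive."""
    hits = 0
    for _ in range(trials):
        a = rng.standard_normal((dim, dim))
        hessian = (a + a.T) / 2                      # random symmetric matrix
        if np.all(np.linalg.eigvalsh(hessian) > 0):
            hits += 1
    return hits / trials

for d in (1, 2, 4, 6, 8, 10):
    print(f"dim={d:2d}  P(all eigenvalues > 0) ~ {fraction_of_minima(d):.4f}")

The fraction collapses rapidly with dimension, so on a high-dimensional loss surface almost every critical point is a saddle that gradient descent can slide off, rather than a local minimum it gets stuck in.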
@xxunhuang
Xun Huang
1 month
We present MotionStream — real-time, long-duration video generation that you can interactively control just by dragging your mouse. All videos here are raw, real-time screen captures without any post-processing. Model runs on a single H100 at 29 FPS and 0.4s latency.
36
150
1K
@sarahookr
Sara Hooker
1 month
Adaptable Intelligence. Multiple possible paths to an objective.
196
1K
16K
@StefanoErmon
Stefano Ermon
1 month
Tired of chasing references across dozens of papers? This monograph distills it all: the principles, intuition, and math behind diffusion models. Thrilled to share!
@JCJesseLai
Chieh-Hsin (Jesse) Lai
1 month
Tired of going back to the original papers again and again? Our monograph is a systematic, fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core
13
136
1K
@jianfengzhang95
Jianfeng Zhang
2 months
🚀 Thrilled to introduce Seed3D 1.0, a foundation model that generates High-Fidelity, Simulation-Ready 3D Assets directly from a Single Image! ✨ Key Capabilities: 1️⃣ High-fidelity Assets: Generates assets with accurate geometry, well-aligned textures, and physically-based
30
138
993
@GYanjiang
Yanjiang Guo
2 months
Rollouts in the real world are slow and expensive. What if we could rollout trajectories entirely inside a world model (WM)? Introducing 🚀Ctrl-World🚀, a generative manipulation WM that can interact with advanced VLA policy in imagination. 🧵1/6
5
39
210
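The core idea, rolling out a policy entirely inside a learned world model instead of the real robot, can be sketched generically. The interfaces below (WorldModel, Policy, imagined_rollout) are placeholder names of my own, not Ctrl-World's actual API.

# Generic sketch of "rollouts in imagination": a policy interacts with a
# learned dynamics model, never touching the real world.
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Predicts the next latent state from the current state and an action."""
    def __init__(self, state_dim=32, action_dim=7, hidden=256):
        super().__init__()
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )
    def step(self, state, action):
        return self.dynamics(torch.cat([state, action], dim=-1))

class Policy(nn.Module):
    """Maps a latent state to an action."""
    def __init__(self, state_dim=32, action_dim=7, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )
    def forward(self, state):
        return self.net(state)

@torch.no_grad()
def imagined_rollout(world_model, policy, init_state, horizon=50):
    """Unroll policy + world model with no real-world interaction."""
    states, actions = [init_state], []
    state = init_state
    for _ in range(horizon):
        action = policy(state)
        state = world_model.step(state, action)
        states.append(state)
        actions.append(action)
    return torch.stack(states), torch.stack(actions)

wm, pi = WorldModel(), Policy()
states, actions = imagined_rollout(wm, pi, torch.zeros(32))
print(states.shape, actions.shape)  # torch.Size([51, 32]) torch.Size([50, 7])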
@jieneng_chen
Jieneng Chen
2 months
🤯 Think better visuals mean better world models? Think again. 💥 Surprise: Agents don’t need eye candy— they need wins. Meet World-in-World, the first open benchmark that ranks world models by closed-loop task success, not pixels. We uncover 3 shocks: 1️⃣ Visuals ≠ utility 2️⃣
2
40
144
@zitian_tang
ZitianTang
2 months
We are excited to present our work “How Can Objects Help Video-Language Understanding” at ICCV 2025 in Hawaii! We boost MLLMs’ spatiotemporal understanding with object-centric computer vision models. Come and visit our poster to chat about multimodal understanding! 🕚 Time:
0
2
4
@krea_ai
KREA AI
2 months
today we're open-sourcing Krea Realtime. this 14B autoregressive model is 10x larger than any open-source equivalent, and it can generate long-form videos at 11 fps on a single B200. weights and technical report below 👇
61
203
1K
@GillmanLab
Nate Gillman
2 months
congrats to the Veo team... love the added controllability!
@GoogleDeepMind
Google DeepMind
2 months
Veo is getting a major upgrade. 🚀 We’re rolling out Veo 3.1, our updated video generation model, alongside improved creative controls for filmmakers, storytellers, and developers - many of them with audio. 🧵
0
0
2
@GillmanLab
Nate Gillman
2 months
VERY nice paper from @SinaAlmd et al!! It's been well-established that moving weights in the direction where synthetic data wants them to go is bad for the model... so it's only natural that moving the weights in the opposite direction consistently helps the model. Excellent!!
@SinaAlmd
Sina Alemohammad
2 months
Synthetic data promised to shatter data scarcity barriers, but self-generated samples trigger catastrophic model collapse. We discovered the key is thinking in reverse: degradation from self-training isn't random noise—it's a powerful signal provably anti-aligned with the
0
0
4
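As a toy illustration of the idea in these two posts (compute the direction synthetic data would push the weights, then step the other way), here is a sketch. The names model, loss_fn, synthetic_batch, and reverse_lr are placeholders of mine; this is not the paper's actual algorithm.

# Toy sketch: treat the update that self-generated data would apply as a
# signal, and move the weights in the opposite direction instead.
import torch

def reverse_synthetic_step(model, loss_fn, synthetic_batch, reverse_lr=1e-4):
    inputs, targets = synthetic_batch
    loss = loss_fn(model(inputs), targets)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            # SGD on synthetic data would do p -= lr * g; here we reverse it.
            p += reverse_lr * g
    return loss.item()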
@sainingxie
Saining Xie
2 months
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
56
334
2K
@du_yilun
Yilun Du
2 months
Excited to share Equilibrium Matching (EqM)! EqM simplifies and outperforms flow matching, reaching an FID of 1.96 on ImageNet 256x256. EqM learns a single static EBM landscape for generation, enabling a simple gradient-based generation procedure.
20
173
1K
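The tweet doesn't spell out EqM's training objective, but the "static energy landscape plus gradient-based generation" idea can be sketched generically: descend a learned scalar energy from noise. EnergyNet, n_steps, and step_size below are placeholder names of mine, not the paper's procedure.

# Generic EBM-style sampler: start from noise and follow -dE/dx downhill.
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Maps a point x to a scalar energy E(x)."""
    def __init__(self, dim=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )
    def forward(self, x):
        return self.net(x).squeeze(-1)

def generate(energy, n_samples=64, dim=2, n_steps=200, step_size=0.05):
    """Plain gradient descent on the (static) energy landscape."""
    x = torch.randn(n_samples, dim)
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - step_size * grad
    return x.detach()

energy = EnergyNet()
samples = generate(energy)      # with an untrained net this is only a shape demo
print(samples.shape)            # torch.Size([64, 2])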
@randall_balestr
Randall Balestriero
2 months
Joint embeddings (JEPAs) and density estimation/generative models seem to be like oil and water. Yet, we prove how a good JEPA is also a good density estimator! And JEPAs achieve that without input space reconstruction, get p(x) from any pretrained model! https://t.co/VX94vKHlYK
9
49
353
@Jaeyeon_Kim_0
Jaeyeon (Jay) Kim
2 months
We introduce a new "rule" for understanding diffusion models: Selective Underfitting. It explains: 🚨 How diffusion models generalize beyond training data 🚨 Why popular training recipes (e.g., DiT, REPA) are effective and scale well Co-led with @kiwhansong0! (1/n)
8
65
420
@sherwinbahmani
Sherwin Bahmani
3 months
📢 Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation Got only one or a few images and wondering if recovering the 3D environment is a reconstruction or generation problem? Why not do it with a generative reconstruction model! We show that a
19
70
250
@GordonWetzstein
Gordon Wetzstein
3 months
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
22
98
587
@prime_cai
Shengqu Cai
3 months
Some random thoughts I've been having about video world model/long video generation since working on Mixture of Contexts (whose title could also be "Learnable Sparse Attention for Long Video Generation"): 🚨Semi-long Post Alert🚨 1. Learnable sparse attention is still underrated
@GordonWetzstein
Gordon Wetzstein
3 months
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
6
38
224
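As a generic sketch of the "learnable sparse attention" idea discussed here (not the Mixture of Contexts mechanism itself): route each query chunk to the top-k key/value chunks using a cheap mean-pooled score, then attend only within the selected chunks. All names below are my own.

# Top-k chunk-sparse attention sketch (assumes seq divisible by chunk).
import torch
import torch.nn.functional as F

def topk_chunk_attention(q, k, v, chunk=64, topk=4):
    """q, k, v: (seq, dim). Each query chunk attends to only topk key/value chunks."""
    seq, dim = q.shape
    n_chunks = seq // chunk
    qc = q.view(n_chunks, chunk, dim)
    kc = k.view(n_chunks, chunk, dim)
    vc = v.view(n_chunks, chunk, dim)

    # Cheap routing scores: mean query of each query chunk vs mean key of each key chunk.
    scores = qc.mean(dim=1) @ kc.mean(dim=1).T / dim ** 0.5   # (n_chunks, n_chunks)
    picked = scores.topk(topk, dim=-1).indices                # (n_chunks, topk)

    out = torch.zeros_like(qc)
    for i in range(n_chunks):
        keys = kc[picked[i]].reshape(-1, dim)                 # (topk*chunk, dim)
        vals = vc[picked[i]].reshape(-1, dim)
        attn = F.softmax(qc[i] @ keys.T / dim ** 0.5, dim=-1)
        out[i] = attn @ vals
    return out.view(-1, dim)

q = k = v = torch.randn(1024, 128)
print(topk_chunk_attention(q, k, v).shape)                    # torch.Size([1024, 128])

The per-chunk attention cost stays constant no matter how long the sequence grows, which is the appeal for minute-scale video generation.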
@xxunhuang
Xun Huang
3 months
Accepted by #NeurIPS2025 as a spotlight!
@xxunhuang
Xun Huang
6 months
Real-time video generation is finally real — without sacrificing quality. Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models. The key to high quality? Simulate the inference process during training by unrolling transformers with KV caching.
4
14
183
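The "simulate the inference process during training" idea can be sketched generically: unroll autoregressive generation with a KV cache, feeding each prediction back in while keeping gradients. The single-layer attention below is a simplified stand-in of my own, not the Self-Forcing architecture or loss.

# Generic sketch: unroll generation with a KV cache so each step only
# processes the newest frame token, while the whole rollout stays differentiable.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CachedSelfAttention(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.dim = dim

    def forward(self, x_new, cache):
        """x_new: (batch, 1, dim) newest frame token; cache holds past keys/values."""
        q, k, v = self.qkv(x_new).chunk(3, dim=-1)
        cache["k"] = k if cache["k"] is None else torch.cat([cache["k"], k], dim=1)
        cache["v"] = v if cache["v"] is None else torch.cat([cache["v"], v], dim=1)
        attn = F.softmax(q @ cache["k"].transpose(1, 2) / self.dim ** 0.5, dim=-1)
        return self.out(attn @ cache["v"])

def unrolled_generation(model, first_frame, n_frames=16):
    """Simulate inference inside the training loop, gradients included."""
    cache = {"k": None, "v": None}
    frames, x = [first_frame], first_frame
    for _ in range(n_frames - 1):
        x = model(x, cache)             # next-frame prediction from cached context
        frames.append(x)
    return torch.cat(frames, dim=1)     # (batch, n_frames, dim)

model = CachedSelfAttention()
video = unrolled_generation(model, torch.randn(2, 1, 64))
print(video.shape)                      # torch.Size([2, 16, 64])
# In a training step, the unrolled `video` could then be scored by some loss
# and backpropagated through the cached rollout.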