Yunzhi Zhang Profile
Yunzhi Zhang

@zhang_yunzhi

Followers: 2K
Following: 584
Media: 20
Statuses: 66

CS PhD @Stanford

Stanford, CA
Joined December 2017
@zhang_yunzhi
Yunzhi Zhang
6 hours
Nicely put @_kevinlu. A fruitful path: Not just getting offline human experience snapshots, but also Internet-scale samples of MDPs and deploying AI models to interact within. We are blessed that supervised pre-training warm-starts the system for better-informed exploration.
@_kevinlu
Kevin Lu
10 hours
Why you should stop working on RL research and instead work on product. The technology that unlocked the big scaling shift in AI is the internet, not transformers. I think it's well known that data is the most important thing in AI, and also that researchers choose not to work
0
0
24
@zhang_yunzhi
Yunzhi Zhang
25 days
(5/5) Page: More details in paper: Teamwork with the incredible Carson Murtuza-Lanier, @zizhang_li, @du_yilun, and @jiajunwu_cs!
1
1
13
@zhang_yunzhi
Yunzhi Zhang
25 days
(4/n) PoE sampling is non-trivial in high dimensions. We adopt Annealed Importance Sampling, where particles are initially drawn from a simple base distribution and steered towards the target, with transition kernels computed from expert models. Two possible annealing paths:
[image]
1
1
9
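To make the sampling idea in (4/n) concrete, here is a minimal sketch of Annealed Importance Sampling on a toy one-dimensional product of experts. Everything in it is an illustrative assumption rather than the thread's actual method: the two Gaussian "experts", the wide Gaussian base distribution, the geometric annealing path between base and product, and the random-walk Metropolis-Hastings move (the thread says its transition kernels are computed from the expert models, which this toy does not attempt). All function and constant names are hypothetical.

```python
import math
import random

# Hypothetical toy experts: (mean, std) of two 1D Gaussians.
EXPERTS = [(-1.0, 1.0), (2.0, 0.5)]
# Assumed simple base distribution: a wide Gaussian that is easy to sample.
BASE = (0.0, 3.0)


def log_gauss(x, mean, std):
    """Log density of a 1D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def log_base(x):
    return log_gauss(x, *BASE)


def log_product(x):
    """Unnormalized log density of the product of experts."""
    return sum(log_gauss(x, m, s) for m, s in EXPERTS)


def log_annealed(x, beta):
    """Geometric path: base at beta=0, product of experts at beta=1."""
    return (1.0 - beta) * log_base(x) + beta * log_product(x)


def ais(n_particles=2000, n_steps=50, step_size=0.5, seed=0):
    """Return particles and their log importance weights."""
    rng = random.Random(seed)
    betas = [t / n_steps for t in range(n_steps + 1)]
    xs = [rng.gauss(*BASE) for _ in range(n_particles)]
    log_ws = [0.0] * n_particles
    for b_prev, b in zip(betas, betas[1:]):
        for i, x in enumerate(xs):
            # Importance weight update: log p_b(x) - log p_b_prev(x).
            log_ws[i] += (b - b_prev) * (log_product(x) - log_base(x))
            # One random-walk Metropolis-Hastings step targeting p_b.
            prop = x + rng.gauss(0.0, step_size)
            log_u = math.log(1.0 - rng.random())  # argument lies in (0, 1]
            if log_u < log_annealed(prop, b) - log_annealed(x, b):
                xs[i] = prop
    return xs, log_ws


def weighted_mean(xs, log_ws):
    """Self-normalized importance-weighted mean of the particles."""
    m = max(log_ws)
    ws = [math.exp(lw - m) for lw in log_ws]
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)


if __name__ == "__main__":
    xs, log_ws = ais()
    print(round(weighted_mean(xs, log_ws), 2))
```

For these two toy experts the product is itself Gaussian with mean (−1/1 + 2/0.25) / (1/1 + 1/0.25) = 1.4, so the weighted particle mean should land close to 1.4. In high dimensions the same loop applies, but the choice of annealing path and transition kernel matters far more than in this toy.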
@zhang_yunzhi
Yunzhi Zhang
25 days
(3/n) …inserting graphics engine rendering into images, and more.
1
1
5
@zhang_yunzhi
Yunzhi Zhang
25 days
(2/n) The composition yields better controllability and provides flexible user interfaces for specifying visual synthesis goals, enabling applications such as composing physics simulation into generated videos…
1
1
8
@zhang_yunzhi
Yunzhi Zhang
25 days
(1/n) Time to unify your favorite visual generative models, VLMs, and simulators for controllable visual generation—Introducing a Product of Experts (PoE) framework for inference-time knowledge composition from heterogeneous models.
5
63
303
@zhang_yunzhi
Yunzhi Zhang
27 days
After session starts! @jesu9 leading discussions on concept recognition with VLMs.
@zhang_yunzhi
Yunzhi Zhang
27 days
Happening now in Room 101A! Daniel Ritchie opening up with programmatic visual concept representations. #CVPR2025
0
0
7
@zhang_yunzhi
Yunzhi Zhang
27 days
RT @flycooler_zd: 🚀 Excited to announce our CVPR 2025 Workshop: 3D Digital Twin: Progress, Challenges, and Future Directions 🗓 June 12,…
0
21
0
@zhang_yunzhi
Yunzhi Zhang
3 months
The submission deadline for the Workshop on Visual Concepts @CVPR is extended to April 15. #CVPR2025. As visual generative and perception modeling rapidly evolve, it's a great time to join us (and an incredible speaker lineup!) for discussions. More info:
2
8
61
@zhang_yunzhi
Yunzhi Zhang
4 months
RT @GordonWetzstein: State-of-the-art zero-shot customized image generation by @prime_cai, Eric Chan, @zhang_yunzhi, Leo Guibas, @jiajunwu_…
0
10
0
@zhang_yunzhi
Yunzhi Zhang
6 months
RT @joycjhsu: Excited to bring back the 2nd Workshop on Visual Concepts at @CVPR 2025, this time with a call for papers! We welcome submis…
0
23
0
@zhang_yunzhi
Yunzhi Zhang
7 months
New work on relightable 4D (:=3D + temporal) asset generation led by @gengchen01!
@gengchen01
Chen Geng
7 months
Ever wondered how roses grow and wither in your backyard? 🌹 Our latest work on generating 4D temporal object intrinsics lets you explore a rose's entire lifecycle, from birth to death, under any environment light, from any viewpoint, at any moment. Project page:
0
0
44
@zhang_yunzhi
Yunzhi Zhang
7 months
Two keys in our recipe for text+image-prompted image generation: data from self-distillation (critical w/ limited real data); an architecture casting image-to-image tasks as video frame synthesis, effectively injecting image controls into FLUX. Work led by the fantastic @prime_cai!
@prime_cai
Shengqu Cai
7 months
Sharing something exciting we've been working on as a Thanksgiving gift: Diffusion Self-Distillation (DSD), which redefines zero-shot customized image generation using FLUX. DSD is like DreamBooth, but zero-shot/training-free. It works across any input subject and desired
2
5
58
@zhang_yunzhi
Yunzhi Zhang
7 months
This is a great opportunity to join @elliottszwu's new group! Working with Elliott has always been inspiring and fun. He brings incredible insights and depth to research. Excited to see what the lab will bring!
@elliottszwu
Elliott / Shangzhe Wu
7 months
I'm building a new research lab @Cambridge_Eng focusing on 4D computer vision and generative models. Interested in joining us as a PhD student? Apply to the Engineering program by Dec 3 🗓️ ChatGPT's "portrait of my current life" 👇
0
1
37
@zhang_yunzhi
Yunzhi Zhang
8 months
Anecdotally, experimenting with the new Claude 3.5 model showed improved generation quality compared to its previous version, which suggests promise for a path where structural visual representation inference scales naturally with end-to-end models. Paper🔎: (3/3)
0
0
6
@zhang_yunzhi
Yunzhi Zhang
8 months
Visual scenes exhibit rich compositional structure. Our representation mirrors this organization via program function dependencies and provides a "machine interface" for LMs to decompose tasks following scene structures (e.g., full Sudoku board → single number placement). (2/3)
1
0
3
@zhang_yunzhi
Yunzhi Zhang
8 months
Get a peek into some (surprisingly accurate) 3D scenes generated via the Scene Language—our new scene representation! Now powered by the new Claude 3.5 Sonnet. More examples in our released repo: Why is the representation effective? (1/3)
@zhang_yunzhi
Yunzhi Zhang
9 months
Accurate and controllable scene generation has been difficult with natural language alone. You instead need a language for scenes. Introducing the Scene Language — a visual representation for high-quality 3D/4D generation by integrating programs, words, and embeddings — 🧵(1/6)
4
18
124
@zhang_yunzhi
Yunzhi Zhang
9 months
More results on project page: Paper: This is a very fun collaboration with the wonderful @zizhang_li, @Mattzh1314, @elliottszwu, and @jiajunwu_cs. (6/6)
1
3
30
@zhang_yunzhi
Yunzhi Zhang
9 months
The representation is not tied to one specific renderer; instead, it can be consumed by renderers ranging from end-to-end, neural generative models to traditional graphics engines. (5/6)
1
0
11