Yunzhi Zhang @zhang_yunzhi X Profile

Yunzhi Zhang

@zhang_yunzhi

Followers

2K

Following

584

Media

20

Statuses

66

CS PhD @Stanford

Stanford, CA

Joined December 2017

Yunzhi Zhang

@zhang_yunzhi

6 hours

Nicely put @_kevinlu. A fruitful path: Not just getting offline human experience snapshots, but also Internet-scale samples of MDPs and deploying AI models to interact within. We are blessed that supervised pre-training warm-starts the system for better-informed exploration.

Kevin Lu

@_kevinlu

10 hours

Why you should stop working on RL research and instead work on product. The technology that unlocked the big scaling shift in AI is the internet, not transformers. I think it's well known that data is the most important thing in AI, and also that researchers choose not to work

0

24

Yunzhi Zhang

@zhang_yunzhi

25 days

(5/5) Page: More details in paper: Teamwork with the incredible Carson Murtuza-Lanier, @zizhang_li, @du_yilun, and @jiajunwu_cs!

1

13

Yunzhi Zhang

@zhang_yunzhi

25 days

(4/n) PoE sampling is non-trivial in high dimensions. We adopt Annealed Importance Sampling, where particles are initially drawn from a simple base distribution and steered towards the target, with transition kernels computed from expert models. Two possible annealing paths:

1

9
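A toy illustration of the Annealed Importance Sampling idea in the (4/n) post above, offered as a minimal sketch and not the paper's implementation. Everything here is an assumption for illustration: the experts are 1D Gaussians, the base is a wide Gaussian, the annealing path is a geometric interpolation between base and target, and a random-walk Metropolis step stands in for the transition kernels that the post says are computed from expert models. The function names `ais_product_of_experts` and `weighted_mean` are hypothetical.

```python
import math
import random


def log_normal(x, mu, sigma):
    """Log-density of a 1D Gaussian N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def ais_product_of_experts(experts, n_particles=1000, n_steps=100,
                           base_sigma=5.0, step_size=0.5, seed=0):
    """Toy AIS targeting the unnormalized product of 1D Gaussian experts.

    experts: list of (mu, sigma) pairs; all values are illustrative.
    Returns the final particles and their log importance weights.
    """
    rng = random.Random(seed)

    def log_base(x):
        return log_normal(x, 0.0, base_sigma)

    def log_target(x):
        # A product of experts is a sum of expert log-densities (up to a constant).
        return sum(log_normal(x, mu, s) for mu, s in experts)

    def log_annealed(x, beta):
        # Assumed geometric path: base at beta=0, product of experts at beta=1.
        return (1.0 - beta) * log_base(x) + beta * log_target(x)

    betas = [t / n_steps for t in range(n_steps + 1)]
    particles, log_weights = [], []
    for _ in range(n_particles):
        x = rng.gauss(0.0, base_sigma)  # draw from the simple base distribution
        log_w = 0.0
        for beta_prev, beta in zip(betas[:-1], betas[1:]):
            # Importance-weight update for moving from beta_prev to beta.
            log_w += (beta - beta_prev) * (log_target(x) - log_base(x))
            # One Metropolis step that leaves the beta-annealed density invariant.
            proposal = x + rng.gauss(0.0, step_size)
            log_accept = log_annealed(proposal, beta) - log_annealed(x, beta)
            if math.log(1.0 - rng.random()) < log_accept:
                x = proposal
        particles.append(x)
        log_weights.append(log_w)
    return particles, log_weights


def weighted_mean(particles, log_weights):
    """Self-normalized importance-weighted mean, computed stably in log space."""
    m = max(log_weights)
    weights = [math.exp(lw - m) for lw in log_weights]
    return sum(w * x for w, x in zip(weights, particles)) / sum(weights)


if __name__ == "__main__":
    experts = [(-1.0, 1.0), (2.0, 0.5)]
    xs, lws = ais_product_of_experts(experts)
    # For these two experts the closed-form product has mean 1.4; compare against it.
    print(weighted_mean(xs, lws))
```

For Gaussian experts the product is itself Gaussian, so the weighted estimate can be checked against the closed-form mean (here, precision-weighted: (-1/1 + 2/0.25) / (1/1 + 1/0.25) = 1.4). In high dimensions with learned experts no such closed form exists, which is where the choice of annealing path and transition kernel starts to matter.
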

Yunzhi Zhang

@zhang_yunzhi

25 days

(3/n) …inserting graphics engine rendering into images, and more.

1

5

Yunzhi Zhang

@zhang_yunzhi

25 days

(2/n) The composition yields better controllability and provides flexible user interfaces for specifying visual synthesis goals, enabling applications such as composing physics simulation into generated videos…

1

8

Yunzhi Zhang

@zhang_yunzhi

25 days

(1/n) Time to unify your favorite visual generative models, VLMs, and simulators for controllable visual generation—Introducing a Product of Experts (PoE) framework for inference-time knowledge composition from heterogeneous models.

5

63

303

Yunzhi Zhang

@zhang_yunzhi

27 days

After session starts! @jesu9 leading discussions on concept recognition with VLMs.

Yunzhi Zhang

@zhang_yunzhi

27 days

Happening now in Room 101A! Daniel Ritchie opening up with programmatic visual concept representations. #CVPR2025

0

7

Yunzhi Zhang

@zhang_yunzhi

27 days

RT @flycooler_zd: 🚀 Excited to announce our CVPR 2025 Workshop: 3D Digital Twin: Progress, Challenges, and Future Directions. 🗓 June 12,…

0

21

0

Yunzhi Zhang

@zhang_yunzhi

27 days

Happening now in Room 101A! Daniel Ritchie opening up with programmatic visual concept representations. #CVPR2025

0

1

25

Yunzhi Zhang

@zhang_yunzhi

3 months

The submission deadline for the Workshop on Visual Concepts @CVPR is extended to April 15. #CVPR2025. As visual generative and perception modeling rapidly evolve, it's a great time to join us (and an incredible speaker lineup!) for discussions. More info:

2

8

61

Yunzhi Zhang

@zhang_yunzhi

4 months

RT @GordonWetzstein: State-of-the-art zero-shot customized image generation by @prime_cai, Eric Chan, @zhang_yunzhi, Leo Guibas, @jiajunwu_…

0

10

0

Yunzhi Zhang

@zhang_yunzhi

6 months

RT @joycjhsu: Excited to bring back the 2nd Workshop on Visual Concepts at @CVPR 2025, this time with a call for papers! We welcome submis…

0

23

0

Yunzhi Zhang

@zhang_yunzhi

7 months

New work on relightable 4D (:=3D + temporal) asset generation led by @gengchen01!

Chen Geng

@gengchen01

7 months

Ever wondered how roses grow and wither in your backyard? 🌹 Our latest work on generating 4D temporal object intrinsics lets you explore a rose's entire lifecycle, from birth to death, under any environment light, from any viewpoint, at any moment. Project page:

0

44

Yunzhi Zhang

@zhang_yunzhi

7 months

Two keys in our recipe for text+image-prompted image generation: data from self-distillation (critical w/ limited real data); an architecture casting image-to-image tasks as video frame synthesis, effectively injecting image controls into FLUX. Work led by the fantastic @prime_cai!

Shengqu Cai

@prime_cai

7 months

Sharing something exciting we've been working on as a Thanksgiving gift: Diffusion Self-Distillation (DSD), which redefines zero-shot customized image generation using FLUX. DSD is like DreamBooth, but zero-shot/training-free. It works across any input subject and desired

2

5

58

Yunzhi Zhang

@zhang_yunzhi

7 months

This is a great opportunity to join @elliottszwu's new group! Working with Elliott has always been inspiring and fun. He brings incredible insights and depth to research. Excited to see what the lab will bring!

Elliott / Shangzhe Wu

@elliottszwu

7 months

I'm building a new research lab @Cambridge_Eng focusing on 4D computer vision and generative models. Interested in joining us as a PhD student? Apply to the Engineering program by Dec 3 🗓️ ChatGPT's "portrait of my current life" 👇

0

1

37

Yunzhi Zhang

@zhang_yunzhi

8 months

Anecdotally, experimenting with the new Claude 3.5 model showed improved generation quality compared to its previous version, showing promise in a path where structural visual representation inference scales naturally with end-to-end models. Paper🔎: (3/3)

0

6

Yunzhi Zhang

@zhang_yunzhi

8 months

Visual scenes exhibit rich compositional structure. Our representation mirrors this organization via program function dependencies and provides a "machine interface" for LMs to decompose tasks following scene structures (e.g., full Sudoku board → single number placement). (2/3)

1

0

3

Yunzhi Zhang

@zhang_yunzhi

8 months

Get a peek into some (surprisingly accurate) 3D scenes generated via the Scene Language—our new scene representation! Now powered by the new Claude 3.5 Sonnet. More examples in our released repo: Why is the representation effective? (1/3)

Yunzhi Zhang

@zhang_yunzhi

9 months

Accurate and controllable scene generation has been difficult with natural language alone. You instead need a language for scenes. Introducing the Scene Language — a visual representation for high-quality 3D/4D generation by integrating programs, words, and embeddings — 🧵(1/6)

4

18

124

Yunzhi Zhang

@zhang_yunzhi

9 months

More results on project page: Paper: This is a very fun collaboration with the wonderful @zizhang_li, @Mattzh1314, @elliottszwu, and @jiajunwu_cs. (6/6)

1

3

30

Yunzhi Zhang

@zhang_yunzhi

9 months

The representation is not tied to one specific renderer; instead, it can be consumed by renderers ranging from end-to-end, neural generative models to traditional graphics engines. (5/6)

1

0

11