Haven (Haiwen) Feng @HavenFeng X Profile

Haven (Haiwen) Feng

@HavenFeng

Followers

1K

Following

2K

Media

12

Statuses

155

PhD student @MPI_IS, visiting @berkeley_ai now. Interested in machine learning, computer vision, computer graphics, and how to understand the physical world.

Germany

Joined October 2021


Haven (Haiwen) Feng

@HavenFeng

23 days

🚀 Introducing GenLit – Reformulating Single-Image Relighting as Video Generation! We leverage video diffusion models to perform realistic near-field relighting from just a single image—No explicit 3D reconstruction or ray tracing required! No intermediate graphics buffers,

3

20

108

Haven (Haiwen) Feng

@HavenFeng

3 days

See you in Hawaii🖖🏝️.

Junyi Zhang

@junyi42

3 months

Introducing St4RTrack! 🖖 Simultaneous 4D Reconstruction and Tracking in the world coordinate feed-forwardly, just by changing the meaning of two pointmaps!

3

54

Haven (Haiwen) Feng

@HavenFeng

4 days

RT @xiuyu_l: Sparsity can make your LoRA fine-tuning go brrr 💨 Announcing SparseLoRA (ICML 2025): up to 1.6-1.9x faster LLM fine-tuning (2…

0

57

0

Haven (Haiwen) Feng

@HavenFeng

18 days

RT @seohong_park: Q-learning is not yet scalable. I wrote a blog post about my thoughts on scalable RL algorithms. …

0

186

0

Haven (Haiwen) Feng

@HavenFeng

23 days

🧠 Takeaway: Our approach demonstrates the untapped potential of foundation models: they inherently understand enough physics to tackle complex graphics tasks, bridging generative AI with practical applications in AR, product visualization, and creative editing. Kudos to our

0

4

Haven (Haiwen) Feng

@HavenFeng

23 days

📈 Results? GenLit outperforms prior state-of-the-art methods across synthetic and real benchmarks, including the challenging MIT Multi-Illumination dataset, with significant gains in visual realism (PSNR, LPIPS, SSIM)! 🧵 4/5

1

0

3

Haven (Haiwen) Feng

@HavenFeng

23 days

💡 Key insight: Isn't the relighting problem essentially light in motion, interacting with the scene? So why not directly model this as video generation?! We achieve the near-field relighting without 3D by fine-tuning an image-to-video model on a tiny but physical synthetic

1

0

4

Haven (Haiwen) Feng

@HavenFeng

23 days

🎬 GenLit introduces single-image near-field relighting for the first time, meaning our added point lights can freely move within or out of the camera frame. Think of a tiny, glowing fairy light like "Tinker Bell" gracefully illuminating objects and casting realistic shadows! No

1

0

3

Haven (Haiwen) Feng

@HavenFeng

24 days

🚨 Come see InterDyn at #CVPR2025! We're showing how video generative models can simulate physical interactions without an explicit simulator! 🌍🎬 📌 Poster #173. 🗓️ Saturday, June 14 | 🕥 10:30–12:30. 📍 ExHall D. 🎤 Also catch Rick’s spotlight talk at Agents-in-Interaction.

Haven (Haiwen) Feng

@HavenFeng

4 months

🚀 Introducing InterDyn — our newly accepted CVPR work that explores controllable synthesis of interactive dynamics! Building upon powerful video diffusion models, InterDyn infers future motion and interactions directly from an input image and a dynamic control signal (e.g., a

0

2

33

Haven (Haiwen) Feng

@HavenFeng

28 days

RT @graceluo_: ✨New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation*, at infe…

0

176

0

Haven (Haiwen) Feng

@HavenFeng

1 month

RT @sainingxie: Indeed. For text-to-image, @xichen_pan had a great summary supporting this decoupled design philosophy: "Render unto diffus…

0

35

0

Haven (Haiwen) Feng

@HavenFeng

2 months

So now we can collect robotics data without teleop??!

Max Fu

@letian_fu

2 months

Tired of teleoperating your robots? We built a way to scale robot datasets without teleop, dynamic simulation, or even robot hardware. Just one smartphone scan + one human hand demo video → thousands of diverse robot trajectories. Trainable by diffusion policy and VLA models

0

6

Haven (Haiwen) Feng

@HavenFeng

2 months

RT @ChungMinKim: Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting. with an…

0

165

0

Haven (Haiwen) Feng

@HavenFeng

2 months

RT @arthurallshire: our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and si…

0

112

0

Haven (Haiwen) Feng

@HavenFeng

2 months

It’s happening: Peter is presenting our IG-LLM at #ICLR2025. Can we consider the decades-long inverse graphics problem as a code generation task? 🤔 Come by poster #073 now to chat with us!

0

1

13

Haven (Haiwen) Feng

@HavenFeng

2 months

I will be presenting our SGP-Bench with @ItsTheZhen at #ICLR2025 🚀 Sat, 26th, 3:00–5:30 PM Singapore Time. Hall 3 + Hall 2B, Poster #569. Can LLMs 'see' images directly via graphics code?! 🧠🖼️ Come by our poster and let's chat!

Weiyang Liu

@Besteuler

11 months

🚀 Excited to introduce our new work: SGP-Bench! Can Large Language Models (LLMs) understand symbolic graphics programs? 🖥️ Imagine giving a model a symbolic graphics program like SVG or CAD and asking it to answer questions about the visual content without actually seeing the

0

5

18

Haven (Haiwen) Feng

@HavenFeng

2 months

RT @RickyTQChen: This ICLR is the best conference ever. Attendees are extremely friendly and cuddly. What do you mean this is the wrong…

0

27

0

Haven (Haiwen) Feng

@HavenFeng

3 months

Kudos to the amazing team with @junyi42 (co-lead), @QianqianWang5, @yufei_ye, @pengcheng_147, @Michael_J_Black, @trevordarrell, and @akanazawa 🖖

0

4

Haven (Haiwen) Feng

@HavenFeng

3 months

Given two images I_i and I_j, instead of independently reconstructing each geometry like DUSt3R or MonST3R, one only needs two properly defined point maps (essentially reconstructing I_j’s 3D moment in I_i’s coordinates), and this then yields "Simultaneous 4D reconstruction and Tracking in

1

5

Haven (Haiwen) Feng

@HavenFeng

3 months

At our core, we wonder: What is the minimal representation required to fully model the 4D world? Could it be done in a unified framework? It turns out to be remarkably simple! 🧵3/4

1

0

5