Junchen Liu
@JunchenLiu77
Followers: 295
Following: 176
Media: 4
Statuses: 46
PhD student @UofT @VectorInst.
Toronto
Joined September 2022
1/ #NVIDIAGTC We’re excited to share that the ChronoEdit-14B model and 8-step Distillation LoRA (4s/image on H100) are released today. 🤗 Model https://t.co/X3diGAY42p 🤗 Demo https://t.co/2xfiRo6wij 💡ChronoEdit brings temporal reasoning to the image editing task. It achieves SOTA
5
35
106
Check out our latest work on Gaussian Splatting for LiDAR with 3DGUT!
[1/N] Excited to introduce "SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms." We extend 3DGUT with LiDAR support and render a wide range of sensors 10-20x faster than ray tracing and 1.5-10x faster than prior rasterization work. https://t.co/Q9J2T5cjLj
2
30
381
Please check out this tool! It enables easy generation of dynamic G-buffers and data under various lighting conditions. It’s used for data generation in DiffusionRenderer, UniRelight, and LuxDiT. 🚀
Just dropped a Blender-based data generation tool that can be used to render randomly composed synthetic scenes with all G-Buffer attributes. 😋
1
9
76
Excited to share what I’ve been working on since joining @xai — Grok Imagine v0.9. You can try it at https://t.co/lhQN6rSpAr . Looking back on the past two months, a few lessons really stuck with me: (1) have faith in scaling (cautiously); (2) solid, well-tracked
Introducing Imagine v0.9, our new video generation model with massive upgrades from v0.1 in visual quality, motion, audio generation, and more. Now available for free on all our products: https://t.co/2DPEzEZ03e
25
24
307
Zero-shot video reasoning (chain-of-frames) isn’t just for Veo3 — open-source models can understand and edit too! 🕹️ ChronoEdit brings temporal reasoning to image editing. 🔗 https://t.co/6pyTDfzmGH
🕹️We are excited to introduce "ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation" ChronoEdit reframes image editing as a video generation task to encourage temporal consistency. It leverages a temporal reasoning stage that denoises with “video
1
18
125
🕹️We are excited to introduce "ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation" ChronoEdit reframes image editing as a video generation task to encourage temporal consistency. It leverages a temporal reasoning stage that denoises with “video
6
37
140
🎉 Excited to share our latest work in streaming long video generation—say hi to #RollingForcing! This tool lets you create multi-minute videos in real-time, with minimal error accumulation. We’re fired up to think it could be a fundamental component of interactive #WorldModel
2
3
21
📢 SceneComp @ ICCV 2025 🏝️ 🌎 Generative Scene Completion for Immersive Worlds 🛠️ Reconstruct what you know AND 🪄 Generate what you don’t! 🙌 Meet our speakers @angelaqdai, @holynski_, @jampani_varun, @ZGojcic @taiyasaki, Peter Kontschieder https://t.co/LvONYIK3dz
#ICCV2025
2
17
53
📢 Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation Got only one or a few images and wondering if recovering the 3D environment is a reconstruction or generation problem? Why not do it with a generative reconstruction model! We show that a
19
71
251
Should robots have eyeballs? Human eyes move constantly and use variable resolution to actively gather visual details. In EyeRobot (https://t.co/iSL7ZLZcHu) we train a robot eyeball entirely with RL: eye movements emerge from experience driven by task-driven rewards.
8
56
272
Nvidia just released Lyra on Hugging Face Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation TL;DR: Feed-forward 3D and 4D scene generation from a single image/video trained with synthetic data generated by a camera-controlled video diffusion model
17
96
547
It’s live! 🎉 🗺️ It was very fun working with @Nik__V__ and our team @Meta for this release. I’m excited to see how the community uses it. 😃
Meet MapAnything – a transformer that directly regresses factored metric 3D scene geometry (from images, calibration, poses, or depth) in an end-to-end way. No pipelines, no extra stages. Just 3D geometry & cameras, straight from any type of input, delivering new state-of-the-art
0
3
31
💡 Introducing LuxDiT: a diffusion transformer (DiT) that estimates realistic scene lighting from a single image or video. It produces accurate HDR environment maps, addressing a long-standing challenge in computer vision. 🔗Paper: https://t.co/6cW6WlREBl
3
58
275
Every lens leaves a blur signature—a hidden fingerprint in every photo. In our new #TPAMI paper, we show how to learn it fast (5 mins of capture!) with Lens Blur Fields ✨ With it, we can tell apart ‘identical’ phones by their optics, deblur images, and render realistic blurs.
157
711
7K
3D annotation has never been easier!
[1/N] 🎥 We've made available a powerful spatial AI tool named ViPE: Video Pose Engine, to recover camera motion, intrinsics, and dense metric depth from casual videos! Running at 3–5 FPS, ViPE handles cinematic shots, dashcams, and even 360° panoramas. 🔗 https://t.co/1mGDxwgYJt
0
0
4
Genie3 is like magic! Curious about the best way to add a viewpoint conditioning signal to a transformer? Check this out 👉
liruilong.cn
We introduce PRoPE, a method for conditioning image tokens based on corresponding camera parameters in transformers for multiview vision tasks.
Another one. Already a powerful painting, but moving around it yourself gives a totally different feeling. Jacques Louis David's "The Death of Socrates" => #Genie3
1
4
66
A real-time interactive model with almost perfect 3D consistency and long-term memory!!
Another one. Already a powerful painting, but moving around it yourself gives a totally different feeling. Jacques Louis David's "The Death of Socrates" => #Genie3
0
0
1
"Cameras as Relative Positional Encoding" TLDR: comparison for conditioning transformers on cameras: token-level raymap, attention-level relative pose encodings, a (new) relative encoding Projective Positional Encoding -> camera frustums, (in|ex)trinsics for relative pos encoding
2
51
465
Excited to share Flow Matching Policy Gradients: expressive RL policies trained from rewards using flow matching. It’s an easy, drop-in replacement for Gaussian PPO on control tasks.
8
206
1K