Haoru Xue

@HaoruXue

Followers
2K
Following
725
Media
46
Statuses
203

PhD @berkeley_ai | intern @ NVIDIA GEAR | prev. @CMU_Robotics @LeCARLab | Robot Learning, Humanoids

San Francisco Bay Area
Joined December 2023
@HaoruXue
Haoru Xue
12 days
🚀 Introducing LeVERB, the first 𝗹𝗮𝘁𝗲𝗻𝘁 𝘄𝗵𝗼𝗹𝗲-𝗯𝗼𝗱𝘆 𝗵𝘂𝗺𝗮𝗻𝗼𝗶𝗱 𝗩𝗟𝗔 (upper- & lower-body), trained on sim data and zero-shot deployed. Addressing interactive tasks: navigation, sitting, locomotion with verbal instruction. 🧵.
12
102
437
@HaoruXue
Haoru Xue
11 days
RT @TheHumanoidHub: LeVERB is a VLA framework for humanoid whole-body control, combining a vision-language model and a low-level controller….
0
25
0
@HaoruXue
Haoru Xue
11 days
RT @GuanyaShi: Executable action vocabulary naturally exists for manipulation VLA (e.g., end-effector pose). For building humanoid VLA, thi….
0
5
0
@HaoruXue
Haoru Xue
12 days
RT @_wenlixiao: "What cannot be measured cannot be managed. We first create LeVERB-Bench, a photorealistic, whole-body vision-language benc….
0
7
0
@HaoruXue
Haoru Xue
12 days
RT @TairanHe99: Simulation could give you so much more than you think before you do real-world teleop. Check out @HaoruXue's latest work on….
0
11
0
@HaoruXue
Haoru Xue
12 days
RT @liangpan_t: Latents serve as the interface between System 1 and System 2, rather than relying on explicit kinematic motions. This parad….
0
3
0
@HaoruXue
Haoru Xue
12 days
(9/9) Dive deeper 👉. Collaborators: @x_h_ucb @Dantong_Niu @qiayuanliao @tjomiii Jan Tommy Gravdahl @xbpeng4 @GuanyaShi @trevordarrell @KoushilSreenath Shankar Sastry.
0
0
14
@HaoruXue
Haoru Xue
12 days
(8/9) Current status: dynamics-level sim2real ✔️; vision sim2real to be released. • LeVERB-Bench dataset is already open-sourced in LeRobot format. • Full code release is coming.
1
0
8
@HaoruXue
Haoru Xue
12 days
(7/9) Related inspiration: Helix, NaVILA, LangWBC, VBC, … each pushes latent or hierarchical VLA in its own niche (upper-body, legged nav, etc.). LeVERB adds whole body latent control plus an open benchmark for everyone to build on.
1
0
15
@HaoruXue
Haoru Xue
12 days
(6/9) Generalization: LeVERB sees “take a seat”, “sit down”, or “sit on blue chair” and knows they mean the same thing. It also reasons about space: if the chair is in front, it turns first, then sits.
1
2
12
@HaoruXue
Haoru Xue
12 days
(5/9) LeVERB achieves an 80% zero-shot success rate on simple visual navigation tasks, and 58.5% overall. This is 7.8 times better than a naive hierarchical VLA implementation with no latent regularization, highlighting the unique challenge posed by the asynchronous, decoupled WBC loop.
1
0
8
@HaoruXue
Haoru Xue
12 days
(4/9) Training data is the real bottleneck, so we built LeVERB-Bench: 154+ photorealistic sim scenes with heavy randomization—lighting, textures, clutter, camera angles. This diversity is what lets LeVERB generalize. Tasks: visual navigation, sitting, reaching, locomotion, etc.
1
2
13
@HaoruXue
Haoru Xue
12 days
(3/9) Two-system brain: System 2 thinks at 10 Hz (vision + language). System 1 reacts at 50 Hz (balance + contacts). Slow reasoning + fast reflexes = stable, expressive whole-body control.
1
0
14
@HaoruXue
Haoru Xue
12 days
(2/9) LeVERB instead learns a latent action space, shout out to PULSE and MaskedMimic. The high-level VL model outputs a latent “verb”, and the low-level controller decodes it into joint motion (the two are trained separately). Result: a far richer, more expressive skill set.
1
0
14
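A minimal sketch of the two-rate idea from (3/9) and the latent interface from (2/9), under loud assumptions: `run_two_rate_loop`, `toy_high_level`, and `toy_low_level` are hypothetical names invented for illustration, the "latent" is just a list of floats, and the loop is synchronous, whereas the thread describes an asynchronous, decoupled loop. It only shows how a 50 Hz controller can keep decoding the most recent latent produced at 10 Hz; it is not the LeVERB implementation.

```python
from typing import Callable, List, Tuple

Latent = List[float]


def run_two_rate_loop(
    high_level: Callable[[int], Latent],
    low_level: Callable[[Latent, int], List[float]],
    duration_s: float = 1.0,
    fast_hz: int = 50,   # "System 1" rate from the thread
    slow_hz: int = 10,   # "System 2" rate from the thread
) -> Tuple[int, List[List[float]]]:
    """Step a fast controller, refreshing the latent only on slow ticks.

    Returns (number of high-level calls, list of low-level commands).
    """
    steps_per_latent = fast_hz // slow_hz  # 5 fast steps per latent update
    latent: Latent = []
    slow_calls = 0
    commands: List[List[float]] = []
    for step in range(int(duration_s * fast_hz)):
        if step % steps_per_latent == 0:
            # Slow path: a stand-in for the vision-language model.
            latent = high_level(step)
            slow_calls += 1
        # Fast path: decode the latest latent into a (toy) joint command.
        commands.append(low_level(latent, step))
    return slow_calls, commands


def toy_high_level(step: int) -> Latent:
    # Hypothetical stand-in: encodes the step index as a 2-D latent.
    return [float(step), 0.0]


def toy_low_level(latent: Latent, step: int) -> List[float]:
    # Hypothetical stand-in: scales the latent into a "joint command".
    return [latent[0] * 0.5, latent[1]]


slow_calls, commands = run_two_rate_loop(toy_high_level, toy_low_level)
print(slow_calls, len(commands))
```

Between slow ticks the fast controller reuses a stale latent, which is one reason the interface between the two systems matters in a decoupled design.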
@HaoruXue
Haoru Xue
12 days
(1/9) Old approaches made humanoid robots follow hand-crafted action commands (like setting a walking speed or an arm pose) from the language module. This limited them to a small, predefined skill set and made complex whole-body motions hard to achieve.
1
0
13
@HaoruXue
Haoru Xue
14 days
CYBER-TRIP 😎 Rolling down HWY 1 with @zhengyiluo @TairanHe99 @_wenlixiao @zi2865 and a G1. See y’all at RSS!
@zhengyiluo
Zhengyi “Zen” Luo
15 days
Nvidia GEAR RSS 2025 Squad Rolling Out
0
2
19
@HaoruXue
Haoru Xue
18 days
Impressive work. Lots of work this year shows that good engineering can really demystify WBC. There is no more excuse for crappy policies. Next steps: making WBC policies more accessible and easier to interface with vision-language models.
@C___eric417
Zixuan Chen
19 days
🚀Introducing GMT — a general motion tracking framework that enables high-fidelity motion tracking on humanoid robots by training a single policy from large, unstructured human motion datasets. 🤖A step toward general humanoid controllers. Project Website:
0
2
9
@HaoruXue
Haoru Xue
19 days
At @ycombinator AI Startup School today 🤗 Excited for meetups
3
4
31
@HaoruXue
Haoru Xue
1 month
RT @li_yitang: 🤖Can a humanoid robot carry a full cup of beer without spilling while walking 🍺? Hold My Beer! Introducing Hold My Beer🍺:…
0
37
0
@HaoruXue
Haoru Xue
1 month
Career Update: I’m interning at NVIDIA GEAR Lab supervised by @DrJimFan and @yukez. Looking forward to frontier robot learning research with the magnificent team!
9
3
217