Haoyang Weng
@ElijahGalahad
Followers
740
Following
646
Media
33
Statuses
193
Undergraduate @Tsinghua_IIIS | Intern @LeCARLab | Machine learning for robotics | Applying for PhD, Fall 2026
Joined December 2021
We present HDMI, a simple and general framework for learning whole-body interaction skills directly from human videos: no manual reward engineering, no task-specific pipelines. 🤖 67 door traversals, 6 real-world tasks, 14 in simulation. https://t.co/ll44sWTZF4
24
148
745
Wow, zero gap visually?!
MimicKit now supports #IsaacLab! After many years with IsaacGym, it's time to upgrade. MimicKit has a simple Engine API that allows you to easily swap between different simulator backends. Which simulator would you like to see next?
1
0
7
@ElijahGalahad please consider my manifold optimizations. I have recreated your spiral experiment with a 1,000,000-point cloud and made models that use geodesic topology instead of traditional 2D AI strats https://t.co/2RFOpmZeeW
1
1
1
Loss type isn't the key variable; parameterization is. With the same prediction space, v-, x-, and ε-losses merely reduce to different t-weightings, so the conclusion carries over to all loss types. Check out https://t.co/ohTZAoFlhr, built on top of the amazing work by @ZhiSu22.
github.com
Unofficial implementation of the toy example in JiT https://arxiv.org/abs/2511.13720 - EGalahad/jit_toy_example
1
0
13
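The reduction to t-weightings can be written out explicitly. A sketch in standard diffusion notation (my notation and assumptions, not taken from the thread): a Gaussian forward process with α_t² + σ_t² = 1, v = α_t ε − σ_t x, and every prediction derived from a single x-head.

```latex
% Hedged sketch: standard diffusion notation, assumed (not from the thread).
% Forward process and predictions derived from one x-head \hat{x}:
\[
  x_t = \alpha_t x + \sigma_t \varepsilon, \qquad
  \hat{\varepsilon} = \frac{x_t - \alpha_t \hat{x}}{\sigma_t}, \qquad
  \hat{v} = \alpha_t \hat{\varepsilon} - \sigma_t \hat{x}.
\]
% Substituting x_t and using \alpha_t^2 + \sigma_t^2 = 1:
\[
  \|\hat{\varepsilon} - \varepsilon\|^2
    = \frac{\alpha_t^2}{\sigma_t^2}\,\|\hat{x} - x\|^2, \qquad
  \|\hat{v} - v\|^2
    = \frac{1}{\sigma_t^2}\,\|\hat{x} - x\|^2.
\]
```

With a shared prediction space, the ε-, v-, and x-losses therefore differ only by a scalar t-dependent weight.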
Residual parameterizations can change the effective prediction target. They determine whether the model must carry high-dimensional noise through the network, or whether it can operate purely on the low-dimensional data manifold.
3
0
8
You cannot discuss optimization without considering architecture. Parameterization changes everything: the same objective can behave very differently. With a "clever" residual, ε-prediction can match x-prediction by reparameterizing the output head. https://t.co/lGJ8aqECS3
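That head reparameterization can be checked numerically. A minimal sketch (my own toy code; the schedule, shapes, and stand-in "network" are assumptions, not the thread's experiment): defining the ε-output as a residual of an x-prediction head makes the ε-loss exactly a t-weighted x-loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "network": any function of (x_t, t); here a fixed linear map.
W = rng.normal(size=(2, 2))
def f(x_t, t):
    return x_t @ W.T  # plays the role of a learned x-prediction head

# Assumed variance-preserving schedule: alpha_t^2 + sigma_t^2 = 1.
def alpha_sigma(t):
    return np.cos(t), np.sin(t)

x = rng.normal(size=(4, 2))    # clean data
eps = rng.normal(size=(4, 2))  # noise
t = 0.7
a, s = alpha_sigma(t)
x_t = a * x + s * eps          # noisy input

# x-prediction loss
x_hat = f(x_t, t)
loss_x = np.mean(np.sum((x_hat - x) ** 2, axis=-1))

# eps-prediction via the residual output head: eps_hat = (x_t - a*x_hat) / s
eps_hat = (x_t - a * x_hat) / s
loss_eps = np.mean(np.sum((eps_hat - eps) ** 2, axis=-1))

# Since eps_hat - eps = (a/s) * (x - x_hat), the two losses differ
# only by the t-dependent weight (a/s)^2.
assert np.allclose(loss_eps, (a / s) ** 2 * loss_x)
```

The same residual trick applies to any head whose target is an affine function of x and x_t.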
@YouJiacheng Yeah, but I think the point is you want the network to operate in a low-dimensional space, like the manifold, instead of the high-dimensional input space. Learning an identity means carrying the input all the way through the network: inefficient and redundant, given the gap between the input and manifold dimensions.
1
1
11
Many say ε-prediction and x-prediction are just reparameterizations and should behave the same. Actually… it does and it doesn't. In my extended toy experiment: • Vanilla MLP → x wins • Well-parameterized network → ε works fine as well
9
23
180
You're my spiritual leader.
Zero teleoperation. Zero real-world data. → Autonomous humanoid loco-manipulation in reality. Introducing VIRAL: Visual Sim-to-Real at Scale. We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe: 1. RL 2. Simulation 3. GPUs Website:
1
0
7
Impressive long-horizon, whole-body, generalizable dexterity! Congrats @sundayrobotics. Curious about: 1. how costly is the map-building process? 2. is the visual alignment done with a diffusion model, or a retargeting -> rendering -> inpainting pipeline?
Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: a frontier robot foundation model trained on zero robot data. - Ultra long-horizon tasks - Zero-shot generalization - Advanced dexterity 🧵->
1
0
48
Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖 Project page: https://t.co/eC1ftH5ozx Arxiv: https://t.co/5K9sXDNQWv Gallant is, to our knowledge, the first system to run a single policy that handles full-space
1
33
186
A lot of people using the HDMI codebase reported that it trains really fast, e.g. the suitcase motion in under one hour. These techniques are minimal but essential for its efficiency. A figure in the paper will never be as intuitive as these videos. https://t.co/D3dzfIDswt
github.com
Contribute to LeCAR-Lab/HDMI development by creating an account on GitHub.
0
0
0
This is a #freelunch if you use teacher-student training: just train the teacher with residual actions and behavior-clone a student without them.
1
0
0
As these clips show, with residual actions (left) the policy explores locally around the reference. Without them (right), an episode initialized from kneeling will abruptly pop up, generating low-quality training samples.
1
0
0
#freelunch series 1: Residual Action Space for motion tracking. Use ctrl_target = motion_pose + action instead of default_pose + action for exploration. This is especially useful for motions far from the default pose, e.g. kneeling.
1
1
6
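A minimal sketch of the residual action space idea (hypothetical names and values, not the HDMI code): offsetting the PD target from the reference motion's pose instead of the default pose keeps small exploratory actions near the reference, even for poses like kneeling.

```python
import numpy as np

DEFAULT_POSE = np.zeros(12)  # nominal standing joint pose (assumed)

def ref_pose(t: float) -> np.ndarray:
    # Stand-in for a kneeling reference motion, far from the default pose.
    return -0.8 * np.ones(12)

def pd_target_residual(action: np.ndarray, t: float) -> np.ndarray:
    # Residual action space: the policy explores locally around the reference.
    return ref_pose(t) + action

def pd_target_default(action: np.ndarray) -> np.ndarray:
    # Conventional action space: the policy must emit large actions just to
    # reach a pose far from the default, making exploration noisy there.
    return DEFAULT_POSE + action

action = 0.05 * np.ones(12)  # a small exploratory action

# Residual target stays within |action| of the reference pose...
near = np.abs(pd_target_residual(action, 0.0) - ref_pose(0.0)).max()
# ...while the default-pose target lands far from it.
far = np.abs(pd_target_default(action) - ref_pose(0.0)).max()
assert near < far
```

The teacher-student "free lunch" above follows directly: only the teacher's action space changes, so the student can be distilled in the conventional one.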
I believe @physical_int chose these 3 demos on purpose to show everyone they are capable of all the iconic demos other startups do: making coffee -> @sundayrobotics folding laundry -> @DynaRobotics building boxes -> @GeneralistAI Now the burden's on the rest.
11
10
177
Imagine moving a heavy object with a joystick, through a swarm of quadruped-arm robots. 🕹️ decPLM: decentralized RL for multi-robot pinch-lift-move. • No comms or rigid links • Hierarchical RL + constellation reward • 2→N robots, sim→real https://t.co/BPwqHV0ngE
15
110
610
Universal retargeting for dexterous hands and humanoids, grounded in physics!
🕸️ Introducing SPIDER: Scalable Physics-Informed Dexterous Retargeting! A dynamically feasible, cross-embodiment retargeting framework for BOTH humanoids 🤖 and dexterous hands ✋. From human motion → sim → real robots, at scale. Website: https://t.co/ieZfG2Q4L0 🧵 1/n
1
2
31
'Progress in robotics often feels slow day to day, but zoom out, and it's staggering.'
Jan 2024: one humanoid stood up in a CMU lab. 20 months later: a day in the life of a humanoid at NVIDIA. Neither @zhengyiluo nor I could've imagined where this journey would lead, but what a ride it's been. Progress in robotics often feels slow day to day, but zoom out, and
1
0
8
Jan 2024: one humanoid stood up in a CMU lab. 20 months later: a day in the life of a humanoid at NVIDIA. Neither @zhengyiluo nor I could've imagined where this journey would lead, but what a ride it's been. Progress in robotics often feels slow day to day, but zoom out, and
How do you give a humanoid the general motion capability? Not just single motions, but all motion? Introducing SONIC, our new work on supersizing motion tracking for natural humanoid control. We argue that motion tracking is the scalable foundation task for humanoids. So we
2
14
121
When @TairanHe99 accepted the PhD offer and I decided to work on humanoids in Aug 2023, I told him: learning-based humanoid whole-body control is one of the hardest control problems; "naive" sim2real just won't work; you could spend your whole PhD on it. Yet @TairanHe99
Jan 2024: one humanoid stood up in a CMU lab. 20 months later: a day in the life of a humanoid at NVIDIA. Neither @zhengyiluo nor I could've imagined where this journey would lead, but what a ride it's been. Progress in robotics often feels slow day to day, but zoom out, and
1
8
103