Ruihan Yang
@RchalYang
Followers: 2K · Following: 688 · Media: 19 · Statuses: 189
Applied Scientist @ Amazon Frontier AI & Robotics (FAR). PhD from @UCSanDiego. Robot Learning / Embodied AI.
San Diego, CA
Joined July 2017
At IROS 2024 now. I'll present our work HarmonicMM tomorrow at 10 AM in session WeAT2. Also open to all kinds of discussion! Let me know if you'd like to chat!
How can robots tackle complex household tasks like opening doors and cleaning tables in the real world? Introducing HarmonicMM: our latest model seamlessly combines navigation and manipulation, enabling robots to tackle household tasks using only RGB visual observations and robot proprioception.
Unified multimodal models can generate text and images, but can they truly reason across modalities? 🎨 Introducing ROVER, the first benchmark that evaluates reciprocal cross-modal reasoning in unified models, the next frontier of omnimodal intelligence. 🌐 Project:
Residual RL for finetuning pretrained policies with ease in the real world, by the amazing @larsankile. Check it out!
How can we enable finetuning of humanoid manipulation policies, directly in the real world? In our new paper, Residual Off-Policy RL for Finetuning BC Policies, we demonstrate real-world RL on a bimanual humanoid with 5-fingered hands (29 DoF) and improve pre-trained policies
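For intuition, here is a minimal sketch of the residual idea (the wrapper name, the tanh squashing, and the 0.1 scale are my assumptions, not the paper's implementation): a frozen behavior-cloned base policy proposes an action, and a small residual policy trained with off-policy RL adds a bounded correction on top.

import numpy as np

class ResidualWrapper:
    """Frozen BC base policy plus a learned, bounded residual correction."""
    def __init__(self, base_policy, residual_policy, scale=0.1):
        self.base = base_policy          # pre-trained BC policy: obs -> action in [-1, 1]
        self.residual = residual_policy  # small policy trained with off-policy RL
        self.scale = scale               # caps how far the residual can push the action

    def act(self, obs):
        a_base = self.base(obs)              # nominal action from the BC policy
        a_res = np.tanh(self.residual(obs))  # correction squashed to [-1, 1]
        return np.clip(a_base + self.scale * a_res, -1.0, 1.0)

In a setup like this, only the residual policy is updated during finetuning, so the behavior stays close to the pre-trained policy whenever the residual is near zero.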
Over the past few years, a lot of progress (not just in robot learning) has come from working on broadly similar hardware, which makes it much easier to share knowledge. Of course, if NVIDIA is actually shipping a humanoid, I would like to see how it works.
We're hiring interns (and full-time hires) all year long! Please email me if interested.
My personal opinion: Mobile ALOHA (bimanual + wheels) / Vega (bimanual + dexterous hands + wheels) / Digit (bimanual + wheels) / Optimus (obviously) are humanoids.
Turns out, when we discuss “humanoid robot”, everyone's picturing something totally different. So I made this figure; next time, I'll show it before the discussion.
More can be found at:
Website: https://t.co/3l1ncleUjQ
Papers: https://t.co/fcUsdo9ymZ
Great collaboration with Qinxi Yu, Yecheng Wu, @Hi_Im_RuiYan, BoruiLi, @anjjei, @xyz2maureen, @FangYunhaoX, @xuxin_cheng, @RogerQiu_42, @yin_hongxu, @Sifei30488L, @songhan_mit, @Yao__Lu
We evaluate EgoVLA on our Ego Humanoid Manipulation Benchmark:
* Human pretraining improves performance across both short- & long-horizon tasks
* Fine-tuned EgoVLA outperforms baselines, especially on challenging, multi-step behaviors
* Pretraining boosts generalization to
To enable reproducible, scalable evaluation, we introduce the Ego Humanoid Manipulation Benchmark, a diverse humanoid manipulation benchmark built on Isaac Lab and a testbed for manipulation policy generalization.
• 12 tasks: from atomic to multi-stage skills
• 25 visual background
At its core, EgoVLA leverages a unified human-robot action space built on the MANO hand model. We retarget robot hand motions into MANO space, allowing human and robot actions to be represented identically. During deployment, EgoVLA predicts MANO wrist + hand motion from video.
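A minimal sketch of that shared action space (dimensions follow the MANO convention; the linear retargeting map and function names are illustrative assumptions, not EgoVLA's code):

import numpy as np

WRIST_DIM = 6    # wrist translation (3) + rotation (3, axis-angle)
HAND_DIM = 45    # MANO finger pose: 15 joints x 3 axis-angle components

def mano_action(wrist_pose: np.ndarray, hand_pose: np.ndarray) -> np.ndarray:
    """Concatenate wrist and hand pose into one action vector shared by humans and robots."""
    assert wrist_pose.shape == (WRIST_DIM,) and hand_pose.shape == (HAND_DIM,)
    return np.concatenate([wrist_pose, hand_pose])

def retarget_robot_to_mano(robot_hand_q: np.ndarray, joint_map: np.ndarray) -> np.ndarray:
    """Map robot finger joint angles into MANO hand-pose parameters.
    joint_map is an assumed (HAND_DIM, n_robot_joints) retargeting matrix; real
    retargeting typically solves an optimization over fingertip positions instead."""
    return joint_map @ robot_hand_q

At deployment, the inverse mapping (MANO back to robot wrist and finger commands) recovers executable robot actions from the model's predictions.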
EgoVLA learns manipulation by predicting future wrist & hand motion from diverse egocentric human videos across different backgrounds and tasks. It uses a vision-language backbone (NVILA-2B) and an action head to model both perception and control:
* Inputs: RGB history, language
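To make the interface concrete, here is a schematic sketch of that prediction step (module names, the feature size, and the 16-step horizon are assumptions; only the NVILA-2B backbone is named in the post):

import torch
import torch.nn as nn

ACTION_DIM = 6 + 45   # MANO wrist pose + hand pose (see the sketch above)
HORIZON = 16          # assumed number of future steps predicted per chunk

class ActionHead(nn.Module):
    """Maps the backbone's pooled feature to a chunk of future wrist + hand actions."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.GELU(),
            nn.Linear(512, HORIZON * ACTION_DIM),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.mlp(feat).view(-1, HORIZON, ACTION_DIM)

# Hypothetical usage with a backbone that pools RGB history + language into one feature:
# feat = backbone(rgb_history, language_tokens)      # (B, feat_dim)
# future_actions = ActionHead(feat_dim=2048)(feat)   # (B, HORIZON, ACTION_DIM)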
How can we leverage diverse human videos to improve robot manipulation? Excited to introduce EgoVLA — a Vision-Language-Action model trained on egocentric human videos by explicitly modeling wrist & hand motion. We build a shared action space between humans and robots, enabling
That means you are so lucky to deeply understand multiple problems during your PhD.
ummm… As a robotics PhD student, I’m genuinely worried that the problem I find important now will be solved in the next 2 years—by MORE DATA, without any need to understand the underlying structure. And this happens in many areas😂
When it comes to scaling data, it’s not just about scale—it’s also about distribution. Leveraging generative models, even simple ones, can help improve both. Great work led by @jianglong_ye & @kaylee_keyi!
How to generate billion-scale manipulation demonstrations easily? Let us leverage generative models! 🤖✨ We introduce Dex1B, a framework that generates 1 BILLION diverse dexterous hand demonstrations for both grasping 🖐️and articulation 💻 tasks using a simple C-VAE model.
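As a rough illustration of the C-VAE sampling step (sizes and module names are assumptions, not the Dex1B architecture): condition on an object or task embedding, sample a latent from the prior, and decode a hand pose.

import torch
import torch.nn as nn

LATENT_DIM, COND_DIM, GRASP_DIM = 32, 128, 51   # assumed sizes of latent, condition, and hand pose

class CVAEDecoder(nn.Module):
    """Decode (latent sample, condition) into hand grasp parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + COND_DIM, 256), nn.ReLU(),
            nn.Linear(256, GRASP_DIM),
        )

    def forward(self, z: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, cond], dim=-1))

def sample_grasps(decoder: CVAEDecoder, cond: torch.Tensor, n: int) -> torch.Tensor:
    """Draw n diverse grasp proposals for one condition (cond shape: (COND_DIM,))."""
    z = torch.randn(n, LATENT_DIM)          # sample latents from the prior
    return decoder(z, cond.expand(n, -1))   # decode n grasps for the same condition

Sampling many latents per condition is what yields diverse demonstrations at scale; training the decoder jointly with an encoder and a KL term is omitted here.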
Thank you @xiaolonw for all the support and guidance over the past six years! It’s been a truly transformative experience, and I’m so grateful for everything I’ve learned along the way. Hard to believe this chapter is coming to a close.
Congratulations to @Jerry_XU_Jiarui @JitengMu @RchalYang @YinboChen on their graduation! I am excited for their future journeys in industry: Jiarui -> OpenAI, Jiteng -> Adobe, Ruihan -> Amazon, Yinbo -> OpenAI
For years, I've been tuning parameters for robot designs and controllers on specific tasks. Now we can automate this at dataset scale. Introducing Co-Design of Soft Gripper with Neural Physics: a soft gripper trained in simulation to deform while handling load.
Great progress by Optimus
Our robotics team will be at ICRA next week in Atlanta! Having started a new research team at Amazon building robot foundation models, we're hiring across all levels, full-time or intern, and across both SW and Research roles. Ping me at drockyd@amazon.com and let's have a chat!