Yevgen Chebotar
@YevgenChebotar
Followers 2K · Following 101 · Media 12 · Statuses 42
Robotic foundation models @NVIDIA | VLAs, Reinforcement Learning & more 🤖 Previously @GoogleDeepMind and @Figure_robot
Joined March 2017
Excited to join the NVIDIA GEAR team to help build the next generation of open robotic foundation models!
13 · 2 · 151
We've made great progress on Vision-Language-Action Models for humanoids in our new Helix model! Check out the technical report for more details: https://t.co/muMeO2Log7
figure.ai
Figure was founded with the ambition to change the world.
Meet Helix, our in-house AI that reasons like a human. Robotics won't get to the home without a step change in capabilities. Our robots can now handle virtually any household item:
6 · 4 · 93
The path to VLAs lies through VLMs. A very nice intro for everyone interested in working with Vision-Language Models: https://t.co/pRpSebf3Tb
arxiv.org
Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through...
1 · 17 · 66
Congrats everyone, 170+ authors and contributors, great to see the robotics field coming together!
Our OpenX paper won best paper at ICRA! Congrats to all my co-authors! 🎉🎉 This is an ongoing effort, we recently added new datasets from the community that double the size of the OpenX dataset -- keep 'em coming! :) Check datasets & how to contribute: https://t.co/gX5Ir30s5B
0 · 1 · 22
Some personal updates! Excited to join the team @Figure_robot to help build AI for the robot age! 🤖
25 · 2 · 164
Turns out classification loss works surprisingly well for value-based RL, also some nice gains when used with Q-Transformer!
Super simple code change to get value-based deep RL to scale *much* better w/ big models across the board on Atari games, robotic manipulation w/ transformers, LLM + text games, & even Chess! Just use classification loss (i.e., cross entropy), not MSE!! https://t.co/0IHSgN4pBj 🧵⬇️
0 · 3 · 15
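A minimal sketch of the idea in the tweet above: instead of regressing the Q-value to its TD target with MSE, the scalar target is encoded as a distribution over a fixed set of value bins (a "two-hot" encoding is one common choice) and the network is trained with cross-entropy. The bin count, value range, and function names here are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

NUM_BINS = 51
V_MIN, V_MAX = 0.0, 1.0                       # assumed range of returns
bin_centers = torch.linspace(V_MIN, V_MAX, NUM_BINS)

def two_hot(target: torch.Tensor) -> torch.Tensor:
    """Encode scalar targets as a distribution over the two nearest value bins."""
    target = target.clamp(V_MIN, V_MAX)
    idx = (target - V_MIN) / (V_MAX - V_MIN) * (NUM_BINS - 1)
    lower = idx.floor().long()
    upper = torch.clamp(lower + 1, max=NUM_BINS - 1)
    frac = idx - lower.float()
    probs = torch.zeros(*target.shape, NUM_BINS)
    probs.scatter_(-1, lower.unsqueeze(-1), (1.0 - frac).unsqueeze(-1))
    probs.scatter_add_(-1, upper.unsqueeze(-1), frac.unsqueeze(-1))
    return probs

def q_value(logits: torch.Tensor) -> torch.Tensor:
    """Scalar Q-value as the expectation of the predicted bin distribution."""
    return (logits.softmax(-1) * bin_centers).sum(-1)

def classification_td_loss(logits: torch.Tensor, td_target: torch.Tensor) -> torch.Tensor:
    """logits: [batch, NUM_BINS] for the taken action; td_target: [batch] Bellman targets."""
    return F.cross_entropy(logits, two_hot(td_target))
    # instead of the usual regression loss:
    # return F.mse_loss(q_value(logits), td_target)
```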
RT-H learns a hierarchy all the way from high-level tasks through low-level “language motions” to robot actions! ✅ Improved performance and generalization through better data sharing ✅ Automated grounded “bottom-up” labeling ✅ Ability to intervene and correct with language
Is language capable of representing low-level *motions* of a robot? RT-Hierarchy learns an action hierarchy using motions described in language, like “move arm forward” or “close gripper” to improve policy learning. 📜: https://t.co/61KZfQNpkY 🏠: https://t.co/voIvzq5Mek (1/10)
1 · 3 · 36
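A rough sketch of the two-level decomposition described above, under assumed interfaces: a high-level query maps the task instruction to a short "language motion", a low-level query maps that motion to a robot action, and a human correction can override the predicted motion. The class, function names, and stub bodies are illustrative stand-ins, not RT-H's actual API.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    image: bytes                      # current camera frame (placeholder)

def predict_language_motion(obs: Observation, task: str) -> str:
    """High level: task -> language motion, e.g. 'pick up the cup' -> 'move arm forward'."""
    raise NotImplementedError          # stand-in for a VLM query on (image, task)

def predict_action(obs: Observation, motion: str) -> list[float]:
    """Low level: language motion -> robot action (e.g. end-effector delta + gripper)."""
    raise NotImplementedError          # stand-in for a second query conditioned on the motion

def step(obs: Observation, task: str, correction: str | None = None) -> list[float]:
    # A person can intervene by supplying a language correction that overrides the
    # predicted motion; the low-level policy then executes that motion instead.
    motion = correction if correction is not None else predict_language_motion(obs, task)
    return predict_action(obs, motion)
```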
Had a great time today with @YevgenChebotar and @QuanVng visiting @USCViterbi to give a talk on “Robot Learning in the Era of Foundation Models”. Slides out soon, packed with works from *just the past 5 months* 🤯 Thanks to @daniel_t_seita for hosting!
1 · 3 · 57
Presenting RT-2 poster at CoRL! https://t.co/fA39xewX5O
robotics-transformer2.github.io
Project page for RT-2
Pictures taken at the RT-2 poster at @DannyDriess's request ;) @YevgenChebotar we miss you @TianheYu CC @hausman_k
0 · 2 · 24
How many @CSatUSC grads does it take to create a breakthrough robot? Three, apparently! Congrats @YevgenChebotar @hausman_k and @ryancjulian who worked on Google DeepMind's revolutionary RT-2 AI model. Find out more ⤵️ https://t.co/D4HqA7d7zH
@gauravsukhatme @USCViterbi
viterbischool.usc.edu
USC computer science alumni worked on Google's "first-of-its-kind" robot AI model, the Robotic Transformer 2 (RT-2)
0 · 3 · 20
Exciting times for Robot Learning! 60 datasets from 22 different robots and 21 institutions combined in a single Open-X Embodiment data repository, resulting in over 1 million episodes and improved RT-X models! Amazing and a very important collaboration across the world! 🤖🌐
RT-X: generalist AI models lead to 50% improvement over RT-1 and 3x improvement over RT-2, our previous best models. 🔥🥳🧵 Project website: https://t.co/GAlvFdqwx5
0 · 1 · 11
Joint work with @QuanVng, @AlexIrpan, @hausman_k, @xf1280, @Yao__Lu, @aviral_kumar2, @TianheYu, @AlexHerzog001, @KarlPertsch, @keerthanpg, @julianibarz, @ofirnachum, @Kanishka_Rao, @chelseabfinn, @svlevine
1 · 1 · 9
Our real robot policies significantly improve upon RT-1 and other baselines when trained on a limited amount of human demonstrations, by leveraging autonomously collected negatives and the dynamic programming properties of Q-learning.
1 · 0 · 15
By using autoregressive Bellman updates, conservative regularization, Monte Carlo and n-step returns, we are able to combine human demonstrations and autonomously collected data to learn multi-task language-conditioned policies from both successful and failed examples.
1 · 0 · 17
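A rough sketch of the per-dimension ("autoregressive") Bellman backup and conservative term described in these tweets: each discretized action dimension is treated as its own step, intermediate dimensions maximize over the following dimension within the same state, only the last dimension bootstraps through the reward and the next state, and a Monte Carlo return acts as a lower bound on the target while unseen action bins are pushed toward the minimal value. The q_net interface, bin count, and regularizer weight are assumptions for illustration (n-step returns are omitted here).

```python
import torch

NUM_BINS = 256          # discretization per action dimension (assumed)
GAMMA = 0.98

def q_targets(q_net, s, a_bins, r, s_next, mc_return=None):
    """
    q_net(state, prev_action_bins) -> tensor of [NUM_BINS] Q-values for the next dimension.
    a_bins: chosen bin indices, one per action dimension, for state s.
    Returns one scalar target per action dimension.
    """
    targets = []
    d = len(a_bins)
    for i in range(d):
        if i < d - 1:
            # Intermediate dimensions: maximize over the following dimension, same state.
            target = q_net(s, a_bins[: i + 1]).max()
        else:
            # Last dimension: standard Bellman backup into the next state's first dimension.
            target = r + GAMMA * q_net(s_next, []).max()
        if mc_return is not None:
            # Monte Carlo lower bound: never regress below the observed return.
            target = torch.maximum(target, torch.as_tensor(mc_return))
        targets.append(target)
    return targets

def conservative_penalty(q_pred_all_bins, chosen_bin, weight=1.0):
    # Push Q-values of unseen action bins toward the minimal value (0 here),
    # so out-of-distribution actions are not overestimated.
    mask = torch.ones_like(q_pred_all_bins)
    mask[chosen_bin] = 0.0
    return weight * ((q_pred_all_bins * mask) ** 2).mean()
```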
Offline RL strikes back! In our new Q-Transformer paper, we introduce a scalable framework for offline reinforcement learning using Transformers and autoregressive Q-Learning to learn from mixed-quality datasets! Website and paper: https://t.co/SntNYC9Pk3 🧵
8 · 107 · 524
Excited to present RT-2, a large unified Vision-Language-Action model! By converting robot actions to strings, we can directly train large visual-language models to output actions while retaining their web-scale knowledge and generalization capabilities! https://t.co/27sBueV42q
Today, we announced 𝗥𝗧-𝟮: a first of its kind vision-language-action model to control robots. 🤖 It learns from both web and robotics data and translates this knowledge into generalised instructions. Find out more: https://t.co/UWAzrhTOJG
0 · 14 · 76
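A minimal sketch of the actions-as-strings conversion mentioned above: each continuous action dimension is normalized, discretized into a fixed number of bins, and written out as a space-separated string of integers that the vision-language model is fine-tuned to emit as ordinary text, then decoded back to a continuous action at execution time. The bin count, action bounds, dimension layout, and separator are illustrative assumptions, not RT-2's exact tokenizer setup.

```python
import numpy as np

NUM_BINS = 256
LOW, HIGH = -1.0, 1.0        # assumed per-dimension bounds after normalization

def action_to_string(action: np.ndarray) -> str:
    """Continuous action -> space-separated bin indices the VLM can emit as text."""
    bins = np.clip(np.round((action - LOW) / (HIGH - LOW) * (NUM_BINS - 1)), 0, NUM_BINS - 1)
    return " ".join(str(int(b)) for b in bins)

def string_to_action(text: str) -> np.ndarray:
    """Inverse mapping used when executing the model's text output on the robot."""
    bins = np.array([int(tok) for tok in text.split()], dtype=np.float64)
    return bins / (NUM_BINS - 1) * (HIGH - LOW) + LOW

# Round trip for an assumed 8-dim action (e.g. translation, rotation, gripper, terminate):
a = np.array([0.0, -0.3, 0.9, -0.96, -0.2, 0.0, 0.7, 1.0])
print(action_to_string(a))                     # "128 89 242 5 102 128 217 255"
print(string_to_action(action_to_string(a)))   # approximately recovers `a`
```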
Very excited to announce our largest deep RL deployment to date: robots sorting trash end-to-end in real offices! https://t.co/tTASU1Fmgs (aka RLS) This project took a long time (started before SayCan/RT-1/other newer works) but the learnings from it have been really valuable. 🧵
16 · 161 · 822
What happens when we train the largest vision-language model and add in robot experiences? The result is PaLM-E 🌴🤖, a 562-billion parameter, general-purpose, embodied visual-language generalist - across robotics, vision, and language. Website: https://t.co/ouMkeQiGr5
27 · 510 · 2K
Super excited to introduce SayCan ( https://t.co/NWyvPubhmE): 1st publication of a large effort we've been working on for 1+ years. Robots ground large language models in reality by acting as their eyes and hands, while LLMs help robots execute long, abstract language instructions.
18 · 273 · 1K
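A compact sketch of how that grounding can work, under assumed interfaces: a language model scores how much each skill description would help the instruction, a learned value function scores whether that skill can currently succeed, and the skill with the best combined score is executed and appended to the plan. The function names and signatures are illustrative stand-ins, not SayCan's actual code.

```python
from typing import Callable, Sequence

def saycan_step(
    instruction: str,
    history: Sequence[str],
    skills: Sequence[str],
    llm_score: Callable[[str, Sequence[str], str], float],   # p(skill text | instruction, history)
    affordance: Callable[[str], float],                       # value fn: p(skill succeeds | current state)
) -> str:
    """Pick the skill whose LLM usefulness score times affordance score is highest."""
    scored = {s: llm_score(instruction, history, s) * affordance(s) for s in skills}
    return max(scored, key=scored.get)

# A planning loop would append each chosen skill to `history` and re-score until a
# "done" skill wins, yielding a long-horizon plan grounded in what the robot can
# actually do right now.
```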
In offline RL, adding data from other tasks can boost generalization but can surprisingly *hurt* performance. We analyze this & develop a conservative data sharing approach to help fix it: https://t.co/kAK7UKPLQu w/ @TianheYu @aviral_kumar2 @YevgenChebotar @hausman_k @svlevine
5 · 23 · 162