
Dhruv Shah
@shahdhruv_
Followers: 5K · Following: 6K · Media: 126 · Statuses: 921
professor @Princeton | researcher @GoogleDeepMind
San Francisco, CA
Joined April 2012
Yesterday, we live-demoed a “generalist” VLA, for (I think) the first time ever, to a broad audience @RoboticsSciSys. Bring any object. Ask anything. New environment, new instructions, no fine-tuning. Just impeccable vibes! ✨
7 replies · 29 reposts · 338 likes
Excited to share our new work on making VLAs omnimodal: conditioning on multiple different modalities, one at a time or all at once! This lets us train on more data than any single-modality model, and the result outperforms every such model: more modalities = more data = better models! 🚀
We trained OmniVLA, a robotic foundation model for navigation conditioned on language, goal poses, and images. Initialized with OpenVLA, it leverages Internet-scale knowledge for strong OOD performance. Great collaboration with @CatGlossop, @shahdhruv_, and @svlevine.
4 replies · 23 reposts · 139 likes
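Roughly how any-subset goal conditioning can work (a minimal sketch, not the OmniVLA code; every module name and feature size below is made up for illustration): each goal modality gets its own encoder, and a missing modality falls back to a learned null embedding, so language-only, pose-only, and image-only datasets can all train the same policy.

```python
# Hypothetical sketch of multi-modal goal conditioning in the spirit of
# OmniVLA (not the authors' code). Each modality has its own encoder;
# absent modalities are replaced by a learned "null" token, so any subset
# of (language, goal pose, goal image) can condition one policy.
import torch
import torch.nn as nn

class OmniGoalEncoder(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.lang_proj = nn.Linear(768, d_model)   # e.g. frozen text-encoder features
        self.pose_proj = nn.Linear(3, d_model)     # (x, y, yaw) goal pose
        self.img_proj = nn.Linear(1024, d_model)   # e.g. frozen ViT features
        # Learned placeholders used when a modality is absent from a sample.
        self.null = nn.ParameterDict({
            k: nn.Parameter(torch.zeros(d_model)) for k in ("lang", "pose", "img")
        })

    def forward(self, lang=None, pose=None, img=None):
        """Each argument is a (B, feat) tensor or None; returns (B, 3, d_model)."""
        assert not (lang is None and pose is None and img is None)
        B = next(t.shape[0] for t in (lang, pose, img) if t is not None)
        feats = []
        for name, x, proj in (("lang", lang, self.lang_proj),
                              ("pose", pose, self.pose_proj),
                              ("img", img, self.img_proj)):
            feats.append(self.null[name].expand(B, -1) if x is None else proj(x))
        return torch.stack(feats, dim=1)  # goal tokens fed to the policy backbone
```

Because the null embeddings are trained, the policy learns what "no goal image given" means rather than treating it as noise, which is what lets the single-modality datasets combine.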
You have to watch this! For years now, I've been looking for signs of nontrivial zero-shot transfer across seen embodiments. When I saw the Alohas unhang tools from a wall, a task trained only on our Frankas, I knew we had it! Gemini Robotics 1.5 is the first VLA to achieve such transfer!!
22 replies · 51 reposts · 336 likes
We are at @corl_conf with robots and another interactive VLA demo! Come to the @GoogleDeepMind booth to check out the Gemini Robotics 1.5 VLA in action on Frankas and Alohas: bring your objects and ask anything 🦾
1 reply · 13 reposts · 77 likes
I'll be speaking at the Eval&Deploy workshop today at @corl_conf at 12:05pm, and will be on a couple of panels (Eval&Deploy at 3pm, RemembeRL at 3:45pm). Come ask some fun/spicy/hard questions!
0 replies · 0 reposts · 27 likes
Check out our tech report for more details and rigorous evaluation. We are hiring! Come to the booth to see us, or our models :) PDF: https://t.co/ltPSKN51rz
0 replies · 1 repost · 14 likes
This result was very surprising! Not only can the model explain its actions and plan future steps, it can actually detect failures and re-plan to be more persistent and clever. Once you start thinking, there's no going back!
1 reply · 1 repost · 7 likes
🤔 With Thinking turned on, our VLA can interleave textual reasoning (substep planning, success detection, error mitigation, etc.) with raw actions, all in the same model! Result: training thoughts and actions end-to-end enables better actions!
1 reply · 1 repost · 2 likes
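The decoding loop this implies is simple to picture. Here is a minimal, runnable toy (my simplification, not the Gemini Robotics 1.5 internals; the token split and the fake policy are assumptions): one shared autoregressive stream where some tokens are text "thoughts" that stay in context and steer later decoding, and others are discretized actions that get executed immediately.

```python
# Toy sketch of interleaved thinking: one token stream, two token kinds.
# Everything below (ACTION_BASE split, fake_policy) is illustrative only.
import random

ACTION_BASE = 10_000          # ids >= this are action-bin tokens (assumption)
EOS = -1

def fake_policy(context):
    """Stand-in for the VLA: randomly interleaves thoughts and actions."""
    if len(context) > 20:
        return EOS
    return random.choice([1, 2, 3, ACTION_BASE + random.randrange(256)])

def run_episode(policy, execute_action):
    context, thoughts = [], []
    while (tok := policy(context)) != EOS:
        context.append(tok)               # thoughts AND actions share context,
        if tok >= ACTION_BASE:            # so reasoning conditions future acts
            execute_action(tok - ACTION_BASE)   # decode bin -> motor command
        else:
            thoughts.append(tok)          # a thought: substep plan, success
                                          # check, error recovery, ...
    return thoughts

run_episode(fake_policy, execute_action=lambda a: None)
```

The key design point is that failure detection and re-planning are not a separate module: a "that didn't work" thought lands in the same context the next action is decoded from.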
This goes beyond simply training on cross-embodiment data: the model learns embodiment-agnostic visuomotor behaviors, so the same pre-trained checkpoint can solve the same task on many robots! Multi-embodiment training + Motion Transfer = improvements across the board and better generalization
1 reply · 1 repost · 6 likes
🔁 This is the most impressive transfer result I've seen: raw images to raw actions, across robots with different cameras, action spaces, and ... We use a novel mechanism called Motion Transfer to learn across pre-training embodiments: no explicit alignment required!
1 reply · 6 reposts · 58 likes
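Motion Transfer itself is described in the tech report, not here. As a generic stand-in for the problem it addresses, this toy sketch shows the baseline alternative it improves on: pooling robots with different action dimensionalities into one shared, normalized action representation so a single policy can consume mixed batches (all names and numbers below are illustrative, and this is explicitly NOT the Motion Transfer mechanism).

```python
# Generic cross-embodiment co-training illustration (NOT Motion Transfer):
# per-embodiment actions are unit-scaled and zero-padded into one shared
# fixed-width vector so mixed-robot batches can train a single policy.
import numpy as np

class SharedActionSpace:
    def __init__(self, dim=32):
        self.dim = dim
        self.stats = {}                   # per-embodiment (mean, std)

    def fit(self, name, actions):
        self.stats[name] = (actions.mean(0), actions.std(0) + 1e-6)

    def encode(self, name, action):
        mean, std = self.stats[name]
        z = (action - mean) / std         # unit-scale the raw action
        out = np.zeros(self.dim)
        out[: z.shape[0]] = z             # zero-pad to the shared width
        return out

space = SharedActionSpace()
space.fit("franka", np.random.randn(1000, 7))    # 7-DoF arm
space.fit("aloha", np.random.randn(1000, 14))    # bimanual, 14-DoF
batch = [space.encode("franka", np.random.randn(7)),
         space.encode("aloha", np.random.randn(14))]
```

The "no explicit alignment required" claim in the tweet is precisely the contrast: padding-and-normalizing schemes like this one need hand-chosen correspondences, and transfer across them is usually weak.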
Excited to share the next gen of Gemini Robotics! I want to highlight two key pre-training advances that fundamentally changed how I think about VLAs.
🔁 Motion Transfer: out-of-the-box transfer of skills across embodiments
🤔 Interleaved Thinking: real-time reasoning about actions
We’re making robots more capable than ever in the physical world. 🤖 Gemini Robotics 1.5 is a levelled up agentic system that can reason better, plan ahead, use digital tools such as @Google Search, interact with humans and much more. Here’s how it works 🧵
2 replies · 13 reposts · 55 likes
A new VLA for navigation that can take in goal images, positions, and language, and exhibits some pretty neat emergent language following!
We trained OmniVLA, a robotic foundation model for navigation conditioned on language, goal poses, and images. Initialized with OpenVLA, it leverages Internet-scale knowledge for strong OOD performance. Great collaboration with @CatGlossop, @shahdhruv_, and @svlevine.
6 replies · 46 reposts · 370 likes
Language following is a tough problem for VLAs: while these models can follow complex language, in practice getting datasets that enable language following is hard. We developed a method to counterfactually and automatically label data to improve language following! 🧵👇
7 replies · 69 reposts · 416 likes
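One simple, runnable reading of "counterfactually and automatically label data" (my sketch of the general idea, not the paper's exact pipeline; `describe_behavior` is a hypothetical stand-in for, say, a VLM captioner): describe what the robot actually did in each logged trajectory and store that description as the instruction, so every trajectory teaches language following even when the original instruction was vague or missing.

```python
# Hedged sketch of counterfactual instruction relabeling.
# describe_behavior: callable(frames) -> str, e.g. a VLM captioner (assumed).
def relabel(dataset, describe_behavior):
    out = []
    for traj in dataset:
        # The "counterfactual" label: the instruction that WOULD have been
        # given if this trajectory had been the intended behavior.
        instruction = describe_behavior(traj["frames"])
        out.append({**traj, "instruction": instruction})
    return out

# Usage with a stand-in captioner:
data = [{"frames": ["f0", "f1"], "instruction": "", "actions": [0, 1]}]
relabeled = relabel(data, describe_behavior=lambda frames: "pick up the red cup")
```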
We took a robot to RSS in LA running our new Gemini Robotics On-Device VLA model. People interacted with it using new objects and instructions in a brand-new environment, and the results were amazing!
3 replies · 18 reposts · 139 likes
Thanks for cheering @GautamSalhotra @Ishika_S_ @RussTedrake @_abraranwar and our favorite Prof. Rudy who is still not on Twitter!
1 reply · 0 reposts · 8 likes
This is the Gemini Robotics On-Device VLA that runs on a single GPU, and you can apply for access today! Shoutout to @ayzwah @debidatta @SudeepDasari @ashwinb96 @xjygr08 @xiao_ted @sippeyxp @jackyliang42 @TonyWentaoYuan @ColinearDevin and everyone else at GDM Robotics for making this happen!
Excited to release Gemini Robotics On-Device and a bunch of goodies today
🍬 on-device VLA that you can run on a GPU
🍬 open-source MuJoCo sim (& benchmark) for bimanual dexterity
🍬 broadening access to these models to academics and developers
https://t.co/mSjXTLuOeu
1 reply · 0 reposts · 15 likes
Join us for a full day of exciting talks and discussions on learning representations for robotic intelligence!
Learned Robot Representations (RoboReps) Workshop @ #RSS2025
📍 Location: SGM 124
📅 Full schedule: https://t.co/25B7CElgjk
1 reply · 8 reposts · 32 likes