
Kaustubh Sridhar
@_k_sridhar
Followers: 1K · Following: 36K · Media: 70 · Statuses: 2K
Research Scientist @GoogleDeepMind. Prev: AI+Robotics PhD @Penn. Undergrad @iitbombay
Joined April 2013
Robot AI brains, aka Vision-Language-Action models, cannot adapt to new tasks as easily as LLMs like Gemini, ChatGPT, or Grok. LLMs can adapt quickly with their in-context learning (ICL) capabilities. But can we inject ICL abilities into a pre-trained VLA like pi0? Yes!
Come join the incredible Google DeepMind Robotics team! We are looking for researchers working on world modeling for robotics! Talk to @xf1280 at #ICCV!
The mission is calling! #Google DeepMind Robotics is looking for talented researchers to join our team and help build the new frontier of robotics intelligence through world modeling.
We’re hiring to build the frontier of robotics and world models!
Rome wasn’t built in a day, but this explainer was. With first + last frame references using Veo 3.1 in invideo we created this immersive video with precise camera controls and continuity.
Despite all the flashy dancing and kung-fu humanoid videos, real ones know that general manipulation is *the* bottleneck that no one has cracked. Check out some of the progress from our team on humanoid VLAs. Same pretrained checkpoint that controls Alohas and Bi-arm Frankas!
The funnest part about being a roboticist is that you get to play with robots and call it “work”. Here’s Apollo w/ GR 1.5 trying, like a toddler, to grab from my unyielding hand: a test of manipulation generalization. Cannot believe this is the worst humanoids will ever be!
@kimmonismus This is a very significant discovery and only a small glimpse of what’s to come! What most people don’t yet know is that the Google co-scientist I’m testing is even better than this model. There is a reason I’m so enthusiastic about AI transforming & rapidly accelerating science.
Introducing Veo 3.1 and Veo 3.1 Fast, our latest state-of-the-art video models with:
- richer native audio
- better cinematic styles
- reference to video
- transitions between frames
- video extensions
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests,
Interested in working on generalist robots that are safe, trustworthy, and capable? 🤖 📢 My group at @Princeton is looking for PhD students and postdocs this cycle! PhD: please apply through the MAE department (Dec. 1). Postdocs: please email me directly!
Google is the org at Hugging Face with the most downloads 🤗
New blog post analyzing the top 50 entities with the most downloaded models on @huggingface 🤗! The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!). Some key findings:
Fantastic to see Genie 3, our state-of-the-art world model, featured in @TIME's 2025 Best Inventions. From a single image or text prompt to an entire playable world, it’s the future of AI and entertainment. So proud of @jparkerholder @shlomifruchter & the team - huge congrats!
We’re proud to announce that Genie 3 has been named one of @TIME’s Best Inventions of 2025. Genie 3 is our groundbreaking world model capable of generating interactive, playable environments from text or image prompts. Find out more → https://t.co/bv1gZaWYtd
Super cool to see Genie 3 recognized as one of @TIME's Best Inventions of 2025!! Congrats to the incredible team for making it possible :)
Very cool to see this new state-of-the-art result on FrontierMath achieved by the #DeepThink IMO Gold model that we built 3 months (a long time) ago :) The fact that it is evaluated externally is a strong testament that what we built generalizes beyond
We evaluated Gemini 2.5 Deep Think on FrontierMath. There is no API, so we ran it manually. The results: a new record! We also conducted a more holistic evaluation of its math capabilities. 🧵
🪄#Gemini Robotics-ER 1.5 use cases (2) 🥮 🌕🎉#MidAutumnFestival is tomorrow (Oct 6). Slicing #mooncake into 5 equal pieces to share can be tricky. 🍽️With ER model in #aistudio, you can "Show me a visual overlay to tell me how to split the mooncake into equal 5 pieces"
You have to watch this! For years now, I've been looking for signs of nontrivial zero-shot transfer across seen embodiments. When I saw the Alohas unhang tools from a wall used only on our Frankas I knew we had it! Gemini Robotics 1.5 is the first VLA to achieve such transfer!!
It is very difficult to properly convey “zero-shot generalization”, it’s a very overloaded and overabused term these days! But let me try to add some color to why so many of my colleagues have been so shocked at what the Gemini Robotics 1.5 VLA has been doing… 🎨 The whole
Gemini Robotics 1.5, let's break it down:
✅ This is an agentic system powered by 2 models: the high-level brain VLM (Gemini Robotics-ER 1.5) and a new VLA (Gemini Robotics 1.5) ✨
✅ GR1.5 can now think while acting, enabling more generalization and making robots more