Kaustubh Sridhar
@_k_sridhar
Followers: 1K
Following: 39K
Media: 71
Statuses: 2K
Research Scientist @GoogleDeepMind. Prev: AI+Robotics PhD @Penn. Undergrad @iitbombay
Joined April 2013
Robot AI brains, aka Vision-Language-Action models, cannot adapt to new tasks as easily as LLMs like Gemini, ChatGPT, or Grok. LLMs can adapt quickly with their in-context learning (ICL) capabilities. But can we inject ICL abilities into a pre-trained VLA like pi0? Yes!
6
31
231
All the bad stocks in my portfolio are just one Indian CEO away from being ten baggers
80
179
4K
For point 1 from @KyleVedder, it’s important to also keep point 1 from @demishassabis in mind:
When asked to speak without exaggeration about AI advancements over the next 12 months, Demis Hassabis (@demishassabis) highlighted the following three points: 1. A lot of progress in multimodality: We are currently gaining very interesting synergy effects as models become
1
0
2
Kyle has great predictions but overly conservative timelines. :)
Link to full blogpost: https://t.co/3wRgsubZPi
2
0
7
I’ll be at NeurIPS in San Diego from Wed-Fri. Get in touch if you want to talk about world models, general agents, robotics, or GDM.
8
4
91
this may seem contradictory to scientific principles, but more often than you might imagine, you believe not because of what you see; you see because of what you believe.
Ilya on research taste: “One thing that guides me personally is an aesthetic of how AI should be by thinking about how people are. There's no room for ugliness. It's just beauty, simplicity, elegance, with correct inspiration from the brain. The more they are present, the more
7
13
163
“Amateur photograph from 1998 of a middle-aged artist copying an image by hand from a computer screen to an oil painting on stretched canvas, but the image is itself the photo of the artist painting the recursive image.” Nano Banana Pro.
253
1K
12K
My group @Princeton is hiring! We are looking for strong postdoc and PhD candidates to join our quest for intelligent robots in open-world environments. Read more below and get in touch 🤖🐅🧡 https://t.co/7o35pwPZCz
14
143
859
you gotta give Gemini a serious try: same prompts, Gemini found the one thing I wanted in 1/3 the time, while ChatGPT took >3 mins and gave me 7 results. every time I did the comparison the last two days, both were equal or G was better
okay, time to go full gemini for 1-2 weeks (with @AmpCode as Gemini CLI replacement cause G has waitlisted theirs)
11
4
146
Nano Banana Pro, released this morning, is clearly the best image generation model. Superb instruction following, plus it can generate full infographics (with correct spelling and properly rendered text!) from a short prompt based on running extra searches
simonwillison.net
Hot on the heels of Tuesday’s Gemini 3 Pro release, today it’s Nano Banana Pro, also known as Gemini 3 Pro Image. I’ve had a few days of preview access …
25
81
740
You went 🍌🍌 for Nano Banana. Now, meet Nano Banana Pro. It’s SOTA for image generation + editing with more advanced world knowledge, text rendering, precision + controls. Built on Gemini 3, it’s really good at complex infographics - much like how engineers see the world:)
828
2K
24K
Thinking (test-time compute) in pixel space... 🍌 Pro tip: always peek at the thoughts if you use AI Studio. Watching the model think in pictures is really fun!
21
81
703
Rolling out today we are launching Nano Banana Pro, the world’s best image model built to move beyond casual creation and into a new era of studio-quality, functional design. Nano Banana Pro enables a new level of precision and creative control, transforming the way you bring
133
439
3K
I expect this to be a game changer for me
blog.google
Today, we are introducing Google Scholar Labs, a new feature that explores how generative AI can transform the process of answering detailed scholarly research questions…
6
34
457
We asked AI Mode in Search how a basketball player's 3-point shot relates to the quadratic equation. With Gemini 3's new generative UI capabilities, it created an interactive visualization to help bring this concept to life. Try it out with the Gemini 3 dropdown in AI Mode and
47
144
943
Introducing Yutori Navigator. 31 years ago, the modern web era began with Netscape Navigator. Today, we’re introducing Yutori Navigator, a web agent that autonomously navigates websites on its own cloud browser to complete tasks for you. Navigator achieves pareto-domination
28
47
246
Just how significant is the jump with Gemini 3? We just released a new leaderboard to track AI developments. Gemini 3 is the largest leap in a long time.
31
84
553
AGI
0
0
4
I had #EarlyAccess to Gemini 3.0 for about 2 days (thanks to @OfficialLoganK & the aistudio folks). Here we see gpt-5.1-thinking (left) vs gemini-3.0 (right) building the xbox controller in Minecraft.
17
12
268