Nikhil Parthasarathy
@nikparth1
Followers: 792 · Following: 1K · Media: 27 · Statuses: 478
Research Scientist @GoogleDeepMind interested in AI for science, multimodal learning, and data curation. PhD from the Simoncelli lab @NYU_CNS. BS/MS @Stanford.
Joined December 2017
🚀 Excited to share that #Gemini 3 Flash can do code execution on images to zoom, count, and annotate visual inputs! The model can choose when to write code to: 🔍 Zoom & Inspect: Detect when details are too small and zoom in. 🧮 Compute Visually: Run multi-step calculations
6 replies · 15 reposts · 84 likes
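To illustrate the "Zoom & Inspect" capability above: when details are too small, the model can emit image-manipulation code that crops a region of interest and upsamples it. A minimal sketch of that kind of code using Pillow — the function name and the chosen region are illustrative assumptions, not the actual tool calls the model makes:

```python
# Hypothetical sketch of "zoom & inspect" code a model might write for
# itself: crop a region of interest and enlarge it so small details
# become legible. The region and scale factor are illustrative.
from PIL import Image


def zoom_region(img: Image.Image, box: tuple, scale: int = 4) -> Image.Image:
    """Crop `box` = (left, upper, right, lower) and enlarge it `scale`x."""
    crop = img.crop(box)
    # LANCZOS resampling keeps fine detail reasonably sharp when upscaling.
    return crop.resize((crop.width * scale, crop.height * scale),
                       resample=Image.LANCZOS)


# Example: inspect the top-left quarter of a 100x100 image at 4x zoom.
img = Image.new("RGB", (100, 100), "white")
zoomed = zoom_region(img, (0, 0, 50, 50), scale=4)
```

The same crop-then-resize pattern also underlies the "annotate" case: once zoomed, the model can draw boxes or labels on the enlarged crop before re-inspecting it.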
I see this take more often than I expect. Of course engineering is necessary, but so much of modern AI is built on published fundamental research: Transformers, GPT, CLIP, RLHF, CoT, Diffusion, etc. A lot of it from industry, yes, but ridiculing research is just silly.
@leonpalafox How about you keep writing your papers for the academy as a researcher with lots of paper credentials, and us engineers will keep designing and building systems.
0 replies · 0 reposts · 2 likes
A SINGLE encoder + decoder for all 4D tasks! We release 🎯 D4RT (Dynamic 4D Reconstruction and Tracking). 📍 A simple, unified interface for 3D tracking, depth, and pose 🌟 SOTA results on 4D reconstruction & tracking 🚀 Up to 100x faster pose estimation than prior works
13 replies · 43 reposts · 264 likes
Awesome work from a group of my colleagues: they've achieved SoTA 4D reconstruction with incredibly efficient inference! Check out the post for more details!
Thrilled to release 🎯 D4RT (Dynamic 4D Reconstruction and Tracking)! 🌟 State-of-the-art results on 4D reconstruction & tracking benchmarks 🚀 Up to 300x faster tracking and 100x faster pose estimation than prior works 📍 A simple, unified interface for tracking, depth, and
0 replies · 0 reposts · 6 likes
Excited to chat shortly with @jmin__cho, @sainingxie, and @RanjayKrishna at the DCVLR workshop at #NeurIPS2025! If you're interested in multimodal reasoning and data curation, stop by at Upper Level Ballroom 6DE!
Soon - I will chat about multimodal AI with @nikparth1 @sainingxie @RanjayKrishna at DCVLR workshop! Location: Upper Level Ballroom 6DE
0 replies · 2 reposts · 14 likes
Will be at #NeurIPS2025 Dec 3-7! If you're interested in AI for science, multimodal learning, video understanding, or data curation and want to chat feel free to DM!
4 replies · 1 repost · 41 likes
As the owner/maintainer of the Erdős problems website, a thread with some comments on this solution to #124: 1) This is a nice proof, which was provided by the AI from the formal statement with no human involvement and then formalised in Lean. This is already impressive!
We are on the cusp of a profound change in the field of mathematics. Vibe proving is here. Aristotle from @HarmonicMath just proved Erdos Problem #124 in @leanprover, all by itself. This problem has been open for nearly 30 years since conjectured in the paper “Complete sequences
33 replies · 118 reposts · 1K likes
Olivier and co. are awesome. Highly recommend reaching out to him if you're interested!
I'm thrilled to announce what I've been building since leaving GDM: @cursive_ai, a new foundation model and infra company unlocking real-time, generative software. We've identified GenAI's long latencies as a critical bottleneck towards its widespread adoption, and taken a huge
0 replies · 0 reposts · 5 likes
Every time I see an AI system show some new capability that wows me, the main driving force is this. This isn't to say algorithmic improvements don't matter or models don't generalize, but w/o high quality data in the domain you care about, nothing will magically "emerge".
0 replies · 0 reposts · 4 likes
Thinking (test-time compute) in pixel space... 🍌 Pro tip: always peek at the thoughts if you use AI Studio. Watching the model think in pictures is really fun!
21 replies · 81 reposts · 698 likes
Gemini 3 Pro is a pretty cool model. The general advances in multimodal reasoning and coding are great but the most interesting part might actually be the science & medicine advances it enables. We'll share more soon :) Congrats to everyone at @GoogleDeepMind @GoogleAI!
9 replies · 11 reposts · 290 likes
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
219 replies · 1K reposts · 7K likes
João is great! If you're interested in the future of video models reach out to him!
I'm looking for a student researcher to work with me at Google DeepMind in London, preferably starting early next year -- topics will be around novel video model architectures / learning from a single video stream / representation learning.
1 reply · 0 reposts · 7 likes
Was just thinking about this today in a totally random context: I want an AI to help dial in new espresso beans, but after giving it all the data I could think of, I realized there is far more information in the *taste* of the shot than I could ever fully specify via language
0 replies · 0 reposts · 2 likes
Unfortunately I can't join in person but if you're at ICCV check out our Perception Test workshop in Ballroom B at 9AM! We have a great lineup of speakers - see Shiry's thread for more details! @ICCVConference
Join us TODAY for the 3rd Perception Test Challenge https://t.co/DVHQFjkyuA
@ICCV2025! Ballroom B, full day. Amazing lineup of speakers: @farhadi, @AlisonGopnik, Philipp Krähenbühl, @phillip_isola
0 replies · 0 reposts · 2 likes
Thanks @ducha_aiki for the repost of our work! I'll post a standalone thread myself on more of the fun details soon!
LayerLock: Non-collapsing Representation Learning with Progressive Freezing @goker_erdogan @nikparth1 Catalin Ionescu @drewAhudson @AlexLerchner Andrew Zisserman, Mehdi Sajjadi @joaocarreira tl;dr: if 1st layer already converged -- freeze it. https://t.co/MBX7AT7UdJ
0 replies · 0 reposts · 4 likes
During Thanksgiving break last year, our AI co-scientist team @GoogleDeepMind @GoogleResearch - @Mysiak @alan_karthi met Prof @jrpenades @CostaT_Lab of @ImperialCollege. They were nearing a breakthrough on how bacteria share resistance genes and proposed a test for our AI
16 replies · 104 reposts · 659 likes
The Perception Test challenge at next month's #ICCV2025 now has an interpretability track! Read @tyleryzhu's thread for more details and how to submit!
The interpretability track of the 3rd Perception Test Challenge #ICCV2025 is now live! We're looking for new, exciting techniques for understanding how perception models make predictions on video tasks, from mechanistic methods to black-box analyses and visuals. DDL: Oct 6th
0 replies · 0 reposts · 2 likes