Nikita Karaev
@n_karaev
Followers: 3K · Following: 659 · Media: 14 · Statuses: 104
Building general purpose robotics at Amazon FAR. @a16z Speedrun Scout. Previously founder @ https://t.co/wmWzrnsbec / @AIatMeta / @Oxford_VGG PhD. https://t.co/U668erUjkd
London, England
Joined October 2013
🚨Big news - our team is joining the Frontier AI and Robotics (FAR) lab at @amazon to keep building the future of robotics. Excited to be joining such an inspiring team and to work with @peterxichen, @pabbeel, @rocky_duan, @akanazawa and many others at FAR.
19 replies · 24 reposts · 504 likes
Wow, I didn't expect to receive so many DMs (400+); I'll go through them one by one. One pattern I've noticed so far, which wasn't obvious to me a year ago: most people jump straight into pitching their idea without saying a word about the team. The team is actually *the* most
21 replies · 2 reposts · 150 likes
I see some questions about the check size. A $10k check is indeed quite symbolic; Speedrun itself invests much more. The main value lies in the connections, the experience, and the support when applying to Speedrun, not in the money itself.
7 replies · 0 reposts · 39 likes
Personal update - I'm now officially an a16z @speedrun scout. This means I can write you a $10k check within 24 hours. If you're an early-stage founder, or thinking about starting a company and have done something cool before, my DMs are open! I'm mostly into AI/robotics, but
177 replies · 72 reposts · 2K likes
Thanks to AK for sharing our work! 🧩 Code: https://t.co/yT1wbOtVT9 🌐 Project Page: https://t.co/B8widtJ6DT 📄 The final version of our paper is coming in a few days — stay tuned!
github.com
[ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy - henry123-boy/SpaTrackerV2
3 replies · 18 reposts · 103 likes
🚀 We release SpatialTrackerV2: the first feedforward model for dynamic 3D reconstruction and 3D point tracking — all at once! Reconstruct dynamic scenes and predict pixel-wise 3D motion in seconds. 🔗 Webpage: https://t.co/B8widtJ6DT 🔍 Online Demo: https://t.co/sY9iO7wCgT
5 replies · 90 reposts · 465 likes
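For readers unfamiliar with the term: "feedforward" here means a single forward pass takes a video clip straight to depth, camera poses, and pixel-wise 3D trajectories, with no per-scene optimization loop. A shape-contract sketch of that idea (the function below is a hypothetical stand-in, not the real SpaTrackerV2 API; see the webpage and repo linked in this thread for the actual interface):

```python
import torch

def spatial_tracker_v2_stub(video: torch.Tensor):
    """Shape contract only -- a hypothetical stand-in, NOT the real SpaTrackerV2 API.

    video: (B, T, 3, H, W) clip in [0, 1]. A feedforward model maps it,
    in one pass and with no per-scene optimization, to the outputs below.
    """
    B, T, _, H, W = video.shape
    depth = video.new_empty(B, T, H, W)              # per-frame depth map
    poses = torch.eye(4).expand(B, T, 4, 4).clone()  # per-frame camera-to-world pose
    tracks = video.new_empty(B, T, H * W, 3)         # pixel-wise 3D motion in world space
    return depth, poses, tracks

depth, poses, tracks = spatial_tracker_v2_stub(torch.rand(1, 16, 3, 48, 64))
print(depth.shape, poses.shape, tracks.shape)
```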
Really enjoyed working with @YuxiXiaohenry, @jianyuan_wang, @NanXue7, @lvoursl, @bingyikang, Xin Zhu, Hujun Bao, Yujun Shen and @XiaoweiZhou5. Amazing work, guys!
0 replies · 0 reposts · 6 likes
With SpatialTrackerV2, you can reconstruct a dynamic scene and predict pixel-wise 3D motion in seconds. You can even try it out in the 🤗 demo on your phone by clicking on pre-selected examples or uploading your own video!
1 reply · 1 repost · 9 likes
⚡️Today we’re releasing SpatialTrackerV2, the first feedforward model for dynamic 3D reconstruction and point tracking in world space. Project page: https://t.co/5usIYejRhH 🤗 demo: https://t.co/qCP2ojIHOb
2 replies · 22 reposts · 140 likes
Many congratulations to @jianyuan_wang, @MinghaoChen23, @n_karaev, Andrea Vedaldi, Christian Rupprecht and @davnov134 for winning the Best Paper Award @CVPR for "VGGT: Visual Geometry Grounded Transformer" 🥇🎉 🙌 #CVPR2025!
17 replies · 70 reposts · 492 likes
I love how people are finding new ways to apply CoTracker in robotics
Reachy pick and place with base movement! Reachy can now pick objects and place them with both its arms and its wheels using dora-rs! This was done by adding @metaai CoTracker to dora, thanks to @ShashwatPatil! If you want to know more about Pollen Robotics, dora-rs or
0 replies · 0 reposts · 6 likes
🔥 A new state-of-the-art point tracker from @GoogleDeepMind!
We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos by formulating the task as Next Token Prediction. For more, see: https://t.co/HaUzOuP1kH 🧵
0 replies · 0 reposts · 7 likes
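The "Next Token Prediction" framing is concrete enough to sketch: quantize a point's (x, y) location in each frame into a grid-cell token and train a causal transformer to predict the next frame's token, exactly like language modeling. A toy illustration of the formulation only (this is not TAPNext's architecture; a real tracker also conditions on video features, which are omitted here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

GRID = 32
VOCAB = GRID * GRID  # 32x32 grid -> 1024 position tokens

def xy_to_token(xy):
    # xy: (T, 2) positions in [0, 1) -> (T,) grid-cell tokens
    cell = (xy * GRID).long().clamp(0, GRID - 1)
    return cell[:, 1] * GRID + cell[:, 0]

class ToyTracker(nn.Module):
    def __init__(self, d=128, layers=2):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.pos = nn.Embedding(512, d)
        block = nn.TransformerEncoderLayer(d, nhead=4, dim_feedforward=4 * d,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d, VOCAB)

    def forward(self, tokens):  # tokens: (B, T) position tokens
        T = tokens.shape[1]
        h = self.emb(tokens) + self.pos(torch.arange(T, device=tokens.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        return self.head(self.backbone(h, mask=mask))  # (B, T, VOCAB) logits

# Train to predict the point's position in frame t+1 from frames <= t.
traj = torch.rand(16, 2).cumsum(0) % 1.0   # fake 16-frame trajectory in [0, 1)
tokens = xy_to_token(traj)[None]           # (1, 16)
logits = ToyTracker()(tokens[:, :-1])
loss = F.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
```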
This new 3D transformer VGGT is epic! (CVPR'25) 🔵 I just ran it on 28 coral reef pictures of mine (from a GoPro) and in less than 15s had a 3D point cloud + camera positions ready. 🔵 Everything is end-to-end DL, no traditional SfM. Could instantly see all the
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for: ✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense
3 replies · 22 reposts · 144 likes
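That workflow is only a few lines of code. A minimal sketch, assuming the entry points the vggt repo's README documents at the time of writing (`VGGT.from_pretrained("facebook/VGGT-1B")` and `load_and_preprocess_images`); check the repo for the current interface:

```python
import glob
import torch
from vggt.models.vggt import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"
model = VGGT.from_pretrained("facebook/VGGT-1B").to(device).eval()

# Any unordered set of stills of one scene -- no poses, no SfM preprocessing.
images = load_and_preprocess_images(sorted(glob.glob("reef/*.jpg"))).to(device)

with torch.no_grad():
    predictions = model(images)  # one feedforward pass, seconds on a GPU

# Per the README, a single pass yields cameras, depth and dense point maps
# for every frame, returned as a dict of tensors.
print(list(predictions.keys()))
```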
VGGT: Visual Geometry Grounded Transformer @jianyuan_wang @MinghaoChen23 @n_karaev Andrea Vedaldi, Christian Rupprecht @davnov134 tl;dr: MASt3R meets CoTracker, with overcomplete predictions (depth AND point map, etc.). Training: 64 A100s × 9 days. SOTA on IMC-2020. https://t.co/ub3rufJHxz
1 reply · 10 reposts · 89 likes
Check out our latest work, VGGT! ⚡️ It’s a fast transformer that predicts cameras, point maps, depth and 3D point tracks. VGGT + CoTracker significantly outperforms CoTracker in 2D point tracking.
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for: ✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense
1 reply · 10 reposts · 60 likes
We believe this approach can scale even further and enable the use of internet videos of human actions to train a foundation policy model. It is more general and scales much better than teleoperation. If you’re excited about this research, let’s chat!
1 reply · 1 repost · 2 likes
Microsoft's @jw2yang4ai just released a VLM, Magma, which presents a scalable way to train policy models on real data: they annotated 10M videos using CoTracker, a point-tracking model we developed a while back at Meta AI. Magma: https://t.co/PsnfcXQqys
1 reply · 12 reposts · 87 likes
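The pseudo-labeling recipe behind that pipeline is easy to sketch. A minimal version, assuming the torch.hub entry point the co-tracker repo's README documents at the time of writing (`cotracker3_offline`) and its documented outputs; verify against the repo:

```python
import torch
from torchvision.io import read_video

device = "cuda" if torch.cuda.is_available() else "cpu"

# Entry-point name per the facebookresearch/co-tracker README.
cotracker = torch.hub.load("facebookresearch/co-tracker",
                           "cotracker3_offline").to(device)

frames, _, _ = read_video("clip.mp4", output_format="TCHW", pts_unit="sec")
video = frames[None].float().to(device)  # (B, T, C, H, W)

# Track a regular grid of query points; the resulting tracks (B, T, N, 2)
# and per-point visibility can be dumped to disk as pseudo-labels for any
# unlabeled clip -- no teleoperation or manual annotation needed.
pred_tracks, pred_visibility = cotracker(video, grid_size=10)
```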
Join us on January 16 for a talk on Meta's CoTracker3: Simpler and Better Point Tracking by Pseudo-Labeling Real Videos 🚀 @n_karaev will explain how it works and share research insights from CoTracker and CoTracker3. Don’t miss it! https://t.co/76o0nzHtaZ
luma.com
Last year Meta introduced CoTracker — a transformer-based model that jointly tracks points in a video. After the initial release of CoTracker, the model's…
0 replies · 2 reposts · 3 likes