Carl Doersch
@CarlDoersch
2K Followers · 123 Following · 19 Media · 62 Statuses
We present a new SOTA on point tracking, via self-supervised training on real, unlabeled videos! BootsTAPIR achieves 67.4% AJ on TAP-Vid DAVIS with minimal architecture changes, and tracks 10K points on a 50-frame video in 6 seconds. PyTorch & JAX implementations on GitHub.
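AJ is the Average Jaccard metric from the TAP-Vid benchmark: it jointly scores position accuracy and occlusion prediction, averaged over pixel thresholds. A simplified sketch of the computation (the `average_jaccard` function below is illustrative and omits per-video averaging and query-sampling details of the official evaluation):

```python
import numpy as np

def average_jaccard(pred_xy, pred_vis, gt_xy, gt_vis,
                    thresholds=(1, 2, 4, 8, 16)):
    """Simplified Average Jaccard (AJ) for point tracking.

    pred_xy, gt_xy: (num_points, num_frames, 2) pixel coordinates.
    pred_vis, gt_vis: (num_points, num_frames) boolean visibility.
    """
    dist = np.linalg.norm(pred_xy - gt_xy, axis=-1)
    jaccards = []
    for d in thresholds:
        within = dist <= d
        # True positive: predicted visible, actually visible, within d px.
        tp = np.sum(gt_vis & pred_vis & within)
        # False positive: predicted visible but gt occluded or too far.
        fp = np.sum(pred_vis & (~gt_vis | ~within))
        # False negative: gt visible but predicted occluded or too far.
        fn = np.sum(gt_vis & (~pred_vis | ~within))
        jaccards.append(tp / (tp + fp + fn))
    return float(np.mean(jaccards))
```

A perfect tracker scores 1.0; a point predicted visible but far from a visible ground-truth point is penalized as both a false positive and a false negative, as in the TAP-Vid definition.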
RT @KelseyRAllen: Humans can tell the difference between a realistic generated video and an unrealistic one – can models? Excited to share…
Joint work with @artemZholus, @yangyi02, @skandakoppula, Viorica Patraucean, Xu Owen He, Ignacio Rocco, Mehdi Sajjadi, @apsarathchandar, @RGoroshin.
RT @dangengdg: What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompti…
Want a robot to solve a task, specified in language? Generate a video of a person doing it, and then retarget the action to the robot with the help of point tracking! Cool collab with @mangahomanga during his student researcher stint at Google.
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! 1/n
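The quoted thread describes a two-stage decomposition: first synthesize a human demonstration video, then run a closed-loop policy conditioned on it. A minimal structural sketch of that flow (every function below is a hypothetical placeholder for illustration, not the actual Gen2Act API):

```python
import numpy as np

def generate_human_video(task_description, first_frame, num_frames=16):
    """Stage 1 (placeholder): a video model would synthesize a human
    performing the described task, conditioned on the current scene."""
    return np.repeat(first_frame[None], num_frames, axis=0)

def track_points(video, query_points):
    """Placeholder point tracker: returns per-frame positions of the
    query points (a real tracker such as TAPIR follows scene motion)."""
    return np.repeat(query_points[None], video.shape[0], axis=0)

def policy_step(observation, generated_video, tracks):
    """Stage 2 (placeholder): the closed-loop policy conditions on the
    generated video (and point tracks) to emit one robot action."""
    return np.zeros(7)  # e.g. a 7-DoF arm command

scene = np.zeros((64, 64, 3))                       # current camera frame
video = generate_human_video("open the drawer", scene)
tracks = track_points(video, np.array([[32.0, 32.0]]))
action = policy_step(scene, video, tracks)
```

The design point is that the video model carries the task knowledge (including tasks unseen in robot data), while the policy only has to imitate the generated motion.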
RT @skandakoppula: We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real world videos and 2.1 million metric 3D point…
RT @dimadamen: Can you win the 2nd Perception Test Challenge? @eccvconf workshop: Diagnose audio-visual MLMs on ability…
Joint work with @paulineluc_, @yangyi02, @dilaragoekay, @skandakoppula, @ankshgpta, Joe Heyward, Ignacio Rocco, @RGoroshin, @joaocarreira, Andrew Zisserman. Video credit to GDM’s robot soccer project:
Joint work with @yangyi02, Mel Vecerik, @joaocarreira @tdavchev, @JonathanScholz2, Andrew Zisserman, @yusufaytar, Stannis Zhou, @dilaragoekay, Ankush Gupta, @LourdesAgapito, @RaiaHadsell.
Introducing TAPIR & RoboTAP, our latest research from @GoogleDeepMind. It focuses on spatial intelligence via point tracking, outlining how it enables applications from robotics to video generation to augmented reality, and more!
RT @dimadamen: 📢 Perception Test @ICCVConference now w/ Test Set. We invite submissions to 1st Perception Test – winners announced #ICCV2023…
github.com/google-deepmind/perception_test