
Anh-Quan Cao
@AnhQuanCAO
Followers: 413
Following: 3K
Media: 2
Statuses: 921
PhD Student in Computer Vision @inria. Previously @amazon, @TU_Muenchen, @Polytechnique and @UnivParisSaclay
Inria, Paris, France
Joined September 2020
🚀 Introducing InstantSfM: Fully Sparse and Parallel Structure-from-Motion. ✅ Python + GPU-optimized implementation, no C++ anymore! ✅ 40× faster than COLMAP with 5K images on a single GPU! ✅ Scales beyond 100 images (more than VGGT/VGGSfM can consume)! ✅ Supports metric scale.
5
39
305
🔷 Introducing Rig3R - our new geometric foundation model for 3D perception in AVs. https://t.co/S7kkAK0OBP
3
84
530
The 3rd is the charm: the PhD defense of Ivan Lopes at Inria! His research is on Materials, Geometry & Semantics Estimation for scene understanding and editing. Papers: Material Palette, Material Transform and MatSwap (Materials) & StableMTL and DenseMTL (Multi-task learning).
1
2
10
Another great event for @valeoai: a PhD defense of Corentin Sautier. His thesis «Learning Actionable LiDAR Representations w/o Annotations» covers the papers BEVContrast (learning self-sup LiDAR features), SLidR, ScaLR (distillation), UNIT and Alpine (solving tasks w/o labels).
1
3
16
It’s PhD graduation season in the team! Today, @Bjoern_Michele is defending his PhD on "Domain Adaptation for 3D Data" Best of luck! 🚀
1
5
20
Congratulations to our lab colleagues who have been named Outstanding Reviewers at #ICCV2025 👏 Andrei Bursuc @abursuc Anh-Quan Cao @AnhQuanCAO Renaud Marlet @RenaudMarlet Eloi Zablocki @EloiZablocki
@ICCVConference 🔗
0
3
12
Discovered that our RangeViT paper keeps being cited in what might be LLM-generated papers. Number of citations increased rapidly in the last weeks. Too good to be true. Papers popped up on different platforms, but mainly on ResearchGate with ~80 papers in just 3 weeks. [1/]
1
2
6
Very honored to receive Best Paper Award for TokenVerse at @Siggraph today! @DanielGaribi will be presenting it tomorrow morning, go check it out! https://t.co/Q50rm7JkjR
22
24
308
NeurIPS rebuttal deadline is around the corner 😬 I’m not an expert, but thought I’d drop my two cents on how to write a good rebuttal, especially for folks writing their first few. Hope this helps someone! 🧵👇 (And please chime in with your own tips; let’s crowdsource the
8
51
490
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research🧵
13
57
274
Video recordings from our workshop on Embodied Intelligence and tutorial on Robotics 101 @CVPR are now up, just in time to catch up with things over the summer. Enjoy! #CVPR2025
📹Our #CVPR2025 workshop and tutorial recordings are now online! Big thanks to our incredible speakers! Watch all the sessions here 🔗 Workshop: https://t.co/xLbnLvOVYM 🔗 Tutorial: https://t.co/17QDuODLz4 🏟️But we’re not done yet - our workshop continues at #ICCV2025! And the
0
7
22
🎉 Excited to share that our paper "FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation" got accepted at #ICCV2025! A collaborative effort with Mohammad Fahes, @tuan_hung_vu, @abursuc and Raoul de Charette.
1
3
12
Excited to share FLOSS, a training-free, label-free, plug-and-play method to improve open-vocabulary semantic segmentation. See the thread by @yasserbenigmim for more details. #FLOSS #OVSS #ICCV25
0
3
11
Check out our #ICCV2025 work on functional 3D scan editing, learning to optimize, multi-level 3D captioning, interactive mesh editing, audio-driven avatars, & shape matching! Congrats @ElBoudjogh24002, @liuyuehcheng, @chandan__yes, @hcxrli, @shivangi2201, Emery for amazing work!
2
26
122
1/ New & old work on self-supervised representation learning (SSL) with ViTs: MOCA ☕ - Predicting Masked Online Codebook Assignments w/ @SpyrosGidaris @oriane_simeoni @AVobecky @quobbe N. Komodakis, P. Pérez #TMLR #ICLR2025 Grab a ☕ and brace for a story & a 🧵
1
14
48
A surprising & little-known result in classical statistics: Mean (μ) and median (m) are within one std deviation: |μ−m| ≤ σ. For unimodal densities, the bound is even tighter: |μ−m| ≤ 0.7746 σ. This beautiful result first appeared in a 1932 paper by Hotelling & Solomons. 1/3
27
196
2K
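The bound quoted in the post above can be checked numerically. What follows is a minimal sketch, not taken from the thread: the helper name `mean_median_gap` and the sample values are made up for illustration, and the standard deviation is the population one (dividing by n), which is the form for which the inequality holds on a finite sample. The usual argument is that the median minimizes the mean absolute deviation, so |μ−m| ≤ E|X−m| ≤ E|X−μ| ≤ σ. The constant 0.7746 agrees with √(3/5) to four decimals.

```python
import math
import statistics


def mean_median_gap(samples):
    """Return (|mean - median|, population standard deviation) for a sample."""
    mu = statistics.fmean(samples)
    med = statistics.median(samples)
    sigma = statistics.pstdev(samples)  # population std: divides by n
    return abs(mu - med), sigma


# Hypothetical right-skewed sample, chosen only for illustration.
skewed = [1, 1, 2, 2, 3, 5, 8, 13, 40]
gap, sigma = mean_median_gap(skewed)

# The general bound |mu - m| <= sigma holds for this sample.
print(gap <= sigma)

# The unimodal constant from the post, compared against sqrt(3/5).
print(round(math.sqrt(3 / 5), 4))
```

The tighter constant applies only to unimodal densities, so the sketch checks just the general bound on the sample.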
New paper out - accepted at @ICCVConference We introduce MoSiC, a self-supervised learning framework that learns temporally consistent representations from video using motion cues. Key idea: leverage long-range point tracks to enforce dense feature coherence across time.🧵
2
24
129
1/n 🚀New paper out - accepted at @ICCVConference! Introducing DIP: unsupervised post-training that enhances dense features in pretrained ViTs for dense in-context scene understanding Below: Low-shot in-context semantic segmentation examples. DIP features outperform DINOv2!
2
26
120
I am at #CVPR2025 this week in Nashville! Presenting "Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers" on multi-modal semantic future prediction. Come discuss! Fri 13 Jun 10:30-12:30, poster #345
https://t.co/VdTXnfZopo
🧵 Excited to share our latest work: FUTURIST - A unified transformer architecture for multimodal semantic future prediction, is accepted to #CVPR2025 ! Here's how it works (1/n) 👇 Links to the arxiv and github below
0
5
11