Laura Leal-Taixe
@lealtaixe
Followers: 12K
Following: 2K
Media: 69
Statuses: 519
Senior Research Manager at @NVIDIA. Prev Professor at @TU_Muenchen. Computer Vision mostly. Views are my own.
Joined June 2016
Laura Leal-Taixé (@lealtaixe) is a Senior Research Manager at @NVIDIAAI. Interview in Catalan with @neurofregides at #DLBCN 2024, hosted at @LaSalleBCN.
3
3
17
[7/N] ViPE code is released at: https://t.co/l5cXq7C8Fo! Huge thanks to the incredible team behind this @QunjieZhou @HesamRabeti Aleksandr @HuanLing6 @xuanchi13 @TianchangS @JunGao33210520 Dmitry @chenhsuanlin @jiawei6_ren Kevin @Joydeepb_robots @lealtaixe @FidlerSanja
github.com
ViPE: Video Pose Engine for Geometric 3D Perception - nv-tlabs/vipe
4
3
34
[1/N] 🎥 We've made available a powerful spatial AI tool named ViPE: Video Pose Engine, to recover camera motion, intrinsics, and dense metric depth from casual videos! Running at 3–5 FPS, ViPE handles cinematic shots, dashcams, and even 360° panoramas. 🔗 https://t.co/1mGDxwgYJt
13
105
452
Can we learn to complete anything in Lidar without any manual supervision? Excited to share our #ICML2025 paper “Towards Learning to Complete Anything in Lidar” from my time at @nvidia with @CristianoSalto @NeeharPeri @meinhardt_tim @RdeLutio @AljosaOsep @lealtaixe! Thread🧵👇
1
12
60
Curious about 3D Gaussians, simulation, rendering and the latest from #NVIDIA? Come to the NVIDIA Kaolin Library live-coding session at #CVPR2025, powered by a cloud GPU reserved especially for you. Wed, Jun 11, 8-noon. Bring your laptop! https://t.co/joCH5DDrNk
1
20
46
Excited to share what we've been working on! SeNaTra introduces a backbone where segmentation emerges natively by replacing standard downsampling with grouping layers. Opens the door for a new family of zero-shot segmentation-centric backbone architectures! 🚀 Code coming soon!
The time for new architectures is over? Not quite! SeNaTra, a native segmentation backbone, is waiting, let's see how it works 🧵 https://t.co/2I9nuLBsSz
2
15
98
Turns out that if you learn to downsample (rather than using uniform grid pooling) in Vision Transformers, you no longer need dedicated upsampling layers and segmentation heads—dense image segmentation emerges natively.
arxiv.org
Uniform downsampling remains the de facto standard for reducing spatial resolution in vision backbones. In this work, we propose an alternative design built around a content-aware spatial grouping...
0
9
77
Coolest results on zero-shot, text-supervised semantic segmentation as well as a new kid in town for supervised semantic segmentation: the native segmentation network, the first encoder-only model capable of competing with Mask2Former and the other big ones.
1
0
9
The coolest thing is that segmentation emerges even from ImageNet pre-training!
1
1
11
The secret sauce is a learned spatial grouping layer, which computes soft token assignments. The cool thing is that this enables principled feature upsampling, from masks to pixels in the encoder itself! No more heavy decoders needed!
1
1
14
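A minimal sketch of the idea in the tweet above (soft token assignments that serve both for grouping and for upsampling back to token resolution). This is not the SeNaTra implementation: the dot-product similarity, the fixed `centers`, and every function name here are assumptions made purely for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_assign(tokens, centers):
    # A[i][k]: soft weight of token i for group k; each row sums to 1.
    # Dot-product similarity is an assumption of this sketch.
    return [softmax([dot(t, c) for c in centers]) for t in tokens]

def group_downsample(tokens, assign):
    # Pool N tokens into K group features as assignment-weighted means.
    n_groups = len(assign[0])
    dim = len(tokens[0])
    pooled = []
    for k in range(n_groups):
        weight = sum(row[k] for row in assign)
        pooled.append([
            sum(assign[i][k] * tokens[i][d] for i in range(len(tokens))) / weight
            for d in range(dim)
        ])
    return pooled

def upsample(group_values, assign):
    # Map per-group values back to tokens: out[i] = sum_k A[i][k] * g[k].
    dim = len(group_values[0])
    return [
        [sum(row[k] * group_values[k][d] for k in range(len(row))) for d in range(dim)]
        for row in assign
    ]

# Toy input: four 2-D tokens and two hypothetical group centers.
tokens = [[4.0, 0.0], [3.0, 0.0], [0.0, 4.0], [0.0, 3.0]]
centers = [[1.0, 0.0], [0.0, 1.0]]

assign = soft_assign(tokens, centers)
pooled = group_downsample(tokens, assign)

# Pushing one-hot group labels back through the same assignments
# yields a soft mask per token, with no separate decoder involved.
masks = upsample([[1.0, 0.0], [0.0, 1.0]], assign)
labels = [max(range(len(m)), key=m.__getitem__) for m in masks]
```

The point of the sketch is that the same matrix `assign` is used in both directions: once to reduce resolution, and once to carry group-level predictions back to the original tokens.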
Our work shows that segmentation can be inherently encoded in a model’s internal representations rather than delegated to specialized decoder modules, opening new directions in segmentation-centric backbone architectures.
1
2
9
The time for new architectures is over? Not quite! SeNaTra, a native segmentation backbone, is waiting, let's see how it works 🧵 https://t.co/2I9nuLBsSz
3
41
207
A Guide to Structureless Visual Localization Vojtech Panek, Qunjie Zhou, Yaqing Ding, Sérgio Agostinho, @ZKukelova, @SattlerTorsten, @lealtaixe tl;dr: structureless localization review https://t.co/6KrUYu1iBk
1
15
57
Thanks @_akhaliq for sharing! During my internship at @NVIDIAAI, we explored zero-shot panoptic completion of Lidar scans — together with @CristianoSalto @NeeharPeri @meinhardt_tim @RdeLutio @lealtaixe @AljosaOsep!
2
12
72
Spatial AI is increasingly important, and the newest papers from #NVIDIAResearch, 3DGRT and 3DGUT, represent significant advancements in enabling researchers and developers to explore and innovate with 3D Gaussian Splatting techniques. 💎 3DGRT (Gaussian Ray Tracing) ➡️
2
49
224
MATCHA: Towards Matching Anything @FeiXue94, @s_elflein, @lealtaixe, @QunjieZhou tl;dr: diffusion model->semantic+geometric features->transformer-based fusion->enhanced diffusion features->w/ DINOv2->unified feature->geometric/semantic/temporal matching https://t.co/WWjr9QEyjD
1
29
126
If you want to try out our 3DV paper #DynOMo for dynamic, online, monocular reconstruction-based point tracking, you can do so now ☺️💃 @lealtaixe @QunjieZhou @BDuisterhof @RamananDeva
https://t.co/La0rfaEdDj
2
4
28
To appear at 3DV! Congrats to the team, especially @JennySeidensch1!
Ever wondered how point tracks generated from dynamic, online, monocular reconstruction look in action? Enjoy the sneak peek of #DynOMo on TAPVID-Davis, PanopticSports and the iPhone dataset! More visuals soon 🚀 @lealtaixe @QunjieZhou @BDuisterhof @RamananDeva
0
2
59