Laura Leal-Taixe @lealtaixe X Profile

Laura Leal-Taixe

@lealtaixe

Followers: 12K

Following: 2K

Media: 69

Statuses: 519

Senior Research Manager at @NVIDIA. Prev Professor at @TU_Muenchen. Computer Vision mostly. Views are my own.

https://t.co/NvBDQuarPl

Joined June 2016

Deep Learning Barcelona Symposium

@dlbcnai

3 months

Laura Leal-Taixé (@lealtaixe) is a Senior Research Manager at @NVIDIAAI. Interview in Catalan with @neurofregides at #DLBCN 2024, hosted at @LaSalleBCN.

3

17

Jiahui Huang

@huangjh_hjh

3 months

[7/N] ViPE code is released at: https://t.co/l5cXq7C8Fo! Huge thanks to the incredible team behind this @QunjieZhou @HesamRabeti Aleksandr @HuanLing6 @xuanchi13 @TianchangS @JunGao33210520 Dmitry @chenhsuanlin @jiawei6_ren Kevin @Joydeepb_robots @lealtaixe @FidlerSanja

github.com

ViPE: Video Pose Engine for Geometric 3D Perception - nv-tlabs/vipe

4

3

34

Jiahui Huang

@huangjh_hjh

3 months

[1/N] 🎥 We've made available a powerful spatial AI tool named ViPE: Video Pose Engine, to recover camera motion, intrinsics, and dense metric depth from casual videos! Running at 3–5 FPS, ViPE handles cinematic shots, dashcams, and even 360° panoramas. 🔗 https://t.co/1mGDxwgYJt

13

105

452

Ayça Takmaz

@aycatakmaz

4 months

Can we learn to complete anything in Lidar without any manual supervision? Excited to share our #ICML2025 paper “Towards Learning to Complete Anything in Lidar” from my time at @nvidia with @CristianoSalto @NeeharPeri @meinhardt_tim @RdeLutio @AljosaOsep @lealtaixe! Thread🧵👇

1

12

60

Masha Shugrina

@_shumash

5 months

Curious about 3D Gaussians, simulation, rendering and the latest from #NVIDIA? Come to the NVIDIA Kaolin Library live-coding session at #CVPR2025, powered by a cloud GPU reserved especially for you. Wed, Jun 11, 8-noon. Bring your laptop! https://t.co/joCH5DDrNk

1

20

46

Guillem Brasó

@GuillemBraso

6 months

Excited to share what we've been working on! SeNaTra introduces a backbone where segmentation emerges natively by replacing standard downsampling with grouping layers. Opens the door for a new family of zero-shot segmentation-centric backbone architectures! 🚀 Code coming soon!

Laura Leal-Taixe

@lealtaixe

6 months

The time for new architectures is over? Not quite! SeNaTra, a native segmentation backbone, is waiting, let's see how it works 🧵 https://t.co/2I9nuLBsSz

2

15

98

Aljosa

@AljosaOsep

6 months

Turns out that if you learn to downsample (rather than using uniform grid pooling) in Vision Transformers, you no longer need dedicated upsampling layers and segmentation heads—dense image segmentation emerges natively.

arxiv.org

Uniform downsampling remains the de facto standard for reducing spatial resolution in vision backbones. In this work, we propose an alternative design built around a content-aware spatial grouping...

Laura Leal-Taixe

@lealtaixe

6 months

The time for new architectures is over? Not quite! SeNaTra, a native segmentation backbone, is waiting, let's see how it works 🧵 https://t.co/2I9nuLBsSz

0

9

77

Laura Leal-Taixe

@lealtaixe

6 months

Fantastic work with the talented @GuillemBraso and @AljosaOsep. @NVIDIAAI #NVIDIAResearch

0

8

Laura Leal-Taixe

@lealtaixe

6 months

Coolest results on zero-shot, text-supervised semantic segmentation, as well as a new kid in town for supervised semantic segmentation: the native segmentation network, the first encoder-only model capable of competing with Mask2Former and the other big ones.

1

0

9

Laura Leal-Taixe

@lealtaixe

6 months

The coolest thing is that segmentation emerges even from ImageNet pre-training!

1

11

Laura Leal-Taixe

@lealtaixe

6 months

The secret sauce is a learned spatial grouping layer, which computes soft token assignments. The cool thing is that this enables principled feature upsampling, from masks to pixels in the encoder itself! No more heavy decoders needed!

1

14
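To make the grouping idea in the post above concrete, here is a minimal sketch of soft token assignment, grouping, and upsampling. It is not the SeNaTra implementation: the dot-product similarity, the fixed `centers`, and the function names `soft_assign`, `group`, and `upsample` are all assumptions chosen for illustration. The thread only states that a learned grouping layer computes soft assignments that can be reused for upsampling.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def soft_assign(tokens, centers):
    # Row i is the soft assignment of token i over all groups.
    # Assumption: similarity is a plain dot product with each center.
    scores = [[sum(t * c for t, c in zip(tok, cen)) for cen in centers]
              for tok in tokens]
    return [softmax(row) for row in scores]

def group(tokens, assign):
    # Downsample: each group feature is the assignment-weighted mean of tokens.
    dim = len(tokens[0])
    grouped = []
    for g in range(len(assign[0])):
        weight = sum(a[g] for a in assign)
        grouped.append([sum(a[g] * tok[d] for a, tok in zip(assign, tokens)) / weight
                        for d in range(dim)])
    return grouped

def upsample(group_feats, assign):
    # Upsample: reuse the same assignments to map group features back to tokens.
    dim = len(group_feats[0])
    return [[sum(a[g] * group_feats[g][d] for g in range(len(group_feats)))
             for d in range(dim)]
            for a in assign]

# Four toy tokens forming two obvious clusters.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Hypothetical group centers; in a real model these would be learned.
centers = [[4.0, 0.0], [0.0, 4.0]]

assign = soft_assign(tokens, centers)   # 4 tokens x 2 groups
grouped = group(tokens, assign)         # 2 group features
restored = upsample(grouped, assign)    # back to 4 token features
```

In this toy setup, taking the most likely group per token already gives a segmentation of the four tokens, which is the intuition behind segmentation emerging without a separate decoder.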

Laura Leal-Taixe

@lealtaixe

6 months

Our work shows that segmentation can be inherently encoded in a model’s internal representations rather than delegated to specialized decoder modules, opening new directions in segmentation-centric backbone architectures.

1

2

9

Laura Leal-Taixe

@lealtaixe

6 months

The time for new architectures is over? Not quite! SeNaTra, a native segmentation backbone, is waiting, let's see how it works 🧵 https://t.co/2I9nuLBsSz

arxiv.org

Uniform downsampling remains the de facto standard for reducing spatial resolution in vision backbones. In this work, we propose an alternative design built around a content-aware spatial grouping...

3

41

207

Zhenjun Zhao

@zhenjun_zhao

7 months

A Guide to Structureless Visual Localization. Vojtech Panek, Qunjie Zhou, Yaqing Ding, Sérgio Agostinho, @ZKukelova, @SattlerTorsten, @lealtaixe. tl;dr: structureless localization review. https://t.co/6KrUYu1iBk

1

15

57

AK

@_akhaliq

7 months

Nvidia just announced Towards Learning to Complete Anything in Lidar

9

55

413

Ayça Takmaz

@aycatakmaz

7 months

Thanks @_akhaliq for sharing! During my internship at @NVIDIAAI, we explored zero-shot panoptic completion of Lidar scans — together with @CristianoSalto @NeeharPeri @meinhardt_tim @RdeLutio @lealtaixe @AljosaOsep!

AK

@_akhaliq

7 months

Nvidia just announced Towards Learning to Complete Anything in Lidar

2

12

72

NVIDIA AI Developer

@NVIDIAAIDev

8 months

Spatial AI is increasingly important, and the newest papers from #NVIDIAResearch, 3DGRT and 3DGUT, represent significant advancements in enabling researchers and developers to explore and innovate with 3D Gaussian Splatting techniques. 💎 3DGRT (Gaussian Ray Tracing) ➡️

2

49

224

Zhenjun Zhao

@zhenjun_zhao

10 months

MATCHA: Towards Matching Anything. @FeiXue94, @s_elflein, @lealtaixe, @QunjieZhou. tl;dr: diffusion model -> semantic + geometric features -> transformer-based fusion -> enhanced diffusion features -> w/ DINOv2 -> unified feature -> geometric/semantic/temporal matching. https://t.co/WWjr9QEyjD

1

29

126

JennySeidenschwarz

@JennySeidensch1

10 months

If you want to try out our 3DV paper #DynOMo for dynamic, online, monocular reconstruction-based point tracking, you can do so now ☺️💃 @lealtaixe @QunjieZhou @BDuisterhof @RamananDeva https://t.co/La0rfaEdDj

2

4

28

Laura Leal-Taixe

@lealtaixe

1 year

To appear at 3DV! Congrats to the team, especially @JennySeidensch1!

JennySeidenschwarz

@JennySeidensch1

1 year

Ever wondered how point tracks generated from dynamic, online, monocular reconstruction look in action? Enjoy the sneak peek of #DynOMo on TAPVID-Davis, PanopticSports and the iPhone dataset! More visuals soon 🚀 @lealtaixe @QunjieZhou @BDuisterhof @RamananDeva

0

2

59