
Max Seitzer
@maxseitzer
599 Followers · 37 Following · 35 Media · 79 Statuses
Researcher in the DINO team at Meta FAIR. Before: PhD at Max Planck Institute for Intelligent Systems, Tübingen. Representation learning, agents, structure.
Joined January 2021
Introducing DINOv3 🦕🦕🦕. A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
4) These outliers are *different* from the high-norm tokens studied by the registers paper or by An et al. in LLMs, and are not removed by registers & attention bias. My guess is that they are specific to the DINO setup of different heads & losses.
arxiv.org
Outliers have been widely observed in Large Language Models (LLMs), significantly impacting model performance and posing challenges for model compression. Understanding the functionality and...
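The high-norm tokens mentioned above can be spotted with a simple per-token norm check. This is a generic sketch of that idea, not code from the DINOv3 paper; the median/MAD threshold rule and the synthetic features are my own assumptions:

```python
import numpy as np

def find_outlier_tokens(features, k=5.0):
    """Flag tokens whose feature norm deviates strongly from the median norm."""
    norms = np.linalg.norm(features, axis=-1)       # (num_tokens,)
    median = np.median(norms)
    mad = np.median(np.abs(norms - median)) + 1e-8  # robust spread estimate
    return np.where(norms > median + k * mad)[0], norms

# Synthetic patch features with one artificially high-norm token.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))
feats[7] *= 20.0
outliers, norms = find_outlier_tokens(feats)
```

In practice `features` would be the patch tokens from a ViT forward pass; the median absolute deviation is used rather than the standard deviation so that the outliers themselves do not inflate the threshold.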
Nice investigation! We did study those outlier dimensions a bit for the 7B model; this is summarized in section A.2 of the paper. Some comments:
DINOv3 has a single high-magnitude channel on its residual pathway, channel 416. Turning off this single channel changes DINO's entire output by 50-80%. For context, turning off a random channel has an effect of less than one percent. The model builds up channel 416 in its last
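The channel-ablation experiment described above can be reproduced in spirit on a toy model: zero out one channel of an intermediate representation and measure the relative change in the output. This is a minimal sketch of the technique, not the authors' code; the toy MLP, its shapes, and the channel index are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer MLP standing in for a transformer's residual stream.
w1 = rng.normal(size=(16, 16)) / 4.0
w2 = rng.normal(size=(16, 16)) / 4.0

def forward(x, ablate_channel=None):
    h = np.tanh(x @ w1)                 # intermediate representation
    if ablate_channel is not None:
        h = h.copy()
        h[..., ablate_channel] = 0.0    # zero out one feature channel
    return h @ w2

x = rng.normal(size=(4, 16))
baseline = forward(x)

def relative_change(channel):
    """Relative output change caused by ablating one channel."""
    return np.linalg.norm(forward(x, channel) - baseline) / np.linalg.norm(baseline)
```

On a real network the same measurement is typically done with a forward hook that zeroes the channel in place; comparing `relative_change` for the suspected outlier channel against random channels is what reveals how disproportionately the model relies on it.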
There are still many more interesting aspects to the paper (it's massive!), so please have a read.
arxiv.org
Self-supervised learning holds the promise of eliminating the need for manual data annotation, enabling models to scale effortlessly to massive datasets and larger architectures. By not being...
The DINOv3 paper is now available on arXiv. Have you looked at it yet? Here are three observations that might not be immediately obvious from a first read 👇
RT @arkitus: This figure from the impressive DINOv3 paper is fun to think about. Pretend it's 2018 and you're deciding what research to foc…
RT @TimDarcet: hey we heard you liked dinov2 so we got you more of the same shit. dinov3 is like dinov2 in the sense that it's much better…
RT @MarcSzafraniec: Proud to have contributed to the ground-breaking DINOv3 by reaching the SOTA on COCO Object Detection, for the first ti…
RT @AIatMeta: Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerf…
RT @BaldassarreFe: Say hello to DINOv3 🦖🦖🦖. A major release that raises the bar of self-supervised vision foundation models. With stunning…
… @TimDarcet, @TheoMoutakanni, Leonel Sentana, @_claireroberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, @julienmairal, @hjegou, @monsieurlabatut, @p_bojanowski. And of course, it’s open source! 📜 Paper:
github.com
Reference PyTorch implementation and models for DINOv3 - facebookresearch/dinov3
Immensely proud to have been part of this project. Thank you to the team: @oriane_simeoni, @huyvvo, @BaldassarreFe, Maxime Oquab, Cijo Jose, Vasil Khalidov, @MarcSzafraniec, Seungeun Yi, @MichaelRamamon, @fvsmassa, @d_haziza, @LucaWehrstedt, @jianyuan_wang, …