
Spyros Gidaris
@SpyrosGidaris
Senior Research Scientist @valeoai
Joined June 2023
Really excited to be giving a talk on “Openness of Vision Foundation Models” at the FOUND workshop tomorrow (19 Oct) at 10:20am, room 316C. Thanks to @HirokatuKataoka and colleagues for the invite. Looking forward to interacting with you all.
At ICCV 2025, I am organizing two workshops: the LIMIT Workshop and the FOUND Workshop. ◆ LIMIT Workshop (19 Oct, PM): https://t.co/45EjrxXtwH ◆ FOUND Workshop (19 Oct, AM): https://t.co/JpGBU1KKfx We warmly invite you to attend these workshops at ICCV 2025 in Hawaii!
So excited to attend the PhD defense of @Bjoern_Michele at @valeoai! He’s presenting his research results of the last 3 years in 3D domain adaptation: SALUDA (unsupervised DA), MuDDoS (multimodal UDA), TTYD (source-free UDA).
Another great event for @valeoai: the PhD defense of Corentin Sautier. His thesis «Learning Actionable LiDAR Representations w/o Annotations» covers the papers BEVContrast (learning self-sup LiDAR features), SLidR, ScaLR (distillation), UNIT and Alpine (solving tasks w/o labels).
Welcome to the 50th Pattern Recognition and Computer Vision Colloquium with: - @SattlerTorsten - @pesarlin - @VickyKalogeiton - Spyros Gidaris - @anna_kukleva_ - @lukneu Thursday Oct 9, 11:00-17:00. You still have time to buy tickets to Prague! https://t.co/YIkBkugOCF
It’s PhD graduation season in the team! Today, @Bjoern_Michele is defending his PhD on "Domain Adaptation for 3D Data" Best of luck! 🚀
3. MuToR with @NasosGer & Nikos Komodakis Multi-token prediction with registers 🔗 Paper: https://t.co/bGBHH6mVl2 🐦 Post:
arxiv.org
Multi-token prediction has emerged as a promising objective for improving language model pretraining, but its benefits have not consistently generalized to other settings such as fine-tuning. In...
1/n Multi-token prediction boosts LLMs (DeepSeek-V3), tackling key limitations of the next-token setup: • Short-term focus • Struggles with long-range decisions • Weaker supervision Prior methods add complexity (extra layers) 🔑 Our fix? Register tokens—elegant and powerful
2. ReDi (spotlight) with @ThKouz, @IoannisKakogeo1 & Nikos Komodakis Boosting generative image modeling via joint image-feature synthesis 🔗 Paper: https://t.co/1uJ6NdnxA3 🐦 Post:
arxiv.org
Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image...
1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture – Low-level image details (via VAE latents) – High-level semantic features (via DINOv2)🧵
1. DINO-Foresight with @K_Sta8is, @IoannisKakogeo1 & Nikos Komodakis Future state prediction using vision foundation model features 🔗 Paper: https://t.co/q2dKU4yPFn 🐦 Post:
arxiv.org
Predicting future dynamics is crucial for applications like autonomous driving and robotics, where understanding the environment is key. Existing pixel-level methods are computationally expensive...
1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features! Links to the arXiv and Github 👇
Three papers accepted to #NeurIPS2025 (one spotlight)! 🎉 Awesome works in generative modeling, multi-token prediction, and semantic future prediction. Congratulations to all students and collaborators involved! @NasosGer, @K_Sta8is, @ThKouz, @IoannisKakogeo1 & Nikos Komodakis!
Our paper on multi-token prediction with registers (MuToR) has been accepted to #NeurIPS2025 😀. I am grateful to my co-authors, @SpyrosGidaris and Nikos Komodakis, for their guidance and collaboration. See you in San Diego! If you are interested, check our detailed thread 👇
1/n Multi-token prediction boosts LLMs (DeepSeek-V3), tackling key limitations of the next-token setup: • Short-term focus • Struggles with long-range decisions • Weaker supervision Prior methods add complexity (extra layers) 🔑 Our fix? Register tokens—elegant and powerful
Our paper DINO-Foresight is accepted to #NeurIPS2025 🎉! Grateful to my wonderful collaborators @IoannisKakogeo1, @SpyrosGidaris and Nikos Komodakis💪 More updates soon— stay tuned! ✨
1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features! Links to the arXiv and Github 👇
ReDi has been accepted at #NeurIPS2025 as a Spotlight! Huge thanks to @K_Sta8is, @IoannisKakogeo1, @SpyrosGidaris, and Nikos Komodakis for their guidance and collaboration! This was such a fun project to work on.
1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture – Low-level image details (via VAE latents) – High-level semantic features (via DINOv2)🧵
Really excited to talk about our recent open-source vision foundation model: Franca with @v_pariza. Thanks to the Cohere Vision community @cataluna84 and @Arkhymadhe for the invite. Join us on 23rd Sept at 5pm CET to gain insights on training large scale models.
Our Computer Vision Group is looking forward to hosting @shawshank_v and Valentinos Pariza for a presentation of "Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning." Thanks to @cataluna84 and @Arkhymadhe for organizing this guest speaker session!
Discovered that our RangeViT paper keeps being cited in what might be LLM-generated papers. Number of citations increased rapidly in the last weeks. Too good to be true. Papers popped up on different platforms, but mainly on ResearchGate with ~80 papers in just 3 weeks. [1/]
🚨 Update on ReDi 🚨 We find that ReDi is complementary to REPA 🤝. A notable result, given that both leverage DINOv2 to accelerate diffusion training. ⚡ ReDi+REPA matches fully-converged (4M iters) REPA FID after just 350K iters.
1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture – Low-level image details (via VAE latents) – High-level semantic features (via DINOv2)🧵
@giffmana Super interesting! You might like our work SPOT (https://t.co/kuDJVi8aQt). We found that shuffling token orders in slot attention with AR decoders strengthens the role of conditioning signals (slots). Fun fact @giffmana: insights from your CapPa work were a big inspiration! 😊
We are pleased to release the code and checkpoints for Franca: https://t.co/UBZRQmlujm We look forward to seeing how Franca performs in your various use cases and welcome any insights or feedback you may have.
github.com
Official code of Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning - valeoai/Franca
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research🧵
Latent Denoising Makes Good Visual Tokenizers "we introduce the Latent Denoising Tokenizer (l-DeTok), a simple yet effective tokenizer trained to reconstruct clean images from latent embeddings corrupted by interpolative noise and random masking. Extensive experiments on
We're releasing Franca ("free" one): a high-performing open-source vision foundation model. Franca is the outcome of a close collaboration between @valeoai (in France) and @FunAILab (in Franconia). Check out the thread for more info on main ingredients and results 👇
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research🧵