Thomas Fel

@Napoolar

Followers: 2K · Following: 10K · Media: 93 · Statuses: 1K

Explainability, Computer Vision, Neuro-AI @Harvard. Research Fellow @KempnerInst. Prev. @tserre lab, @Google, @GoPro. Crêpe lover.

Boston, MA
Joined February 2017
@Napoolar
Thomas Fel
26 days
🕳️🐇Into the Rabbit Hull – Part I (Part II tomorrow) An interpretability deep dive into DINOv2, one of vision’s most important foundation models. And today is Part I, buckle up, we're exploring some of its most charming features.
10
119
639
@SashaBoguraev
Sasha Boguraev @ EMNLP
6 months
A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models. New work with @kmahowald and @ChrisGPotts! 🧵👇
2
27
97
@GoodfireAI
Goodfire
3 days
LLMs memorize a lot of training data, but memorization is poorly understood. Where does it live inside models? How is it stored? How much is it involved in different tasks? @jack_merullo_ & @srihita_raju's new paper examines all of these questions using loss curvature! (1/7)
10
126
776
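For readers wondering what "loss curvature" could look like in practice, here is a minimal sketch, not the paper's method, of estimating the top curvature of a model's loss via a Hessian-vector-product power iteration. `model`, `loss_fn`, and `batch` are hypothetical placeholders.

```python
# Minimal sketch (assumed, generic): top loss curvature via HVP power iteration.
import torch

def top_loss_curvature(model, loss_fn, batch, iters=20):
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model, batch)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]           # random start vector
    eigval = None
    for _ in range(iters):
        norm = torch.sqrt(sum((x * x).sum() for x in v))
        v = [x / norm for x in v]                        # normalize
        gv = sum((g * x).sum() for g, x in zip(grads, v))            # grad . v
        hv = torch.autograd.grad(gv, params, retain_graph=True)      # H v
        eigval = sum((h * x).sum() for h, x in zip(hv, v)).item()    # Rayleigh quotient
        v = [h.detach() for h in hv]
    return eigval  # approximates the largest Hessian eigenvalue of the loss
```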
@TamarRottShaham
Tamar Rott Shaham
4 days
A key challenge for interpretability agents is knowing when they’ve understood enough to stop experimenting. Our @NeurIPSConf paper introduces a self-reflective agent that measures the reliability of its own explanations and stops once its understanding of models has converged.
2
27
46
@ravfogel
Shauli Ravfogel
17 days
New NeurIPS paper! 🐣Why do LMs represent concepts linearly? We focus on LMs' tendency to linearly separate true and false assertions, and provide a complete analysis of the truth circuit in a toy model. A joint work with @Giladude, @tallinzen, Joan Bruna and @albertobietti.
7
49
479
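As a minimal illustration of the linear-separability claim (not the authors' code): a logistic-regression probe on hidden states of true vs. false assertions. `hidden_states` (n_examples × d) and `labels` are assumed to come from running an LM over a dataset of factual statements.

```python
# Minimal sketch: does a linear probe separate true from false assertions?
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def truth_probe_accuracy(hidden_states, labels, seed=0):
    X_tr, X_te, y_tr, y_te = train_test_split(
        hidden_states, labels, test_size=0.2, random_state=seed)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    # Accuracy well above chance indicates (approximate) linear separability.
    return probe.score(X_te, y_te)
```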
@Majumdar_Ani
Anirudha Majumdar
18 days
👁️ Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields We find that visual-only features (DINO) outperform visual-geometry features (VGGT) in spatial tasks! 👇
7
31
244
@wesg52
Wes Gurnee
19 days
New paper! We reverse engineered the mechanisms underlying Claude Haiku’s ability to perform a simple “perceptual” task. We discover beautiful feature families and manifolds, clean geometric transformations, and distributed attention algorithms!
45
314
2K
@ShamKakade6
Sham Kakade
18 days
(1/9) Diagonal preconditioners such as Adam typically use empirical gradient information rather than true second-order curvature. Is this merely a computational compromise or can it be advantageous? Our work confirms the latter: Adam can outperform Gauss-Newton in certain cases.
2
18
129
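A toy numpy sketch of the distinction the thread draws, assuming a linear least-squares loss where the Gauss-Newton matrix is exactly XᵀX: an Adam-style diagonal built from empirical squared gradients versus the Gauss-Newton diagonal. The data and weights here are synthetic placeholders.

```python
# Toy comparison: empirical-gradient diagonal (Adam-style) vs Gauss-Newton diagonal
# for loss(w) = 0.5 * mean((X w - y)^2).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))
y = rng.normal(size=256)
w = np.zeros(10)

residual = X @ w - y
per_example_grads = X * residual[:, None]        # gradient of each example's loss
g = per_example_grads.mean(axis=0)               # mini-batch gradient

adam_diag = np.sqrt((per_example_grads ** 2).mean(axis=0))  # RMS of empirical gradients
gn_diag = np.diag(X.T @ X) / len(X)                          # true curvature diagonal here

adam_step = g / (adam_diag + 1e-8)
gn_step = g / (gn_diag + 1e-8)
```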
@MLMazda
Mazda Moayeri
20 days
I'll be speaking at this #ICCV2025 WS today at 4:30pm, room 308B. Fav part of prepping was talking to my friends *outside* of AI, who (turns out) don't trust these systems nor those who build them. Come by to hear how my perspective on trust has transformed over the past year.
@ShirleyYXWu
Shirley Wu
6 months
Can we ever truly trust foundation models—and if so, how? Our ICCV TrustFM workshop (https://t.co/slBvSt9uUZ) is now accepting submissions (deadline: 8/1, attending: 10/19-10/23, Hawai'i). Submit, attend, and learn from everyone around the world who is making FMs more
0
3
17
@mc_mozer
Michael C. Mozer
23 days
[1/4] As you read words in this text, your brain adjusts fixation durations to facilitate comprehension. Inspired by human reading behavior, we propose a supervised objective that trains an LLM to dynamically determine the number of compute steps for each input token.
4
10
25
@nsaphra
Naomi Saphra
24 days
I’m recruiting PhD students for 2026! If you are interested in robustness, training dynamics, interpretability for scientific understanding, or the science of LLM analysis you should apply. BU is building a huge LLM analysis/interp group and you’ll be joining at the ground floor.
@nsaphra
Naomi Saphra
8 months
Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!
18
126
667
@AdriGarriga
Adrià Garriga-Alonso
24 days
Potentially field-defining work. I just know there will be follow-ups.
@Napoolar
Thomas Fel
25 days
🕳️🐇Into the Rabbit Hull – Part II Continuing our interpretation of DINOv2, the second part of our study concerns the geometry of concepts and the synthesis of our findings toward a new representational phenomenology: the Minkowski Representation Hypothesis
1
2
52
@Napoolar
Thomas Fel
24 days
Guess the rabbit hull goes deeper than expected haha🐰🎉 Thanks @ylecun !
2
2
83
@RemiCadene
Remi Cadene
25 days
I am starting a venture on top of LeRobot! We’re at a pivotal time. AI is moving beyond digital to the physical world. Embodied AI will change our surroundings in ways we can barely imagine. This technology holds the potential to empower everyone. It must not be controlled by
96
87
740
@Napoolar
Thomas Fel
25 days
That concludes this two-part descent into the Rabbit Hull. Huge thanks to all collaborators who made this work possible — and again especially to @WangBinxu with whom this project was built, experiment after experiment. 🎮 https://t.co/eKZJFatw4k 📄
0
1
19
@Napoolar
Thomas Fel
25 days
If this holds, three implications: (i) Concepts = points (or regions), not directions (ii) Probing is bounded: toward archetypes, not vectors (iii) Can't recover generating hulls from sum: we should look deeper than a single layer's activations to recover the true latents
1
3
19
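A toy illustration of implication (ii), under my reading rather than the authors' formulation: a directional probe is unbounded along its vector, while an archetype probe scores proximity to a point, which is bounded once activations live inside a convex hull.

```python
# Toy sketch: directional probe vs archetype probe (hypothetical helpers).
import numpy as np

def direction_probe_score(x, w):
    """Unbounded: score keeps growing as x moves further along w."""
    return x @ w

def archetype_probe_score(x, archetype):
    """Bounded: maximal when x coincides with the archetype, decays with distance."""
    return -np.linalg.norm(x - archetype, axis=-1)
```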
@Napoolar
Thomas Fel
25 days
Synthesizing these observations, we propose a refined view, motivated by Gärdenfors' theory and attention geometry. Activations = multiple convex hulls simultaneously: a rabbit among animals, brown among colors, fluffy among textures. The Minkowski Representation Hypothesis.
1
2
21
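A minimal sketch of the hypothesis as stated, with hypothetical archetype sets: a token's activation modeled as a Minkowski sum of convex combinations, one point drawn from each concept hull.

```python
# Minimal sketch: activation = Minkowski sum of points from several concept hulls.
import numpy as np

rng = np.random.default_rng(0)
d = 64
hulls = {                         # each hull = a set of archetype vectors (hypothetical)
    "animal":  rng.normal(size=(5, d)),
    "color":   rng.normal(size=(4, d)),
    "texture": rng.normal(size=(3, d)),
}

def sample_from_hull(archetypes):
    """Draw a random convex combination of a hull's archetypes."""
    weights = rng.dirichlet(np.ones(len(archetypes)))
    return weights @ archetypes

# One point per hull, summed: a rabbit among animals, brown among colors, fluffy among textures.
activation = sum(sample_from_hull(a) for a in hulls.values())
```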
@Napoolar
Thomas Fel
25 days
Taken together, the signs of partial density, local connectedness, and coherent dictionary atoms indicate that DINO’s representations are organized beyond linear sparsity alone.
1
1
9
@Napoolar
Thomas Fel
25 days
Can position explain this? We found that pos. information collapses: from high-rank to a near 2-dim sheet. Early layers encode precise location; later ones retain abstract axes. This compression frees dimensions for features, and *position doesn't explain PCA map smoothness*
1
1
13
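A rough sketch of the kind of measurement described (assumed, not the exact procedure): isolate the position-dependent component of patch embeddings by averaging over images at each spatial position, then compute an entropy-based effective rank. Applied per layer, a collapse toward ~2 would match the claim.

```python
# Sketch: effective rank of the position-dependent component of patch embeddings.
import numpy as np

def positional_effective_rank(patch_embeddings):
    """patch_embeddings: array of shape (n_images, n_positions, d) for one layer."""
    pos_mean = patch_embeddings.mean(axis=0)        # (n_positions, d): position component
    pos_mean = pos_mean - pos_mean.mean(axis=0)     # center across positions
    s = np.linalg.svd(pos_mean, compute_uv=False)
    p = (s ** 2) / (s ** 2).sum()
    return np.exp(-(p * np.log(p + 1e-12)).sum())   # entropy-based effective rank
```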
@Napoolar
Thomas Fel
25 days
Patch embeddings form smooth, connected surfaces tracing objects and boundaries -- even after removing positional components. This may suggest interpolative geometry: tokens as mixtures between landmarks, shaped by clustering and spreading forces in the training objectives.
1
1
12
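A minimal sketch of the PCA maps referenced here (assumed workflow): project one image's patch tokens onto their top three principal components and view them on the patch grid; smooth color gradients correspond to the smooth, connected surfaces described.

```python
# Sketch: top-3 PCA map of a single image's patch tokens, viewed as an RGB grid.
import numpy as np

def pca_map(patch_tokens, grid_h, grid_w, n_components=3):
    """patch_tokens: (grid_h * grid_w, d) embeddings for one image."""
    X = patch_tokens - patch_tokens.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    proj = X @ vt[:n_components].T                         # (n_patches, n_components)
    span = proj.max(axis=0) - proj.min(axis=0)
    proj = (proj - proj.min(axis=0)) / (span + 1e-8)       # normalize to [0, 1] for display
    return proj.reshape(grid_h, grid_w, n_components)
```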
@Napoolar
Thomas Fel
25 days
We found antipodal feature pairs (dᵢ ≈ − dⱼ): vertical vs horizontal lines, white vs black shirts, left vs right… Also, co-activation statistics only moderately shape geometry: concepts that fire together aren't necessarily nearby—nor orthogonal when they don't.
1
1
13
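A small sketch (assumed setup) of the antipodal-pair check on a dictionary of unit-norm concept atoms D: pairs with cosine similarity near −1 satisfy dᵢ ≈ −dⱼ.

```python
# Sketch: find near-antipodal pairs of dictionary atoms via cosine similarity.
import numpy as np

def antipodal_pairs(D, threshold=-0.9):
    """D: (n_atoms, d) dictionary; returns (i, j, cosine) with cosine below threshold."""
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    sims = D @ D.T
    idx = np.argwhere(np.triu(sims < threshold, k=1))   # only i < j
    return [(int(i), int(j), float(sims[i, j])) for i, j in idx]
```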