Thomas O'Connell Profile
Thomas O'Connell

@thomaspocon

Followers
244
Following
838
Media
11
Statuses
79

cognitive neuroscientist @KanwisherLab, @mitcocosci

Cambridge, MA
Joined January 2017
@thomaspocon
Thomas O'Connell
2 years
🔥Preprint Alert🔥 Excited to share some new work modeling human 3D shape perception! "Approaching human 3D shape perception with neurally mappable models" w/ @tylerraye, @_yonifriedman, @_atewari, Josh Tenenbaum, @vincesitzmann, @Nancy_Kanwisher 🧵
4
50
156
@thomaspocon
Thomas O'Connell
8 months
RT @patrickmineault: Excited to release what we’ve been working on at Amaranth Foundation, our latest whitepaper, NeuroAI for AI safety! A…
0
97
0
@thomaspocon
Thomas O'Connell
10 months
RT @tylerraye: do large-scale vision models represent the 3D structure of objects? excited to share our benchmark: multiview object consis…
0
88
0
@thomaspocon
Thomas O'Connell
2 years
RT @sucholutsky: 🧵🎉 Our new preprint is up, and we’d love your feedback! We're "Getting Aligned on Representational Alignment" - the degree…
0
119
0
@thomaspocon
Thomas O'Connell
2 years
Thanks for reading! Please be in touch with any questions, ideas, desires to chat, etc.!
0
0
2
@thomaspocon
Thomas O'Connell
2 years
Finally, check out related work from my awesome co-author @tylerraye implicating the medial temporal lobe in supporting human 3D inferences. He suggests computations beyond standard DNNs are needed to model this process. Hmmm potential synergy? 🤔
@tylerraye
tyler bonnen
2 years
excited to share my last experimental project from graduate school 😅🥹 "Medial temporal cortex supports compositional visual inferences", with my PhD advisors Anthony Wagner & Dan Yamins. 1/🧠
1
0
6
@thomaspocon
Thomas O'Connell
2 years
Shout out to @KordingLab for tweeting out the CCN-version of the preprint! If you downloaded the earlier version, there are some minor updates in the one linked in this thread (addition of Stylized-ImageNet models, additional discussion).
@KordingLab
Kording Lab 🦖
2 years
Replicating human 3D shape perception abilities - our field is changing so fast, I am having whiplash:
1
0
3
@thomaspocon
Thomas O'Connell
2 years
If watching talks is more your speed, check out our presentation from CCN 2023.
1
0
8
@thomaspocon
Thomas O'Connell
2 years
tldr: DNNs trained with multi-view objectives make human-aligned 3D shape judgements! Lots more work to do here (biologically-plausible learning, generalization) to reach full human capabilities, but we take an important step toward closing the gap between human and model 3D inferences.
1
0
5
@thomaspocon
Thomas O'Connell
2 years
So, are we done? Not quite… none of these models generalize well to novel categories not included in their training set. Closing this generalization gap will be a focus of future research.
1
0
5
@thomaspocon
Thomas O'Connell
2 years
Remarkably, even a standard resnet50 CNN architecture showed a marked jump in alignment to humans when trained with a multi-view learning objective (Multi-View CNN).
1
0
3
@thomaspocon
Thomas O'Connell
2 years
Models trained with a 3D multi-view objective (Multi-View Autoencoder & Multi-View CNN), in which two images depicting the same object from different viewpoints must be associated, were markedly more aligned to humans, approaching the performance of the 3D LFNs!
1
0
6
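For readers who want the mechanics, the multi-view association idea above can be sketched as a contrastive (InfoNCE-style) loss over paired views; this is an illustrative numpy reconstruction, not the paper's exact training objective (the function name and temperature are assumptions):

```python
import numpy as np

def multiview_contrastive_loss(emb_a, emb_b, temperature=0.1):
    """Illustrative InfoNCE-style loss: row i of emb_a and row i of emb_b
    embed two viewpoints of the same object and should be associated;
    all other rows in the batch serve as negatives."""
    # L2-normalize so dot products are cosine similarities
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # the positive for view i is the matching object's other view (the diagonal)
    return -np.mean(np.diag(log_probs))
```

Correctly paired views yield a much lower loss than mismatched pairs, which is what pushes the network toward viewpoint-invariant object representations.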
@thomaspocon
Thomas O'Connell
2 years
We rule out training on rendered shapes (all control models), generative capabilities (Autoencoder), and viewpoint supervision (Autoencoder+Viewpoints) as sufficient for learning human-aligned 3D shape representations (see plot below).
1
0
4
@thomaspocon
Thomas O'Connell
2 years
Now for the fun part! What drives alignment between 3D LFNs and humans? To figure this out, we trained a series of control models, each incorporating different aspects of 3D LFNs into more standard architectures.
1
0
4
@thomaspocon
Thomas O'Connell
2 years
To ensure these results aren’t driven by the category structure of ShapeNet, we repeat the procedure using abstract procedurally-generated shapes (top) created with the ShapeGenerator plugin for Blender. Again, we find the 3D LFN is most aligned to human 3D shape judgements (bottom).
1
0
4
@thomaspocon
Thomas O'Connell
2 years
Next, we construct adversarial match-to-sample trials by minimizing the accuracy for 25 ImageNet CNNs and selecting trials across 5 difficulty conditions. Even for these adversarially-defined trials, alignment between human and 3D LFN 3D shape judgements holds.
1
0
4
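The adversarial trial selection described above (minimize CNN accuracy, then split across difficulty conditions) can be sketched roughly as below; the rank-and-bin logic is an assumption for illustration, not the exact selection procedure:

```python
import numpy as np

def select_adversarial_trials(cnn_correct, n_conditions=5, per_condition=10):
    """Illustrative sketch. cnn_correct: (n_models, n_trials) boolean array
    of per-trial correctness for a bank of ImageNet CNNs. Rank trials by
    mean CNN accuracy (hardest first), keep the hardest ones, and split
    them into difficulty conditions."""
    mean_acc = cnn_correct.mean(axis=0)  # per-trial accuracy across models
    order = np.argsort(mean_acc)         # ascending: hardest trials first
    chosen = order[: n_conditions * per_condition]
    # condition 0 = hardest ... condition n-1 = easiest of the selected set
    return chosen.reshape(n_conditions, per_condition)
```

The point of the manipulation is that trials chosen to defeat standard CNNs still leave human-LFN agreement intact.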
@thomaspocon
Thomas O'Connell
2 years
….and find that features from the 3D LFN (yellow, pink) are much more aligned to human 3D shape judgements than our baseline models for random within-category pairs of objects!
1
0
4
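One simple way to read a trial-wise shape judgement out of frozen model features on a match-to-sample trial is a nearest-neighbor choice in feature space; this readout is a hypothetical sketch (the paper's actual decision rule may differ):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def model_choice(sample_feat, match_feat, lure_feat):
    """Match-to-sample readout: pick whichever candidate image lies closer
    to the sample in feature space (returns 1 for the match, 0 for the lure)."""
    return int(cosine(sample_feat, match_feat) >= cosine(sample_feat, lure_feat))
```

Running this rule over every trial produces a model response vector that can then be compared trial-by-trial against human judgements.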
@thomaspocon
Thomas O'Connell
2 years
We train a 3D Light Field Network (@vincesitzmann) on ShapeNet using a multi-view rendering objective…
1
0
5
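As a toy picture of what a light field network computes: it maps a camera ray (e.g. 6-D Plücker coordinates) plus a per-object latent code directly to a pixel color, with no volume integration along the ray; the multi-view rendering objective then compares predicted pixels against renders from many viewpoints. All shapes and weights below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def lfn_forward(ray_plucker, latent, W1, W2):
    """Toy light-field-style readout: a tiny MLP maps a ray (6-D Plücker
    coordinates) concatenated with an object latent code straight to an
    RGB value, one pass per pixel, no volumetric integration."""
    x = np.concatenate([ray_plucker, latent])
    h = np.maximum(0.0, W1 @ x)  # ReLU hidden layer
    return W2 @ h                # RGB output

# hypothetical sizes: 6-D ray, 32-D latent, 64 hidden units
W1 = rng.normal(size=(64, 6 + 32)) * 0.1
W2 = rng.normal(size=(3, 64)) * 0.1
rgb = lfn_forward(rng.normal(size=6), rng.normal(size=32), W1, W2)
```

In training, the latent code is inferred per object and the predicted colors are regressed against ground-truth multi-view renders, which is what forces the latent to encode 3D shape.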
@thomaspocon
Thomas O'Connell
2 years
Much progress has been driven by 3D neural fields, which learn the continuous function defining the shape of an object. We focus on conditional neural fields that compute 3D shape from images, rather than NeRF-style models that optimize directly on many viewpoints of one scene.
1
0
3
@thomaspocon
Thomas O'Connell
2 years
But should we expect DNNs trained on large corpora of natural images to incidentally learn 3D shape? Recent progress in 3D graphics and computer vision suggests not: additional inductive biases appear necessary to capture 3D geometry.
1
0
3
@thomaspocon
Thomas O'Connell
2 years
For standard DNNs, we use 25 ImageNet CNNs, 25 ImageNet ViTs, and 3 Stylized-ImageNet CNNs (Geirhos et al. 2019). We evaluate accuracy (x-axis) and trial-wise similarity to humans (y-axis). While humans perform well, standard DNNs struggle to make human-like 3D inferences.
1
0
3
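A minimal sketch of one trial-wise similarity score: raw agreement between human and model correctness across trials. (Error consistency in the sense of Geirhos et al. additionally corrects for the agreement expected by chance; this simplified version is only illustrative.)

```python
import numpy as np

def trialwise_alignment(human_correct, model_correct):
    """Illustrative trial-wise similarity: fraction of trials on which the
    model's correctness matches the humans'. A chance-corrected variant
    (error consistency) is the standard published metric."""
    human_correct = np.asarray(human_correct, dtype=bool)
    model_correct = np.asarray(model_correct, dtype=bool)
    return float(np.mean(human_correct == model_correct))
```

Scoring each model this way gives the y-axis quantity described above, with per-trial accuracy averaged across trials giving the x-axis.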