
Alessandro Conti
@altndrr
Followers: 178 · Following: 1K · Media: 8 · Statuses: 60
PhD student @UniTrento, prev @Apple
Joined February 2014
Dublin, here we come! 🇮🇪 Our paper "Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection" has been accepted to #ACMMM2025
Thanks to the greatest @massi_manc, @DonkeyShot21, @YimingWang107, @paolorotaphd, and @eliricci_ for their support; couldn't have done it without you! Abstract: https://t.co/7M1SgyXc7w Code:
We try prompting + reasoning strategies to flip wrong → right and generic → specific. They help, but it's clear that open-world classification is still very challenging for today's LMMs.
These distinctions really matter. Most models struggle with fine-grained reasoning, but not always in obvious ways. We dig into why they fail and how to make them better, without retraining.
Spoiler: it's hard. To evaluate their free-form predictions, we propose metrics that let us classify responses into 4 types:
✅ Correct & specific ("Pug")
✅ Correct but generic ("Dog")
❌ Wrong but specific ("Siamese" vs "Persian")
❌ Wrong & generic ("Plate" vs "Tiramisu")
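To make the four response types concrete, here is a toy sketch, not the paper's actual metrics: it uses WordNet hypernyms as an assumed proxy for "correct but generic" and synset depth as an assumed proxy for specificity. The helpers `first_noun_synset` and `classify_response` are hypothetical names.

```python
# Toy sketch of the four response types, NOT the paper's actual metrics.
# Requires: pip install nltk && python -c "import nltk; nltk.download('wordnet')"
from nltk.corpus import wordnet as wn

def first_noun_synset(name):
    """Return the first noun synset for a (possibly multi-word) class name."""
    synsets = wn.synsets(name.lower().replace(" ", "_"), pos=wn.NOUN)
    return synsets[0] if synsets else None

def classify_response(prediction, ground_truth):
    """Bucket a free-form prediction into one of the four response types."""
    pred, gt = first_noun_synset(prediction), first_noun_synset(ground_truth)
    if pred is None or gt is None:
        return "wrong & generic"        # unmatched names fall into the worst bucket
    if pred == gt:
        return "correct & specific"     # e.g. "Pug" for an image of a pug
    if pred in gt.closure(lambda s: s.hypernyms()):
        return "correct but generic"    # e.g. "Dog" for an image of a pug
    if pred.min_depth() >= gt.min_depth():
        return "wrong but specific"     # e.g. "Siamese" when the image shows a Persian
    return "wrong & generic"            # e.g. "Plate" when the image shows tiramisu

print(classify_response("dog", "pug"))  # expected: correct but generic
```

The depth heuristic is only a stand-in for the semantic scoring a real metric would need; it is here to show how free-form answers can be bucketed without a fixed label set.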
We tested 13 Large Multimodal Models across 10 datasets, from prototypical to very fine-grained, using natural language instead of fixed label sets. Can LMMs answer: "What type of object is in the image?" …without a predefined list?
What if we stopped treating image classification like a multiple-choice quiz… …and just asked the model: "What's in this image?" Our paper on open-world classification with LMMs got into #ICCV2025! Let's talk failures, insights, and flipping mistakes 👇
I also got a minimalistic reviewer. Guess the rating
Our T3AL poster has been picked by @CVPR daily! Come by and talk to @bliberatori_, @altndrr, @paolorotaphd, me, and @eliricci_ in the afternoon session!
This is incredible! At test time, ONE unlabelled video is enough to teach CLIP-like models to localize actions in time despite their lack of prior knowledge of the time domain. More on our @CVPR paper below 👇
Can we localize & classify actions in untrimmed videos without training data? We introduce T3AL, a method that performs zero-shot temporal action localization by learning only at test time on unlabeled video data. Accepted at #CVPR2024! Webpage: https://t.co/qlkkqjC5CQ
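For intuition only, here is a toy sketch of what "learning at test time on a single unlabelled video" can look like with a CLIP-like model: score frames against action prompts, take a few unsupervised entropy-minimization steps, then keep the most confident frames. This is a generic test-time-adaptation recipe under assumed choices (open_clip ViT-B/32, an `actions` list, a `localize` helper), not the T3AL algorithm.

```python
# Toy sketch of test-time adaptation on ONE unlabelled video with a CLIP-like
# model. NOT the T3AL algorithm: just the generic flavour of adapting at test
# time, with entropy minimization as an assumed unsupervised objective.
import torch
import torch.nn.functional as F
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
actions = ["diving", "playing guitar", "long jump"]  # assumed label space

def localize(frames, steps=10, lr=1e-5):
    """frames: preprocessed video frames as a tensor of shape (T, 3, H, W)."""
    text = tokenizer([f"a video of a person {a}" for a in actions])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):  # unsupervised updates on this single video
        img = F.normalize(model.encode_image(frames), dim=-1)
        txt = F.normalize(model.encode_text(text), dim=-1)
        probs = (100.0 * img @ txt.T).softmax(dim=-1)
        loss = -(probs * probs.log()).sum(dim=-1).mean()  # entropy minimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():  # per-frame scores after adaptation
        img = F.normalize(model.encode_image(frames), dim=-1)
        txt = F.normalize(model.encode_text(text), dim=-1)
        scores = img @ txt.T  # (T, num_actions)
    conf, cls = scores.max(dim=-1)
    keep = conf > conf.mean()  # crude foreground/background split over time
    return [(t, actions[cls[t].item()]) for t in range(len(frames)) if keep[t]]
```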
Glad to announce that the 1st Workshop on Green Foundation Models (GreenFOMO) is happening at @eccvconf, Milan, 2024! Can't wait to see what impact our community can make towards a green world and an inclusive AI! Official website + CfP coming soon... Stay tuned!
Cool universal classifier that leverages a knowledge database. You take a pic and get a class, without having defined the classes in advance.
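A minimal sketch of that retrieve-then-score idea, assuming a CLIP model from open_clip and a tiny in-memory caption "database"; the naive candidate extraction and the prompt template are illustrative assumptions, not the released pipeline.

```python
# Minimal sketch of retrieve-then-score classification without a predefined
# vocabulary: embed the image, retrieve the closest captions from an external
# database, mine candidate class names from them, and score the candidates.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# Stand-in for a large external caption database.
captions = [
    "a pug sitting on a couch",
    "a slice of tiramisu on a plate",
    "a siamese cat on a windowsill",
]

@torch.no_grad()
def classify(image_path, top_k=2):
    image = preprocess(Image.open(image_path)).unsqueeze(0)
    img = torch.nn.functional.normalize(model.encode_image(image), dim=-1)

    # 1) retrieve the captions closest to the image
    txt = torch.nn.functional.normalize(model.encode_text(tokenizer(captions)), dim=-1)
    top = (img @ txt.T).squeeze(0).topk(top_k).indices.tolist()

    # 2) mine candidate class names from the retrieved captions (naive word picking)
    candidates = sorted({w for i in top for w in captions[i].split() if len(w) >= 3})

    # 3) score the candidates against the image with the same model
    cand = torch.nn.functional.normalize(
        model.encode_text(tokenizer([f"a photo of a {c}" for c in candidates])), dim=-1)
    scores = (img @ cand.T).squeeze(0)
    return candidates[scores.argmax().item()]

print(classify("photo.jpg"))  # hypothetical image path
```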
Card design by @taapstudio, check out their Instagram profile (https://t.co/D7x8xDpQI6)!
Do you know the names of these animals? Tomorrow morning at #NeurIPS (poster #201), we will give away these cards and more so you can try the demo of our method CaSED! Try it now (https://t.co/GgDqhgijaq) and classify images with CLIP without a predefined list of class names!
I'll present our work on vocabulary-free image classification in a few minutes at the @SFScon. Streaming here: https://t.co/Hx1kWYgiQ7
@cimec_unitrento @UniTrento_DISI
Job Opportunity! Join my research team at @UniTrento_DISI as a Postdoctoral Researcher in Computer Vision, working on a project for creating bias-free models for visual recognition. If you are interested, contact me by email! #AI #ComputerVision #ResearchJobs
So happy to share that our paper got accepted at #NeurIPS 2023! Preprint: https://t.co/KD9vC82WV6 Code: https://t.co/37srbGw4dh Demo and model on @huggingface. Thanks again to the greatest co-authors @DonkeyShot21, @massi_manc, @paolorotaphd, @YimingWang107, and @eliricci_!
Code implementation of our NeurIPS 2023 paper: Vocabulary-free Image Classification - altndrr/vic
Vocabulary-free Image Classification paper page: https://t.co/yVVZ6arVU3 demo: https://t.co/OzXKfZIEup Recent advances in large vision-language models have revolutionized the image classification paradigm. Despite showing impressive zero-shot capabilities, a pre-defined set of
Paper is out! #ICCV2023
Our Source-free Video Domain Adaptation work accepted at #ICCV2023 is available online, and code will be released soon! https://t.co/TEC5tUxG6L