
Ido Cohen
@IdoC0hen
Joined March 2020
Huge thanks to my co-authors @dhgottesman @megamor2 and @RGiryes! If you'll be at #acl2025, I'd love to connect and chat! Read the full paper here: https://t.co/w1rRELq0aa Explore PopVQA here: https://t.co/4gTHqIcaVI
#acl2025 #NLP #MachineLearning
This late processing creates a bottleneck. By the time the model figures out what it's seeing, there are very few layers left for reasoning about it.
Our experiments reveal that VLMs use most of their processing power just for Hop 1. We found that critical image information is processed very late, in the model's middle layers.
So why the gap? We found that reasoning about a visual entity behaves like a multi-hop problem:
Hop 1: Identify the entity in the image.
Hop 2: Connect the recognized entity to its stored factual knowledge and extract it.
What makes PopVQA special? It’s designed to separate the task of identifying an entity from the task of reasoning about it. It provides the identity of the entity in each image, not just the answers to the questions, so entities the model fails to recognize can be filtered out.
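A minimal sketch of that filtering idea, not the paper's evaluation code. The `Record` fields, the exact-match recognition check, and the toy data are all assumptions made for illustration; PopVQA's actual schema and matching rules may differ.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All field names are hypothetical, not PopVQA's real schema.
    entity: str            # gold identity of the entity shown in the image
    predicted_entity: str  # model's answer when asked to name the entity
    text_correct: bool     # factual question answered correctly from the written name
    image_correct: bool    # same question answered correctly from the image alone


def recognized(records):
    """Keep only records where the model named the entity (case-insensitive exact match)."""
    return [
        r for r in records
        if r.predicted_entity.strip().lower() == r.entity.strip().lower()
    ]


def accuracy_gap(records):
    """Return (text accuracy, image accuracy, gap) over recognized entities only."""
    kept = recognized(records)
    if not kept:
        raise ValueError("no recognized entities to evaluate")
    text_acc = sum(r.text_correct for r in kept) / len(kept)
    image_acc = sum(r.image_correct for r in kept) / len(kept)
    return text_acc, image_acc, text_acc - image_acc


# Toy data, invented for this example.
records = [
    Record("Eiffel Tower", "eiffel tower", True, True),
    Record("Mona Lisa", "Mona Lisa", True, False),
    Record("Big Ben", "Tower Bridge", True, False),  # not recognized, so filtered out
]

text_acc, image_acc, gap = accuracy_gap(records)
print(f"text={text_acc:.2f} image={image_acc:.2f} gap={gap:.2f}")
```

Dropping unrecognized entities first means any remaining gap reflects Hop 2 (retrieving the fact) rather than Hop 1 (identifying the entity).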
To investigate this, we built and released a new dataset: PopVQA. It contains over 15,000 popular entities, from celebrities and landmarks to paintings and brands, each with a set of factual questions.
We discovered that when you show VLMs an entity in a picture instead of just writing its name, their accuracy on factual questions drops by up to 18% for some models!
Was a pleasure walking down memory lane with the team on this research! Very excited to see what theoretical and practical developments will stem from this.
The cat is out of the bag🥁 LMs' memorized predictions are a two-step process, and we used idioms to find that out. New dataset for probing memorization, analysis methodology, and much more. @IdoCohe49871127 @GidronJacob @RoeiSchuster @yoavgo @megamor2
https://t.co/pUzgr8Ut2q 🧵