Ido Cohen Profile
Ido Cohen

@IdoC0hen

Followers
23
Following
5
Media
1
Statuses
9

Joined March 2020
@IdoC0hen
Ido Cohen
2 months
Huge thanks to my co-authors @dhgottesman @megamor2 and @RGiryes! If you'll be at #acl2025, I'd love to connect and chat! Read the full paper here: https://t.co/w1rRELq0aa Explore PopVQA here: https://t.co/4gTHqIcaVI #acl2025 #NLP #MachineLearning
huggingface.co
0
1
6
@IdoC0hen
Ido Cohen
2 months
This late processing creates a bottleneck. By the time the model figures out what it's seeing, there are very few layers left for reasoning about it.
1
1
6
@IdoC0hen
Ido Cohen
2 months
Our experiments reveal that VLMs use most of their processing power just for Hop 1. We found that critical image information is processed very late, in the model's middle layers.
1
1
5
@IdoC0hen
Ido Cohen
2 months
So why the gap? We found that reasoning about a visual entity behaves like a multi-hop problem: Hop 1: Identify the entity in the image. Hop 2: Connect the recognized entity to its stored factual knowledge and extract it.
1
1
6
@IdoC0hen
Ido Cohen
2 months
What makes PopVQA special? It’s designed to separate the task of identifying an entity from the task of reasoning about it. It provides the identity of the entity in each image, not just the answers to the questions, which makes it possible to filter out unrecognized entities.
1
0
7
@IdoC0hen
Ido Cohen
2 months
To investigate this, we built and released a new dataset: PopVQA. It contains over 15,000 popular entities, from celebrities and landmarks to paintings and brands, each with a set of factual questions.
1
0
6
@IdoC0hen
Ido Cohen
2 months
We discovered that when you show VLMs an entity in a picture instead of just writing its name, their accuracy on factual questions drops by up to 18% for some models!
1
0
6
@IdoC0hen
Ido Cohen
2 months
A Vision-Language Model can answer questions about Robin Williams. It can also recognize him in a photo. So why does it FAIL when asked the same questions using his photo instead of his name? A thread on our new #acl2025 paper that explores this puzzle 🧵
1
7
25
@IdoC0hen
Ido Cohen
3 years
Was a pleasure walking down memory lane with the team on this research! Very excited to see what theoretical and practical developments will stem from this.
@adihaviv
Adi Haviv
3 years
The cat is out of the bag🥁 LMs' memorized predictions are a two-step process, and we used idioms to find that out. New dataset for probing memorization, analysis methodology, and much more. @IdoCohe49871127 @GidronJacob @RoeiSchuster @yoavgo @megamor2 https://t.co/pUzgr8Ut2q 🧵
0
1
2