Onur Keleş
@Onr_Kls
Followers: 735 · Following: 2K · Media: 96 · Statuses: 464
PhD student in Linguistics, Research Assistant @UniBogazici | Interested in the visual-gestural modality, quantitative linguistics, and NLP
Istanbul/Turkey
Joined June 2020
Can VLMs detect #iconic form-meaning mappings in signs? Delighted to be part of this fantastic project led by @Onr_Kls
We hope this benchmark sparks deeper collaboration between sign language linguistics and multimodal AI, highlighting signed languages as a rich testbed for visual grounding and embodiment.
Even the best model (Gemini 2.5 Pro) identified only 17/96 signs (~18%), far below the human baseline of 40/96 for hearing non-signers. And unlike humans, models favor static iconic objects over dynamic iconic actions, revealing a key gap between visual AI and embodied cognition. ❌
We evaluated 13 VLMs (3 closed-source). Larger models (GPT-5, Gemini 2.5 Pro, Qwen2.5-VL 72B) showed moderate correlation with human iconicity judgments and mirrored some human phonological difficulty patterns, e.g., handshape harder than location.
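A minimal sketch of how such a model-human alignment could be computed, assuming one graded iconicity rating per sign from each source; the rating arrays below are hypothetical placeholders, not the paper's data:

```python
# Correlate a VLM's graded iconicity ratings with mean human judgments.
# All numbers here are hypothetical placeholders.
from scipy.stats import spearmanr

# One rating per NGT sign (e.g., on a 1-7 iconicity scale).
human_ratings = [6.2, 1.8, 4.5, 3.1, 5.9]   # mean human ratings
model_ratings = [5.0, 2.5, 4.0, 2.0, 6.5]   # one model's ratings, same signs

rho, p = spearmanr(human_ratings, model_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```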
The benchmark has three complementary tasks: 1️⃣ Phonological form prediction – predicting handshape, location, etc. 2️⃣ Transparency – inferring meaning from visual form. 3️⃣ Graded iconicity – rating how much a sign looks like what it means.
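An illustrative sketch of the three tasks as prompt templates; the wording and option slots here are hypothetical, not the benchmark's actual prompts:

```python
# Hypothetical prompt templates for the three benchmark tasks.
TASKS = {
    "phonological_form": (
        "Watch the sign video. Which handshape does the signer use? "
        "Choose from: {handshape_options}"
    ),
    "transparency": (
        "Watch the sign video. What does this sign mean?"
    ),
    "graded_iconicity": (
        "Watch the sign video. On a scale from 1 (not at all) to 7 "
        "(exactly), how much does the sign look like '{meaning}'?"
    ),
}

def build_prompt(task: str, **slots) -> str:
    """Fill a task template with item-specific slots."""
    return TASKS[task].format(**slots)

print(build_prompt("graded_iconicity", meaning="to drink"))
```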
We introduce the Visual Iconicity Challenge, a benchmark testing whether Vision–Language Models (VLMs) can recognize iconicity, i.e., the visual resemblance between form and meaning, using signs from the Sign Language of the Netherlands (NGT).
I’m very happy to share our new paper! “The Visual Iconicity Challenge: Evaluating Vision–Language Models on Sign Language Form–Meaning Mapping”, co-authored with @ozyurek_a, @ortega_ger, @KadirGokgoz, and Esam Ghaleb. #Iconicity #MultimodalAI arXiv: https://t.co/cJJ9Jhd5QL
So much happening this week! I was beyond excited to present our vision-language model benchmarking study on sign iconicity @MPI_NL in Nijmegen, and then I traveled to Prague for #AMLaP2025 to report our Turkish good-enough parsing studies with L1 & L2 speakers and LLMs.
Now that the YKS (Turkey's university entrance exam) results have been announced, if you have questions specifically about Boğaziçi University @UniBogazici or its Linguistics and English Language Teaching programs, I'm happy to help via DM. Good luck and congratulations, everyone!
Two papers with @dinctopal_deniz accepted to AMLaP 2025! 🥰 Both papers report the presence of good-enough parsing effects in Turkish. One focuses on task effects among L1 Turkish speakers and language models, and the other on L2 processing of role reversals.
My professor @mhsatman taught Operations Research so well that I now see every problem in my life as a 'constraint'. Even my love life has turned into an optimization problem 💔
Low resource does not mean low potential! Had a great time presenting our workshop paper @naaclmeeting on Hamshentsnag POS and NER, showing the importance of including linguistic typology in NLP applications. 🚨 #naacl #NLProc
Had a great time presenting our role reversal work at the #CMCL workshop @naaclmeeting @naacl. Got great feedback too! 🥳 Paper link: https://t.co/3wogXEP6S2
Are your LLMs good-enough? 🤔 Our new paper w/ @dinctopal_deniz at #CMCL2025 @naaclmeeting shows both humans & smaller LLMs do good-enough parsing in Turkish role-reversal contexts. GPT-2 better predicts human RTs; LLaMA-3 produces fewer heuristic parses but lacks predictive power.
Can BERT help save endangered languages? Excited to present this paper with @lambdabstract and @BeratDogan03 at LM4UC @naaclmeeting! We explored how multilingual BERT with augmented data performs POS tagging & NER for Hamshentsnag #NAACL 🔗 Paper: https://t.co/RduxBPpcET
Our attention analysis revealed that GPT-2 models often shifted attention toward semantically plausible but syntactically incorrect noun phrases in reversed orders. LLaMA-3 maintained more stable attention patterns, suggesting more syntactic, but less human-like, processing.
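A rough sketch of this kind of attention inspection with HuggingFace transformers; the checkpoint and example sentence are placeholders, not the study's materials:

```python
# Inspect where a GPT-2-style model attends at the sentence-final verb.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # placeholder; the study used Turkish GPT-2 checkpoints
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)
model.eval()

# Role-reversed item ("The man bit the dog"); Turkish is verb-final.
inputs = tok("Adam köpeği ısırdı", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions holds one (batch, heads, seq, seq) tensor per layer.
# Average over layers and heads, then read the final (verb) token's row
# to see how much attention each preceding noun phrase receives.
att = torch.stack(out.attentions).mean(dim=(0, 2))[0]  # (seq, seq)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for t, a in zip(tokens, att[-1]):
    print(f"{t:>12}  {a.item():.3f}")
```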
We then tested 3 Turkish LLMs (GPT-2-Base, GPT-2-Large, LLaMA-3) on the same stimuli, measuring surprisal and attention patterns. GPT-2-Large surprisal significantly predicted human reading times at critical regions, while LLaMA-3 surprisal did not.
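A minimal sketch of how per-token surprisal can be extracted from a causal LM for alignment with self-paced reading times; the checkpoint and sentence are placeholders:

```python
# Per-token surprisal: -log2 P(w_t | w_<t) under a causal language model.
import math
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the study used Turkish GPT-2 / LLaMA-3
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_surprisals(text: str):
    ids = tok(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits
    logp = F.log_softmax(logits[0, :-1], dim=-1)  # predicts tokens 1..n-1
    targets = ids[0, 1:]
    nats = -logp[torch.arange(targets.size(0)), targets]
    return list(zip(tok.convert_ids_to_tokens(targets.tolist()),
                    (nats / math.log(2)).tolist()))

# Surprisal at the critical region (e.g., the sentence-final verb) is the
# value regressed against human reading times.
for token, s in token_surprisals("Adam köpeği ısırdı."):
    print(f"{token:>12}  {s:5.2f} bits")
```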
Despite Turkish having explicit morphosyntactic features like accusative case marking and the agentive postposition "tarafından" (by), participants still made interpretation errors 25% of the time for implausible but grammatical sentences, confirming good-enough parsing effects.
We conducted a self-paced reading experiment with native Turkish speakers processing sentences with reversed thematic roles (e.g., "the man bit the dog" instead of "the dog bit the man"), specifically testing if Turkish morphosyntactic marking prevents good-enough parsing.