Dayoon Ko
@dayoon12161
78 Followers · 175 Following · 2 Media · 23 Statuses
M.S./Ph.D. integrated student in CSE @SeoulNatlUni | Research Intern @LG_AI_Research
Seoul, Republic of Korea
Joined October 2023
🚨 Excited to share that our paper was accepted to #ACL2025 Findings 🎉 "When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR" Huge thanks to my amazing collaborators! 🙌 @jinyoung__kim @ohmyksh We propose
arxiv.org
Dense retrievers encode texts into embeddings to efficiently retrieve relevant documents from large databases in response to user queries. However, real-world corpora continually evolve, leading...
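The tweet and abstract don't spell out how GradNormIR works, so here is only a rough, hypothetical illustration of the general idea — using the gradient norm of a retrieval loss to flag documents a frozen retriever handles poorly. The linear "retriever", the cosine loss, and the in-/out-of-distribution documents below are toy stand-ins, not the paper's actual setup.

```python
import numpy as np

def embed(W, x):
    """Toy dense retriever: a single linear projection, L2-normalized."""
    v = W @ x
    return v / np.linalg.norm(v)

def loss(W, query, doc):
    """1 - cosine similarity between query and document embeddings."""
    return 1.0 - embed(W, query) @ embed(W, doc)

def grad_norm(W, query, doc, eps=1e-5):
    """Finite-difference gradient norm of the loss w.r.t. retriever weights.

    A large norm means the frozen retriever would need a big update to
    handle this document well -- a crude OOD signal, not GradNormIR itself.
    """
    g = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            Wp, Wm = W.copy(), W.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            g[i, j] = (loss(Wp, query, doc) - loss(Wm, query, doc)) / (2 * eps)
    return float(np.linalg.norm(g))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                        # frozen retriever weights
query = rng.normal(size=8)
in_dist_doc = query + 0.01 * rng.normal(size=8)    # near-duplicate of the query
ood_doc = rng.normal(size=8)                       # unrelated "new corpus" text

print(grad_norm(W, query, in_dist_doc))  # small: loss is near a stationary point
print(grad_norm(W, query, ood_doc))      # larger: retriever poorly fit here
```

In this toy setup the in-distribution document sits near a loss optimum, so its gradient norm is close to zero, while the unrelated document yields a clearly larger norm — the ordering one would threshold on to decide when an update is due.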
🧐 LLMs aren’t great at judging their own correctness. ❗But history across models helps! We present Generalized Correctness Models (GCMs), which learn to predict correctness from history, outperforming model-specific correctness models and the self-confidence of larger models.
🚨 Announcing Generalized Correctness Models (GCMs) 🚨Finding that LLMs have little self knowledge about their own correctness, we train an 8B GCM to predict correctness of many models, which is more accurate than training model-specific CMs, and outperforms a larger
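The thread doesn't include implementation details, so the following is only a loose sketch of the core intuition — predicting a model's correctness from pooled historical outcomes rather than from its own self-confidence. The smoothed per-topic frequency model and every name below are hypothetical; the paper trains an 8B neural correctness model, not this.

```python
from collections import defaultdict

def fit_correctness_model(history):
    """Fit a toy 'generalized' correctness predictor.

    history: list of (model, topic, correct) outcomes. Outcomes are pooled
    across models -- the crude analogue of using cross-model history.
    """
    stats = defaultdict(lambda: [0, 0])  # topic -> [num_correct, num_total]
    for model, topic, correct in history:
        stats[topic][0] += int(correct)
        stats[topic][1] += 1

    def predict(topic, alpha=1.0):
        """Laplace-smoothed probability that a model answers `topic` correctly."""
        c, n = stats[topic]
        return (c + alpha) / (n + 2 * alpha)

    return predict

history = [
    ("model_a", "algebra", True), ("model_a", "algebra", True),
    ("model_b", "algebra", True), ("model_b", "history", False),
    ("model_a", "history", False),
]
predict = fit_correctness_model(history)
print(predict("algebra"))  # high: pooled history says algebra usually goes right
print(predict("history"))  # low: pooled history says it usually goes wrong
```

Even this frequency baseline makes the point of the tweet concrete: the predictor never consults the answering model's own confidence, only the record of past outcomes.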
🎉🎉 Super proud to share that our work is accepted to #NeurIPS2025!! Huge thanks to all the co-authors. 👏👏
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
🎉Our "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" is accepted to #EMNLP2025 Main!🎉 We introduce a benchmark of 2D Flash adventure games (room escape, mystery/detective, visual novel, management) for full story completion. 🧵
Excited to share that our paper “ChartCap: Mitigating Hallucination of Dense Chart Captioning” has been accepted as an ICCV 2025 Highlight Poster! 📜 Paper: https://t.co/xuqRZjD2sr 🤗 Dataset: https://t.co/2vLRsTky3c 🔗 Project page (WIP):
[1/10] 💡New Paper Alert! CoCoT: Cognitive Chain-of-Thought Prompting for Socially Grounded Vision-Language Reasoning VLMs can see, but can they use perception to infer intent or make moral decisions? Despite recent progress, VLMs still struggle with socionormative reasoning, like
🚀 Heading to 🇦🇹 for #ACL2025NLP! Catch our MAC 🥷 poster at #ACL2025 @aclmeeting! Say hi 👋, and let’s talk about LLM + Multimodality! Open for a coffee chat anytime ☕💬🗣️ 🗓️ July 29 (Day 2, Main Conference) ⏰ 16:00-17:30 📍 Hall 4/5 | #3743 📄 https://t.co/Xsz896nv2C
🎉Our paper "Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates" is accepted to #ACL2025 Main!🎉 We introduce a benchmark for multimodal "deception" + LLM-based diversified attack. 🚀 Preprint coming soon!
[ACL 2025] Any-to-any models are often expected to be more coherent across modalities—since they handle image→text and text→image in one unified model. But does this hold up? We test it with ACON. 📄 Paper: https://t.co/5sDal7nx65 📷 data: https://t.co/wHQtAKaH3q
🔥 GUI agents struggle with real-world mobile tasks. We present MONDAY—a diverse, large-scale dataset built via an automatic pipeline that transforms internet videos into GUI agent data. ✅ VLMs trained on MONDAY show strong generalization ✅ Open data (313K steps) (1/7) 🧵 #CVPR
🚨New Paper Alert🚨 Excited to share our new video game benchmark, "Orak"! 🕹️ It was a thrilling experience to test whether LLM/VLM agents can solve real video games 🎮 Looking forward to continuing my research on LLM/VLM-based game agents with @Krafton_AI !
As a video gaming company, @Krafton_AI has secretly been cooking something big with @NVIDIAAI for a while! 🥳 We introduce Orak, the first comprehensive video gaming benchmark for LLMs! https://t.co/GYaSIrHBTE
When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR @dayoon12161 et al. introduce an unsupervised approach to detect when dense retrievers need updates. 📝 https://t.co/UjhWHJne2t 👨🏽💻 https://t.co/YswThJMAwi
github.com · dayoon-ko/gradnormir
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
🎉Our paper "Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates" is accepted to #ACL2025 Main!🎉 We introduce a benchmark for multimodal "deception" + LLM-based diversified attack. 🚀 Preprint coming soon!
Fantastic Paper from @GoogleDeepMind. Astute RAG enhances LLM performance by resolving conflicts between internal and external knowledge sources. Original Problem 🔍: RAG systems face challenges from imperfect retrieval, introducing irrelevant or misleading information.
After going to NAACL, ACL and #EMNLP2024 this year, here are a few tips I’ve picked up about attending *ACL conferences. Would love to hear any other tips if you have them! 🙂 1. This might be obvious, but I suggest showing everyone the same respect and interest regardless of
All set in Miami for #EMNLP2024 #EMNLP✈️ I'll be presenting DynamicER today at: 📅 November 12 (Tue) 🕓 16:00–17:30 📍 Poster Session C, Riverfront Hall I'm also applying to PhD programs this year—looking forward to connecting and chatting with everyone at #EMNLP2024!