Dayoon Ko
@dayoon12161
78 Followers · 175 Following · 2 Media · 23 Statuses
M.S./Ph.D. integrated student in CSE @SeoulNatlUni | Research Intern @LG_AI_Research
Seoul, Republic of Korea
Joined October 2023
🚨 Excited to share that our paper was accepted to #ACL2025 Findings 🎉 "When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR" Huge thanks to my amazing collaborators! 🙌 @jinyoung__kim @ohmyksh We propose
arxiv.org
Dense retrievers encode texts into embeddings to efficiently retrieve relevant documents from large databases in response to user queries. However, real-world corpora continually evolve, leading...
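The tweet and abstract don't spell out how GradNormIR works, so here is only a rough, hypothetical illustration of the general idea — using the gradient norm of a retrieval loss to flag documents a frozen retriever handles poorly. The linear "retriever", the cosine loss, and the in-/out-of-distribution documents below are toy stand-ins, not the paper's actual setup.

```python
import numpy as np

def embed(W, x):
    """Toy dense retriever: a single linear projection, L2-normalized."""
    v = W @ x
    return v / np.linalg.norm(v)

def loss(W, query, doc):
    """1 - cosine similarity between query and document embeddings."""
    return 1.0 - embed(W, query) @ embed(W, doc)

def grad_norm(W, query, doc, eps=1e-5):
    """Finite-difference gradient norm of the loss w.r.t. retriever weights.

    A large norm means the frozen retriever would need a big update to
    handle this document well -- a crude OOD signal, not GradNormIR itself.
    """
    g = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            Wp, Wm = W.copy(), W.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            g[i, j] = (loss(Wp, query, doc) - loss(Wm, query, doc)) / (2 * eps)
    return float(np.linalg.norm(g))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                        # frozen retriever weights
query = rng.normal(size=8)
in_dist_doc = query + 0.01 * rng.normal(size=8)    # near-duplicate of the query
ood_doc = rng.normal(size=8)                       # unrelated "new corpus" text

print(grad_norm(W, query, in_dist_doc))  # small: loss is near a stationary point
print(grad_norm(W, query, ood_doc))      # larger: retriever poorly fit here
```

In this toy setup the in-distribution document sits near a loss optimum, so its gradient norm is close to zero, while the unrelated document yields a clearly larger norm — the ordering one would threshold on to decide when an update is due.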
🧐 LLMs aren’t great at judging their own correctness. ❗But history across models helps! We present Generalized Correctness Models (GCMs), which learn to predict correctness from history, outperforming model-specific correctness models and the self-confidence of larger models.
🚨 Announcing Generalized Correctness Models (GCMs) 🚨Finding that LLMs have little self knowledge about their own correctness, we train an 8B GCM to predict correctness of many models, which is more accurate than training model-specific CMs, and outperforms a larger
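The thread doesn't include implementation details, so the following is only a loose sketch of the core intuition — predicting a model's correctness from pooled historical outcomes rather than from its own self-confidence. The smoothed per-topic frequency model and every name below are hypothetical; the paper trains an 8B neural correctness model, not this.

```python
from collections import defaultdict

def fit_correctness_model(history):
    """Fit a toy 'generalized' correctness predictor.

    history: list of (model, topic, correct) outcomes. Outcomes are pooled
    across models -- the crude analogue of using cross-model history.
    """
    stats = defaultdict(lambda: [0, 0])  # topic -> [num_correct, num_total]
    for model, topic, correct in history:
        stats[topic][0] += int(correct)
        stats[topic][1] += 1

    def predict(topic, alpha=1.0):
        """Laplace-smoothed probability that a model answers `topic` correctly."""
        c, n = stats[topic]
        return (c + alpha) / (n + 2 * alpha)

    return predict

history = [
    ("model_a", "algebra", True), ("model_a", "algebra", True),
    ("model_b", "algebra", True), ("model_b", "history", False),
    ("model_a", "history", False),
]
predict = fit_correctness_model(history)
print(predict("algebra"))  # high: pooled history says algebra usually goes right
print(predict("history"))  # low: pooled history says it usually goes wrong
```

Even this frequency baseline makes the point of the tweet concrete: the predictor never consults the answering model's own confidence, only the record of past outcomes.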
🎉🎉 Super proud to share that our work is accepted to #NeurIPS2025!! Huge thanks to all the co-authors. 👏👏
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
🎉Our "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" is accepted to #EMNLP2025 Main!🎉 We introduce a benchmark of 2D Flash adventure games (room escape, mystery/detective, visual novel, management) for full story completion. 🧵
Excited to share that our paper “ChartCap: Mitigating Hallucination of Dense Chart Captioning” has been accepted as an ICCV 2025 Highlight Poster! 📜 Paper: https://t.co/xuqRZjD2sr 🤗 Dataset: https://t.co/2vLRsTky3c 🔗 Project page (WIP):
[1/10] 💡New Paper Alert! CoCoT: Cognitive Chain-of-Thought Prompting for Socially Grounded Vision-Language Reasoning VLMs can see, but can they use perception to infer intent or make moral decisions? Despite recent progress, VLMs still struggle with socionormative reasoning, like
🚀 Heading to 🇦🇹 for #ACL2025NLP! Catch our MAC 🥷 poster at #ACL2025 @aclmeeting! Say hi 👋, and let’s talk about LLM + Multimodality! Open for a coffee chat anytime ☕💬🗣️ 🗓️ July 29 (Day 2, Main Conference) ⏰ 16:00-17:30 📍 Hall 4/5 | #3743 📄 https://t.co/Xsz896nv2C
🎉Our paper "Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates" is accepted to #ACL2025 Main!🎉 We introduce a benchmark for multimodal "deception" + LLM-based diversified attack. 🚀 Preprint coming soon!
[ACL 2025] Any-to-any models are often expected to be more coherent across modalities—since they handle image→text and text→image in one unified model. But does this hold up? We test it with ACON. 📄 Paper: https://t.co/5sDal7nx65 📷 data: https://t.co/wHQtAKaH3q
🔥 GUI agents struggle with real-world mobile tasks. We present MONDAY—a diverse, large-scale dataset built via an automatic pipeline that transforms internet videos into GUI agent data. ✅ VLMs trained on MONDAY show strong generalization ✅ Open data (313K steps) (1/7) 🧵 #CVPR
🚨New Paper Alert🚨 Excited to share our new video game benchmark, "Orak"! 🕹️ It was a thrilling experience to test whether LLM/VLM agents can solve real video games 🎮 Looking forward to continuing my research on LLM/VLM-based game agents with @Krafton_AI !
As a video gaming company, @Krafton_AI has secretly been cooking something big with @NVIDIAAI for a while! 🥳 We introduce Orak, the first comprehensive video gaming benchmark for LLMs! https://t.co/GYaSIrHBTE
When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR @dayoon12161 et al. introduce an unsupervised approach to detect when dense retrievers need updates. 📝 https://t.co/UjhWHJne2t 👨🏽💻 https://t.co/YswThJMAwi
github.com · dayoon-ko/gradnormir
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
🎉Our paper "Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates" is accepted to #ACL2025 Main!🎉 We introduce a benchmark for multimodal "deception" + LLM-based diversified attack. 🚀 Preprint coming soon!
Fantastic Paper from @GoogleDeepMind. Astute RAG enhances LLM performance by resolving conflicts between internal and external knowledge sources. Original Problem 🔍: RAG systems face challenges from imperfect retrieval, introducing irrelevant or misleading information.
After going to NAACL, ACL and #EMNLP2024 this year, here are a few tips I’ve picked up about attending *ACL conferences. Would love to hear any other tips if you have them! 🙂 1. This might be obvious, but I suggest showing everyone the same respect and interest regardless of
All set in Miami for #EMNLP2024 #EMNLP✈️ I'll be presenting DynamicER today at: 📅 November 12 (Tue) 🕓 16:00–17:30 📍 Poster Session C, Riverfront Hall I'm also applying to PhD programs this year—looking forward to connecting and chatting with everyone at #EMNLP2024!