Dongkeun Yoon

@dongkeun_yoon

Followers: 464 · Following: 493 · Media: 17 · Statuses: 170

PhD student @kaist_ai. Researching multilinguality in LLMs.

Joined March 2022
@dongkeun_yoon
Dongkeun Yoon
1 month
🎉🎉 Super proud to share that our work is accepted to #NeurIPS2025!! Huge thanks to all the co-authors. 👏👏
@dongkeun_yoon
Dongkeun Yoon
5 months
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
2
3
31
@seungonekim
Seungone Kim
12 days
We are gathering problems to build a challenging math benchmark (collaboration between @AiEleuther and @withmsit). The compensation per problem is up to ~$3,623 and the due date is Nov 10th! https://t.co/TdUG5xvTr2
@AiEleuther
EleutherAI
12 days
We are announcing an opportunity for paid question writers to contribute to a new PhD-level math benchmark. Accepted contributors will be paid per question and will be invited to be authors on the resulting dataset paper. Check out the link below for more information!
0
2
22
@alisawuffles
Alisa Liu
1 month
Every LM needs a way of encoding data, and any choice of encoding is a design choice. When using bytes, you borrow choices from the makers of UTF-8, and there’s generally no reason to believe that the most common encoding on the internet is also the best one for language modeling.
@linguist_cat
Catherine Arnett
1 month
I have a new blog post about the so-called “tokenizer-free” approach to language modeling and why it’s not tokenizer-free at all. I also talk about why people hate tokenizers so much!
2
8
91
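The design-choice point above is easy to see concretely: the same amount of text costs a very different number of bytes depending on the script. A minimal illustration (the example strings are arbitrary, not from the post):

```python
# UTF-8 uses 1 byte for ASCII, 2 bytes for most accented Latin, Greek, and
# Cyrillic letters, and 3 bytes for most CJK and Hangul characters, so a
# byte-level model sees very different sequence lengths for comparable content.
samples = {
    "English": "Language models read bytes.",
    "Korean": "언어 모델은 바이트를 읽는다.",
    "Japanese": "言語モデルはバイトを読む。",
}

for name, text in samples.items():
    encoded = text.encode("utf-8")
    print(f"{name}: {len(text)} chars -> {len(encoded)} bytes")
```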
@alisawuffles
Alisa Liu
3 months
If you're at ACL, join us for our tutorial on Synthetic Data in the Era of LLMs with @vijaytarian @xiangyue96 @yizhongwyz @gneubig!! 🕑 2pm - 5:30pm 📍 Hall B
4
14
121
@hyunji_amy_lee
hyunji amy lee
4 months
🥳Excited to share that I’ll be joining @unccs as a postdoc this fall. Looking forward to working with @mohitban47 & the amazing students at @unc_ai_group. I'll continue working on retrieval, aligning knowledge modules with LLMs' parametric knowledge, and expanding to various modalities.
20
31
161
@RicardoRei7
Ricardo Rei
4 months
🚀 Tower+: our latest model in the Tower family — sets a new standard for open-weight multilingual models! We show how to go beyond sentence-level translation, striking a balance between translation quality and general multilingual capabilities. 1/5 https://t.co/WKQapk31c0
1
9
25
@zmprcp
José Maria Pombal
4 months
Check out the latest iteration of Tower models, Tower+. Ideal for translation tasks and beyond, and available at three different scales: 2B, 9B, 72B. All available on huggingface: https://t.co/XWJqTeht7R Kudos to everyone involved!
huggingface.co
@RicardoRei7
Ricardo Rei
4 months
🚀 Tower+: our latest model in the Tower family — sets a new standard for open-weight multilingual models! We show how to go beyond sentence-level translation, striking a balance between translation quality and general multilingual capabilities. 1/5 https://t.co/WKQapk31c0
0
1
10
@hyunji_amy_lee
hyunji amy lee
4 months
🚨 Want models to better utilize and ground on the provided knowledge? We introduce Context-INformed Grounding Supervision (CINGS)! Training LLMs with CINGS significantly boosts grounding abilities in both text and vision-language models compared to standard instruction tuning.
2
47
125
@soheeyang_
Sohee Yang
5 months
🚨 New Paper 🧵 How effectively do reasoning models reevaluate their thoughts? We find that: - Models excel at identifying unhelpful thoughts but struggle to recover from them - Smaller models can be more robust - Self-reevaluation ability is far from true meta-cognitive awareness
3
27
131
@dayoon12161
Dayoon Ko
5 months
🚨 Excited to share that our paper was accepted to #ACL2025 Findings 🎉 "When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR" Huge thanks to my amazing collaborators! 🙌 @jinyoung__kim @ohmyksh We propose
arxiv.org
Dense retrievers encode texts into embeddings to efficiently retrieve relevant documents from large databases in response to user queries. However, real-world corpora continually evolve, leading...
0
7
38
@Yunjae_Won_
Yunjae Won
5 months
[1/6] Ever wondered why Direct Preference Optimization is so effective for aligning LLMs? 🤔 Our new paper dives deep into the theory behind DPO's success, through the lens of information gain. Paper: "Differential Information: An Information-Theoretic Perspective on Preference
5
22
67
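For readers who want the baseline being analyzed: the standard DPO objective compares policy and reference log-probabilities of the chosen versus rejected responses. A minimal PyTorch sketch of that textbook loss, not the paper's information-theoretic analysis:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit rewards are beta-scaled policy-minus-reference log-probabilities.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage: random log-probabilities for a batch of 4 preference pairs.
logps = [torch.randn(4) for _ in range(4)]
print(dpo_loss(*logps).item())
```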
@shafayat_sheikh
Sheikh Shafayat
5 months
Check out our latest work on self-improving LLMs, where we try to see if LLMs can utilize their internal self-consistency as a reward signal to bootstrap themselves using RL. TL;DR: they can, to some extent, but then end up reward hacking the self-consistency objective. We try to see
4
27
142
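To make the setup above concrete, one common way to turn self-consistency into a reward is to sample several answers per question and reward agreement with the majority. A hedged sketch of that generic idea, not necessarily the paper's exact reward:

```python
from collections import Counter

def self_consistency_reward(sampled_answers):
    # Reward each sample by the fraction of samples that agree with it.
    # This is exactly the kind of signal a policy could learn to game,
    # e.g. by collapsing onto one easy, highly consistent answer.
    counts = Counter(sampled_answers)
    n = len(sampled_answers)
    return [counts[a] / n for a in sampled_answers]

# Toy usage: 5 samples for one question, 4 of them agree.
print(self_consistency_reward(["42", "42", "42", "17", "42"]))
# -> [0.8, 0.8, 0.8, 0.2, 0.8]
```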
@ronalhwang
Hyeonbin Hwang
5 months
🚨 New Paper co-led with @bkjeon1211 🚨 Q. Can we adapt Language Models, trained to predict the next token, to reason at the sentence level? I think LMs operating at a higher level of abstraction would be a promising path towards advancing their reasoning, and I am excited to share our
4
43
168
@hoyeon_chang
Hoyeon Chang
5 months
New preprint 📄 (with @jinho___park) Can neural nets really reason compositionally, or just match patterns? We present the Coverage Principle: a data-centric framework that predicts when pattern-matching models will generalize (validated on Transformers). 🧵👇
2
31
126
@gson_AI
arlo_son
5 months
Imagine you’re collaborating with an AI co-scientist: you ask it to proofread your manuscript and flag any errors. Which LLM would you choose? 🤔 We evaluated the new Claude 4 models on SPOT. It looks like o3 is still the best model for this.
2
5
8
@chaechaek1214
Chaeeun Kim
5 months
❓What if your RAG didn’t need a separate retrieval model at all? We present 🧊FREESON, a new framework for retriever-FREE retrieval-augmented reasoning. With FREESON, a single LRM acts as both generator and retriever, shifting the focus from seq2seq matching to locating
1
5
29
@smellslikeml
Smells Like ML
5 months
@dongkeun_yoon Congrats to the team for this fantastic work! Had a chance to try the code on my reasoning VLM and found consistent results. https://t.co/UcwzXoxuZM
@smellslikeml
Smells Like ML
5 months
Tried out the code using SpaceThinker-Qwen2.5-VL-3B. Plots indicate a steady increase in accuracy and confidence, and a reduction in calibration error, as CoT length increases. Fitting TriviaQA with linear regression, slopes: 0.042, -0.034, -0.02, 0.011, all statistically significant.
1
1
1
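The slopes quoted above are the kind of statistic you get from regressing each metric on CoT length; a minimal sketch with scipy, using made-up placeholder arrays rather than the reported data:

```python
import numpy as np
from scipy.stats import linregress

# Placeholder data: per-example CoT length vs. one metric
# (accuracy, verbalized confidence, or calibration error).
cot_lengths = np.array([50, 120, 200, 350, 500, 800])
metric = np.array([0.41, 0.45, 0.48, 0.55, 0.58, 0.66])

fit = linregress(cot_lengths, metric)
print(f"slope={fit.slope:.4f}, p-value={fit.pvalue:.4f}")  # p < 0.05 -> "statsig"
```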
@dongkeun_yoon
Dongkeun Yoon
5 months
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
9
49
302
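A standard way to quantify the overconfidence gap discussed in this thread is expected calibration error over verbalized confidences. A minimal sketch of that generic metric, assumed here for illustration rather than taken from the paper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # Bin predictions by stated confidence, then average |confidence - accuracy|
    # within each bin, weighted by bin size. Lower means better calibrated.
    confidences, correct = np.asarray(confidences), np.asarray(correct)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece

# Toy usage: a model that says "0.9" while being right half the time is overconfident.
print(expected_calibration_error([0.9, 0.9, 0.6, 0.6], [1, 0, 1, 0]))
```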
@fly51fly
fly51fly
5 months
[CL] Reasoning Models Better Express Their Confidence D Yoon, S Kim, S Yang, S Kim... [KAIST & CMU & UCL] (2025) https://t.co/w7bwk1voj8
0
5
19
@CShorten30
Connor Shorten
5 months
@dongkeun_yoon 🔥🔥🔥
1
1
1