
Eunsol Choi
@eunsolc
Followers: 6K · Following: 5K · Media: 3 · Statuses: 133
on natural language processing / machine learning. assistant prof at @NYUDataScience @NYU_Courant, prev @UTCompSci @googleai, @uwcse, @Cornell.
Joined September 2016
My lab will move to @NYUDataScience and @NYU_Courant this Fall! I’m excited to connect with amazing researchers at @CILVRatNYU and the larger ML/NLP community in NYC. I will be recruiting students this cycle at NYU. Happy to be back to the city 🗽 on the east coast as well. I had a.
CDS welcomes Eunsol Choi (@eunsolc) as an Assistant Professor of Computer Science (@NYU_Courant) and Data Science! Her research focuses on advancing how computers interpret human language in real-world contexts.
54
48
523
Knowledge propagation in LLMs is notoriously challenging. Check out our paper, which improves it substantially by training a hypernetwork to target knowledge propagation!
LLMs trained to memorize new facts can’t use those facts well 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact propagation, improving accuracy by 2x on a challenging subset of RippleEdit! 💡 Our approach, PropMEND, extends MEND with a new objective for propagation.
1
8
94
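A very rough sketch of the gradient-editing idea described above: a MEND-style hypernetwork transforms the raw fine-tuning gradient before it is applied as a weight edit. The layer size, editor architecture, and dummy loss below are illustrative assumptions, not PropMEND's implementation (real MEND-style editors operate on low-rank factors of the gradient).

```python
import torch
import torch.nn as nn

# Illustrative only: a tiny "layer" to edit and a hypernetwork that maps its
# raw gradient to an edited gradient (a flattened MLP stands in for the
# low-rank gradient editor used by MEND-style methods).
d = 16
weight = torch.randn(d, d, requires_grad=True)
gradient_editor = nn.Sequential(
    nn.Linear(d * d, 128), nn.ReLU(), nn.Linear(128, d * d)
)

# Gradient of some fact-memorization loss w.r.t. the weight (dummy loss here).
loss = (weight.sum() - 1.0) ** 2
(raw_grad,) = torch.autograd.grad(loss, weight)

# The hypernetwork rewrites the gradient; the edit applies the rewritten update.
edited_grad = gradient_editor(raw_grad.flatten()).view(d, d)
with torch.no_grad():
    weight -= 1e-2 * edited_grad
```

Per the tweet, the new ingredient in PropMEND is the training objective for the editor: the edited update should help the model answer questions that depend on the injected fact, not just recall it verbatim.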
RT @hungting_chen: I will be presenting this work at NAACL 2025! More specifically at 5pm on Thursday (May 1st), at Ruidoso. (Session IIS 1….
0
8
0
RT @thomlake: Interested in how alignment changes the response distribution defined by LLMs? Come check out my poster at 2 PM at #NAACL2025….
0
6
0
Please check out Michael's #ICLR2025 poster on training LLMs to ask clarifying questions. LLMs are eager to answer immediately, even when the input is ambiguous. We simulate future turns and then assign rewards based on them, teaching LLMs to see the value of asking clarifying questions.
Can your LLM ask clarifying questions for ambiguous prompts? Come see our #ICLR25 poster this afternoon where I’ll chat about why RLHF'd LMs fail to ask clarifying questions to resolve ambiguity, and why they often confidently respond to only one interpretation instead. More ⬇️
0
3
40
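A toy numeric illustration of the "simulate future turns, then assign rewards" recipe described in the posts above. The interpretation probabilities, quality scores, and turn cost are made-up assumptions; the point is only to show why rollouts make asking a clarifying question look valuable to the policy.

```python
# An ambiguous prompt with two equally likely interpretations.
interpretations = ["meaning A", "meaning B"]

# Answering immediately commits to one interpretation, so it is right only
# half the time (quality 1.0 if the interpretation matches, 0.0 otherwise).
expected_reward_direct = 0.5 * 1.0 + 0.5 * 0.0

# Asking a clarifying question: the simulated future turn resolves the
# ambiguity, so the follow-up answer is correct, minus a small cost for the
# extra turn (all numbers are assumptions for illustration).
turn_cost = 0.1
expected_reward_ask = 1.0 - turn_cost

print(expected_reward_direct, expected_reward_ask)  # 0.5 vs 0.9 -> asking wins
```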
Would LLMs think "Houston, Austin, Dallas" is sampled from "cities in Texas" rather than "cities in the US"? I really enjoyed our work exploring how LLMs reason about these suspicious coincidences!
Are LMs sensitive to suspicious coincidences? Our paper finds that, when given access to knowledge of the hypothesis space, LMs can show sensitivity to such coincidences, displaying parallels with human inductive reasoning. w/@kanishkamisra, @kmahowald, @eunsolc
0
3
30
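For readers unfamiliar with "suspicious coincidences": the phrase comes from Bayesian accounts of inductive reasoning, where the size principle makes a small consistent hypothesis ("cities in Texas") beat a large one ("cities in the US") once several examples all happen to fall inside it. A toy calculation, with made-up city counts (not numbers from the paper):

```python
# Toy size-principle calculation; hypothesis sizes are illustrative assumptions.
n_texas_cities = 50      # assumed size of "cities in Texas"
n_us_cities = 5000       # assumed size of "cities in the US"
observations = ["Houston", "Austin", "Dallas"]   # consistent with both hypotheses

# Under strong sampling, each example is drawn uniformly from the hypothesis,
# so its likelihood is 1 / |hypothesis| and small hypotheses win exponentially.
def likelihood(hypothesis_size, n_obs):
    return (1.0 / hypothesis_size) ** n_obs

prior = 0.5  # uniform prior over the two hypotheses
p_texas = prior * likelihood(n_texas_cities, len(observations))
p_us = prior * likelihood(n_us_cities, len(observations))

# Three Texan cities in a row are a suspicious coincidence under "cities in
# the US", so the posterior concentrates on "cities in Texas".
print(p_texas / (p_texas + p_us))   # ~0.999999
```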
RT @COLM_conf: We are receiving repeated questions about the double submission policy in relation to the abstract deadline. Our FAQ addres….
0
2
0
Can we generate speech that aligns with abstract, rich style tags (e.g., confused, authoritative)? Anuj's new work takes a step towards it through careful data augmentation!
Introducing ParaSpeechCaps, our large-scale style captions dataset that enables rich, expressive control for text-to-speech models! Beyond basic pitch or speed controls, our models can generate speech that sounds "guttural", "scared", "whispered" and more; 59 style tags in total.
0
2
17
RT @COLM_conf: Excited to announce our 2025 keynote speakers: @cosmo_shirley, Nicholas Carlini, @LukeZettlemoyer, and Tom Griffiths! https:….
0
14
0
When using LLM-as-a-judge, practitioners often use greedy decoding to get the most likely judgment. But we found that deriving a score from the judgment distribution (like taking the mean) consistently outperforms greedy decoding. Check out @victorwang37's thorough study!
LLM judges have become ubiquitous, but valuable signal is often ignored at inference. We analyze design decisions for leveraging judgment distributions from LLM-as-a-judge 🧵 w/ @mjqzhang @eunsolc!
0
1
46
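A minimal, self-contained illustration of the point above: instead of keeping only the judge's most likely rating (greedy decoding), derive a score from the full judgment distribution, e.g. its mean. The probabilities below are invented; in practice they would come from the judge's token probabilities over the rating vocabulary.

```python
# Hypothetical probabilities an LLM judge assigns to rating tokens 1..5
# for some response (made-up numbers for illustration).
judgment_probs = {1: 0.05, 2: 0.10, 3: 0.35, 4: 0.30, 5: 0.20}

# Greedy decoding keeps only the single most likely rating.
greedy_score = max(judgment_probs, key=judgment_probs.get)   # -> 3

# Using the distribution: take the expected rating (the mean).
mean_score = sum(r * p for r, p in judgment_probs.items())   # -> 3.5

print(greedy_score, mean_score)
```

The mean keeps the probability mass sitting above and below the argmax, which is exactly the inference-time signal the tweet says greedy decoding throws away.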
Check out our new paper on KV compression for long text generation. Key insight: the small KV cache needs to be refreshed occasionally!
Can we generate long text from a compressed KV cache? We find existing KV cache compression methods (e.g., SnapKV) degrade rapidly in this setting. We present 𝐑𝐞𝐟𝐫𝐞𝐬𝐡𝐊𝐕, an inference method which ♻️ refreshes the smaller KV cache and better preserves performance.
1
6
65
It was fun exploring adding in-context examples to retrieval (text embedding) models with @atu_tej @yoonsang_ @sujaysanghavi! It doesn't work out of the box as magically as it does with LLMs, but in-context examples can help after fine-tuning.
Introducing RARe: Retrieval Augmented Retrieval with In-Context Examples! 1/ Can retrieval models be trained to use in-context examples like LLMs? 🤔 Our preprint answers yes, showing up to +2.72% nDCG on open-domain retrieval benchmarks! 🧵 w/ @yoonsang_ @sujaysanghavi @eunsolc
0
1
36
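A rough sketch of what "in-context examples for a retrieval model" could look like: prepend (query, relevant document) demonstrations to the query string before encoding it. The formatting template and the sentence-transformers model below are stand-in assumptions, not the paper's setup, and as the tweet notes this only helps once the retriever is fine-tuned on such inputs.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

# One (query, relevant document) demonstration plus the actual query.
in_context_examples = [
    ("who wrote pride and prejudice",
     "Pride and Prejudice is a novel by Jane Austen."),
]
query = "who painted the starry night"

# Hypothetical formatting template for the example-augmented query.
prompt = ""
for ex_q, ex_d in in_context_examples:
    prompt += f"query: {ex_q} document: {ex_d} "
prompt += f"query: {query}"

corpus = [
    "The Starry Night is an oil painting by Vincent van Gogh.",
    "The Great Wall of China is a series of fortifications.",
]
print(util.cos_sim(model.encode(prompt), model.encode(corpus)))
```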
We studied retrieval diversity on subjective questions with different types of corpora (Wikipedia, a web snapshot, search results)! This project made me think a lot about the future of retrieval system evaluations.
🚨New Paper🚨: We introduce BERDS, a BEnchmark for Retrieval Diversity, for subjective questions. We collect subjective questions with diverse perspectives and develop evaluation metrics to measure retrieval diversity in an open-world setting. Work done w/ @eunsolc! 🧵
0
6
63
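One simple way to think about a retrieval-diversity metric of the kind described above is perspective coverage: the fraction of known perspectives supported by at least one retrieved document. The matcher below is a crude substring check purely for illustration; BERDS' actual metric definitions and perspective detectors may differ.

```python
def coverage(retrieved_docs, perspectives, supports):
    """Fraction of perspectives supported by at least one retrieved document."""
    covered = sum(any(supports(doc, p) for doc in retrieved_docs)
                  for p in perspectives)
    return covered / len(perspectives)

# Toy example with a subjective question about school uniforms.
perspectives = ["uniforms improve focus", "uniforms limit expression"]
retrieved = [
    "Several studies argue uniforms improve focus and reduce distraction.",
    "A pricing guide for retailers selling uniforms.",
]
# Crude stand-in for a real perspective detector (e.g., an NLI model or LLM judge).
supports = lambda doc, perspective: perspective.split()[-1] in doc.lower()
print(coverage(retrieved, perspectives, supports))   # 0.5 -> one perspective covered
```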
RT @NYUDataScience: CDS Faculty Fellow opening: Seeking interdisciplinary faculty fellows in ML, cognitive science, theory, responsible AI….
0
18
0
RT @ZayneSprague: To CoT or not to CoT? 🤔 300+ experiments with 14 LLMs & systematic meta-analysis of 100+ recent papers. 🤯 Direct answering….
0
67
0
RT @ManlingLi_: Tomorrow is the day! We cannot wait to see you at #ACL2024 @aclmeeting Knowledgeable LMs workshop! Super excited for keyno….
0
16
0
RT @mina1004h: VLMs can generate long-form answers to visual questions (LFVQA). What information do these long-form answers contain? How ca….
0
24
0
RT @brunchavecmoi: 🥝KIWI at #ACL2024: Check out our poster and talk to my collaborators @kylelostat @soldni!
0
6
0
RT @yoonsang_: Accepted at @COLM_conf with scores of 9/8/7/6 🎉 We show current LMs struggle to handle multiple documents featuring confusi….
0
5
0
RT @hungting_chen: Our paper has been accepted by @COLM_conf 🎉! Our analysis reveals behaviors of LMs when generating long-form answers with….
0
4
0