
Tanya Goyal
@tanyaagoyal
2K Followers · 716 Following · 15 Media · 180 Statuses
NLP-ing @Cornell_CS (since Fall 2024). she/her
Austin, Texas
Joined September 2019
RT @MohitIyyer: GPT-5 lands first place on NoCha, our long-context book understanding benchmark. That said, this is a tiny improvement (~1…
Replies: 0 · Retweets: 11 · Likes: 0
RT @OwainEvans_UK: New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only…
Replies: 0 · Retweets: 1K · Likes: 0
RT @leqi_liu: What if you could understand and control an LLM by studying its *smaller* sibling? Our new paper proposes the Linear Represe…
Replies: 0 · Retweets: 15 · Likes: 0
RT @chrome1996: Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more…
Replies: 0 · Retweets: 29 · Likes: 0
RT @wzhao_nlp: It's time to think about code generation beyond functional correctness. Refactoring multiple libraries requires designing AP…
Replies: 0 · Retweets: 4 · Likes: 0
RT @ZEYULIU10: LLMs trained to memorize new facts can't use those facts well 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact…
Replies: 0 · Retweets: 65 · Likes: 0
RT @anmol_mekala: 📢 New Paper 📢 Struggling to fit in very long contexts on your LLM? Considering 4-bit quantization to 2x your context wind…
Replies: 0 · Retweets: 14 · Likes: 0
RT @PhilippeLaban: 🆕 paper: LLMs Get Lost in Multi-Turn Conversation. In real life, people don't speak in perfect prompts. So we simulate mu…
Replies: 0 · Retweets: 32 · Likes: 0
Check out Oliver's paper on learning new knowledge and resolving knowledge conflicts in LLMs! Surprising finding: conditioning on self-generated contexts during training gives massive performance gains! We are excited to extend these ideas to other domains!
🤯 GPT-4o knows H&M left Russia in 2022 but still recommends shopping at H&M in Moscow. 🤔 LLMs store conflicting facts from different times, leading to inconsistent responses. We dig into how to better update LLMs with fresh facts that contradict their prior knowledge. 🧵 1/6
Replies: 0 · Retweets: 4 · Likes: 22
RT @Oliver54244160: 🤯 GPT-4o knows H&M left Russia in 2022 but still recommends shopping at H&M in Moscow. 🤔 LLMs store conflicting facts…
Replies: 0 · Retweets: 10 · Likes: 0
RT @kabirahuja004: 📢 New Paper! Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning for…
Replies: 0 · Retweets: 51 · Likes: 0
RT @wzhao_nlp: Time to revisit our paper: Open community-driven evaluation platforms could be corrupted from a few sources of bad annotatio…
Replies: 0 · Retweets: 7 · Likes: 0
RT @brunchavecmoi: Can we generate long text from compressed KV cache? We find existing KV cache compression methods (e.g., SnapKV) degrade…
Replies: 0 · Retweets: 31 · Likes: 0
RT @_awettig: 🤔 Ever wondered how prevalent some type of web content is during LM pre-training? In our new paper, we propose WebOrganizer…
Replies: 0 · Retweets: 57 · Likes: 0
RT @srush_nlp: This year, I have an exceptional student on the academic market. Wenting Zhao (@wzhao_nlp) builds systems that reason in na…
Replies: 0 · Retweets: 65 · Likes: 0
Getting high-quality human annotations is always tricky, even for targeted domains/tasks. Check out @wzhao_nlp's work where we analyze how this manifests in open community data collection efforts with minimal quality checks by design.
Eval platforms like Chatbot Arena attract users to provide preference votes. But what are the incentives of these users? Are they apathetic, or are they adversarial and just aiming to inflate their model rankings? We show 10% adversarial votes change the model rankings by a lot!
Replies: 0 · Retweets: 3 · Likes: 23
RT @wzhao_nlp: Eval platforms like Chatbot Arena attract users to provide preference votes. But what are the incentives of these users? Are…
Replies: 0 · Retweets: 18 · Likes: 0
RT @niloofar_mire: I'm on the faculty market and at #NeurIPS! 👩‍🏫 I work on privacy, memorization, and emerging cha…
Replies: 0 · Retweets: 87 · Likes: 0
RT @jwthickstun: I am recruiting PhD students for Fall '25 at Cornell! I plan to admit multiple students interested in building more contro…
Replies: 0 · Retweets: 47 · Likes: 0