
Greg Durrett
@gregd_nlp
7K Followers · 3K Following · 84 Media · 1K Statuses
CS professor at UT Austin. Large language models and NLP. he/him
Joined December 2017
RT @lilyychenn: Are we fact-checking medical claims the right way? 🩺🤔 Probably not. In our study, even experts struggled to verify Reddit…
RT @dlwh: So about a month ago, Percy posted a version of this plot of our Marin 32B pretraining run. We got a lot of feedback, both public…
RT @percyliang: Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbba…
RT @xiye_nlp: There’s been hot debate about (The Illusion of) The Illusion of Thinking. My take: it’s not that models can’t reason — they j…
I'm excited about Leo's use of hypernetworks for data-efficient knowledge editing! Tweaking what a model learns from data is very powerful & useful for other goals like alignment. Haven't seen much other work building on MEND recently, but let me know what cool stuff we missed!
LLMs trained to memorize new facts can’t use those facts well. 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact propagation, improving accuracy by 2x on a challenging subset of RippleEdit! 💡 Our approach, PropMEND, extends MEND with a new objective for propagation.
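For readers who want the mechanics: the MEND family meta-learns a hypernetwork that transforms the raw fine-tuning gradient into an edited update, and PropMEND's twist is training that editor against a propagation objective. Below is a minimal, self-contained PyTorch sketch of that loop. The `GradientEditor` module, the toy linear model, and both losses are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class GradientEditor(nn.Module):
    """Hypothetical hypernetwork: maps raw gradient rows to edited rows."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, grad: torch.Tensor) -> torch.Tensor:
        return self.net(grad)

torch.manual_seed(0)
dim = 16
W = torch.randn(dim, dim, requires_grad=True)        # toy "model": y = W @ x
editor = GradientEditor(dim)
meta_opt = torch.optim.Adam(editor.parameters(), lr=1e-3)

fact_x, fact_y = torch.randn(dim), torch.randn(dim)  # the injected fact
prop_x, prop_y = torch.randn(dim), torch.randn(dim)  # a propagation query

for _ in range(200):
    # Inner step: raw gradient of the fact-injection loss w.r.t. W.
    inner_loss = (W @ fact_x - fact_y).pow(2).mean()
    (raw_grad,) = torch.autograd.grad(inner_loss, W, create_graph=True)
    # Apply the *edited* gradient, not the raw one, to get post-edit weights.
    W_edited = W - 1e-2 * editor(raw_grad)
    # Outer (meta) objective: the edited model should answer the propagation
    # query, not just parrot the injected fact.
    prop_loss = (W_edited @ prop_x - prop_y).pow(2).mean()
    meta_opt.zero_grad()
    prop_loss.backward()
    meta_opt.step()
```

MEND itself edits gradients of transformer weights and operates on a low-rank decomposition of the gradient for tractability; this toy keeps a full dense matrix purely for brevity.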
RT @ZEYULIU10: LLMs trained to memorize new facts can’t use those facts well. 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact…
RT @xiye_nlp: 🤔 Recent mech interp work showed that retrieval heads can explain some long-context behavior. But can we use this insight for…
RT @bespokelabsai: Understanding what’s in the data is a high-leverage activity when it comes to training/evaluating models and agents. Th…
RT @cmalaviya11: Ever wondered what makes language models generate overly verbose, vague, or sycophantic responses? Our new paper investig…
RT @_vaishnavh: 📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in cre…
RT @ryanmart3n: Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on…
RT @CosmicAI_Inst: CosmicAI collab: benchmarking the utility of LLMs in astronomy coding workflows & focusing on the key research capabilit…
RT @Asher_Zheng00: Language is often strategic, but LLMs tend to play nice. How strategic are they really? Probing into that is key for fut…
RT @XllmReasonPlan: 📢 Announcing 𝐭𝐡𝐞 𝐟𝐢𝐫𝐬𝐭 𝐰𝐨𝐫𝐤𝐬𝐡𝐨𝐩 𝐨𝐧 𝐭𝐡𝐞 𝐀𝐩𝐩𝐥𝐢𝐜𝐚𝐭𝐢𝐨𝐧 𝐨𝐟 𝐋𝐋𝐌 𝐄𝐱𝐩𝐥𝐚𝐢𝐧𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐭𝐨 𝐑𝐞𝐚𝐬𝐨𝐧𝐢𝐧𝐠 𝐚𝐧𝐝 𝐏𝐥𝐚𝐧𝐧𝐢𝐧𝐠 at @COLM_conf! We we…
CoT is effective for in-domain reasoning tasks, but Fangcong's work takes a nice step in improving compositional generalization of CoT reasoning. We teach models that atomic CoT skills fit together like puzzle pieces so they can then combine them in novel ways. Lots to do here!
Solving complex problems with CoT requires combining different skills. We can do this by:
🧩 Modifying the CoT data format to be “composable” with other skills
🔥 Training models on each skill
📌 Combining those models
This leads to better 0-shot reasoning on tasks involving skill composition!
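The thread names the three steps but not the combination operator. One common way to "combine those models", which may or may not be what the paper does, is task-vector merging: add each skill's fine-tuned weight delta back onto the base checkpoint. A minimal sketch under that assumption:

```python
import torch

def task_vector(base_state, skill_state):
    """Per-parameter delta introduced by fine-tuning on one skill."""
    return {k: skill_state[k] - base_state[k] for k in base_state}

def combine_skills(base_state, skill_states, alpha=1.0):
    """Add the (scaled) skill deltas onto the base weights."""
    merged = {k: v.clone() for k, v in base_state.items()}
    for skill_state in skill_states:
        for k, delta in task_vector(base_state, skill_state).items():
            merged[k] += alpha * delta
    return merged

# Tiny demo with stand-in "checkpoints" (real usage: model.state_dict()).
base = {"w": torch.zeros(3)}
skill_a = {"w": torch.tensor([1.0, 0.0, 0.0])}   # hypothetical skill-A weights
skill_b = {"w": torch.tensor([0.0, 2.0, 0.0])}   # hypothetical skill-B weights
print(combine_skills(base, [skill_a, skill_b]))  # {'w': tensor([1., 2., 0.])}
```

Per the thread, the interesting part is upstream of the merge: formatting each skill's CoT data so the learned reasoning traces compose cleanly at inference time.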
RT @fangcong_y10593: Solving complex problems with CoT requires combining different skills. We can do this by: 🧩 Modify the CoT data format…
Great to work on this benchmark with astronomers in our NSF-Simons CosmicAI institute! What I like about it:
(1) focus on data processing & visualization, a "bite-sized" AI4Sci task (not automating all of research)
(2) eval with VLM-as-a-judge (possible with strong, modern VLMs)
How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵
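For concreteness, here is the rough shape of a VLM-as-a-judge check on a generated visualization. `call_vlm` is a placeholder for whatever multimodal API you use, and the rubric and JSON schema are invented for illustration; AstroVisBench's actual judging prompt will differ.

```python
import base64
import json

RUBRIC = (
    "You are judging whether a candidate plot conveys the same scientific "
    "information as the reference plot. Reply with JSON: "
    '{"match": true or false, "reason": "..."}'
)

def encode_image(path: str) -> str:
    """Base64-encode an image file for a multimodal API call."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

def judge_plot(call_vlm, reference_png: str, candidate_png: str) -> dict:
    """Ask a judge VLM to compare a candidate plot against the reference."""
    reply = call_vlm(
        prompt=RUBRIC,
        images=[encode_image(reference_png), encode_image(candidate_png)],
    )
    return json.loads(reply)  # e.g. {"match": False, "reason": "wrong axis scale"}
```

The appeal of this setup is exactly what the tweet notes: judging whether two plots convey the same science is a fuzzy visual comparison that only recently became feasible to automate with strong VLMs.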
RT @kanishkamisra: News 🗞️ I will return to UT Austin as an Assistant Professor of Linguistics this fall, and join its vibrant community of…
RT @davidbau: Dear MAGA friends, I have been worrying about STEM in the US a lot, because right now the Senate is writing new laws that cu…