Yating Wu Profile
Yating Wu

@YatingWu96

Followers
257
Following
356
Media
8
Statuses
94

ECE Ph.D. student @ UT Austin, advised by @jessyjli and @AlexGDimakis | EECS Rising Stars 2025

Austin, TX
Joined January 2016
@YatingWu96
Yating Wu
2 years
LLMs can mimic human curiosity by generating open-ended inquisitive questions given some context, similar to how humans wonder when they read. But which ones are most important to answer?🤔 We predict the salience of questions, substantially outperforming GPT-4.🌟 🧵1/5
4
25
133
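For readers curious how salience prediction like this could be set up, here is a minimal sketch that frames it as regression over (context, question) pairs with a fine-tuned sequence-classification head. The checkpoint name and the example inputs are placeholders, not the paper's released model.

```python
# Minimal sketch: score the salience of an inquisitive question given its context.
# "your-org/salience-regressor" is a hypothetical checkpoint name, not the
# model released with the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "your-org/salience-regressor"  # placeholder fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)

def salience_score(context: str, question: str) -> float:
    """Return a scalar salience score for a question asked about a context."""
    inputs = tokenizer(context, question, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 1) for a regression head
    return logits.item()

context = "The company reported a sudden drop in quarterly revenue."
print(salience_score(context, "What caused the drop in revenue?"))
print(salience_score(context, "What font was used in the report?"))
```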
@ZEYULIU10
Leo Liu
5 months
LLMs trained to memorize new facts can’t use those facts well.🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact propagation, improving accuracy by 2x on a challenging subset of RippleEdit!💡 Our approach, PropMEND, extends MEND with a new objective for propagation.
5
72
198
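As a rough illustration of the gradient-editing idea in the tweet above, the toy below runs one knowledge-edit step in which a small hypernetwork transforms the raw weight gradient before it is applied. The GradEditor architecture, shapes, and learning rate are illustrative assumptions, not PropMEND's actual design; per the tweet, the real hypernetwork is trained with a propagation objective.

```python
# Toy sketch of hypernetwork-edited gradients for knowledge editing.
# GradEditor below is an illustrative stand-in, not PropMEND's architecture.
import torch
import torch.nn as nn

class GradEditor(nn.Module):
    """Tiny hypernetwork: maps a raw weight gradient to an edited gradient."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.row_map = nn.Linear(in_dim, in_dim, bias=False)    # mixes within rows
        self.col_map = nn.Linear(out_dim, out_dim, bias=False)  # mixes within columns

    def forward(self, grad: torch.Tensor) -> torch.Tensor:
        # grad: (out_dim, in_dim), e.g. the gradient of a linear layer's weight.
        edited = self.row_map(grad)
        edited = self.col_map(edited.T).T
        return edited

# One edit step on a single linear layer, driven by a new fact's loss.
layer = nn.Linear(16, 16)
editor = GradEditor(in_dim=16, out_dim=16)

x = torch.randn(4, 16)        # toy encodings of the new fact
target = torch.randn(4, 16)   # toy desired outputs after the edit
loss = nn.functional.mse_loss(layer(x), target)
(raw_grad,) = torch.autograd.grad(loss, layer.weight)

with torch.no_grad():
    layer.weight -= 1e-2 * editor(raw_grad)  # apply the *edited* gradient
```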
@Asher_Zheng00
Asher Zheng
6 months
Language is often strategic, but LLMs tend to play nice. How strategic are they really? Probing into that is key for future safety alignment.🛟 👉Introducing CoBRA🐍, a framework that assesses strategic language. Work with my amazing advisors @jessyjli and @David_Beaver! 🧵👇
2
10
21
@sebajoed
Sebastian Joseph
6 months
How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵
1
8
23
@SashaBoguraev
Sasha Boguraev
6 months
A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models. New work with @kmahowald and @ChrisGPotts! 🧵👇
2
30
99
@LiyanTang4
Liyan Tang
6 months
Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts! ✍🏻Entirely human-written questions by 13 CS researchers 👀Emphasis on visual reasoning – hard to verbalize via text CoTs 📉Humans reach 93%, but Gemini-2.5-Pro gets 63% and Qwen2.5-72B gets 38%
2
34
78
@ramya_namuduri
Ramya Namuduri
7 months
Have that eerie feeling of déjà vu when reading model-generated text 👀, but can’t pinpoint the specific words or phrases 👀? ✨We introduce QUDsim, to quantify discourse similarities beyond lexical, syntactic, and content overlap.
1
18
43
@brunchavecmoi
Fangyuan Xu
9 months
Can we generate long text from a compressed KV cache? We find existing KV cache compression methods (e.g., SnapKV) degrade rapidly in this setting. We present 𝐑𝐞𝐟𝐫𝐞𝐬𝐡𝐊𝐕, an inference method that ♻️ refreshes the smaller KV cache and better preserves performance.
3
35
114
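A rough sketch of the alternating pattern the tweet describes: decode with a small active KV cache, but periodically refresh it by re-selecting entries from the full cache. The top-k-by-score selection rule and all shapes below are illustrative assumptions, not the paper's exact procedure.

```python
# Rough sketch: decode with a small KV cache, periodically refreshed from the
# full cache. Selection by key-query score is an illustrative assumption.
import torch

torch.manual_seed(0)
d = 8                                   # toy head dimension
full_k, full_v = [], []                 # full cache (kept, but not attended every step)
active = []                             # indices currently in the small cache
SMALL, REFRESH_EVERY = 16, 8

query = torch.randn(d)
for t in range(64):
    new_k, new_v = torch.randn(d), torch.randn(d)
    full_k.append(new_k); full_v.append(new_v)
    active.append(t)                    # the newest token always enters the small cache

    if t % REFRESH_EVERY == 0 and t >= SMALL:
        # Refresh step: score *all* cached keys against the current query and
        # keep only the top-SMALL entries in the active cache.
        scores = torch.stack(full_k) @ query
        active = scores.topk(SMALL).indices.tolist()
    elif len(active) > SMALL:
        active = active[-SMALL:]        # otherwise just keep the most recent entries

    # Regular decoding step: attend only over the small active cache.
    k_act = torch.stack([full_k[i] for i in active])
    v_act = torch.stack([full_v[i] for i in active])
    attn = torch.softmax(k_act @ query / d ** 0.5, dim=0)
    context = attn @ v_act              # toy attention output for this step
```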
@jessyjli
Jessy Li
9 months
🌟Job ad🌟 We (@gregd_nlp, @mattlease and I) are hiring a postdoc fellow within the CosmicAI Institute, to do galactic work with LLMs and generative AI! If you would like to push the frontiers of foundation models to help solve mysteries of the universe, please apply!
@CosmicAI_Inst
CosmicAI
9 months
Seeking candidates for a postdoctoral position with the Explorable Universe research group to develop next-generation generative AI copilots & agents that aid astronomy research. Info here https://t.co/vU0gzWqxYz
1
25
70
@jantrienes
Jan Trienes
9 months
Do you want to know what information LLMs prioritize in text synthesis tasks? Here's a short 🧵 about our new paper: an interpretable framework for salience analysis in LLMs. First of all, information salience is a fuzzy concept. So how can we even measure it?
1
9
25
@HongliZhan
Hongli Zhan ✈️ ICML (on the job market)
9 months
Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead? Introducing SPRI, a system that produces principles tailored to each query, with minimal to no human effort. [1/5]
1
12
32
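A two-stage sketch of the pattern SPRI describes, generating principles tailored to the query and then conditioning the answer on them. `complete` is a hypothetical stand-in for any LLM call, and the prompts are illustrative, not the paper's templates.

```python
# Sketch: query-specific principles guiding a response (illustrative, not SPRI's code).
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM client here")

def respond_with_situated_principles(query: str) -> str:
    # Stage 1: derive a few principles tailored to this specific query.
    principles = complete(
        "List three concise principles a careful assistant should follow when "
        f"answering this query:\n\n{query}"
    )
    # Stage 2: answer the query while conditioning on those principles.
    return complete(
        f"Principles to follow:\n{principles}\n\nQuery: {query}\n\nAnswer:"
    )
```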
@xiye_nlp
Xi Ye
1 year
🔔 I'm recruiting multiple fully funded MSc/PhD students @UAlberta for Fall 2025! Join my lab working on NLP, especially reasoning and interpretability (see my website for more details about my research). Apply by December 15!
15
161
524
@AlexGDimakis
Alex Dimakis
1 year
https://t.co/bcckgrTPOB I’m excited to introduce Evalchemy 🧪, a unified platform for evaluating LLMs. If you want to evaluate an LLM, you may want to run popular benchmarks on your model, like MTBench, WildBench, RepoBench, IFEval, AlpacaEval, etc., as well as standard pre-training
9
44
237
@YatingWu96
Yating Wu
1 year
I'm thrilled to announce our paper "Which questions should I answer? Salience Prediction of Inquisitive Questions" has won an Outstanding Paper Award at EMNLP 2024🥳🥳. Thank you so much to my amazing co-authors and advisors!!! @ritikarmangla, @AlexGDimakis, @gregd_nlp, @jessyjli
7
9
103
@YatingWu96
Yating Wu
1 year
Correction: This will be happening tomorrow - Nov 13 (Wed) from 11:15 to 11:30 AM ET in conference room Flagler.
@YatingWu96
Yating Wu
1 year
✨ Exciting news! I’ll be presenting our work at #EMNLP2024 in an oral session on Nov 13 (Wed) from 10:15 to 10:30 AM (Session 6). Come say hi!
0
1
8
@YatingWu96
Yating Wu
1 year
✨ Exciting news! I’ll be presenting our work at #EMNLP2024 in an oral session on Nov 13 (Wed) from 10:15 to 10:30 AM (Session 6). Come say hi!
@YatingWu96
Yating Wu
2 years
LLMs can mimic human curiosity by generating open-ended inquisitive questions given some context, similar to how humans wonder when they read. But which ones are most important to answer?🤔 We predict the salience of questions, substantially outperforming GPT-4.🌟 🧵1/5
1
5
15
@giannis_daras
Giannis Daras
1 year
Why are there so many different methods for using diffusion models for inverse problems? 🤔 And how do these methods relate to each other? In this survey, we review more than 35 different methods and we attempt to unify them into common mathematical formulations.
13
102
577
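One recurring formulation in this area (stated here as a generic summary rather than the survey's own notation) writes posterior sampling for an inverse problem y ≈ A(x) as the unconditional score plus a measurement-likelihood term, with the intractable likelihood approximated through a denoised estimate of x_0:

```latex
% Generic posterior-score decomposition shared by many diffusion-based
% inverse-problem methods (a summary, not the survey's notation).
\nabla_{x_t} \log p_t(x_t \mid y)
  = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t),
\qquad
\nabla_{x_t} \log p_t(y \mid x_t) \;\approx\;
  \nabla_{x_t} \log p\!\left(y \mid \hat{x}_0(x_t)\right),
\quad \hat{x}_0(x_t) = \mathbb{E}[x_0 \mid x_t].
```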
@SashaBoguraev
Sasha Boguraev
1 year
🎩“The math of the people, by the people, for the people, shall not perish from our models” AI math systems often abstract away from language by augmenting LLMs with symbolic solvers and logical systems. While promising, is something lost? 🧵
2
11
30
@AlexGDimakis
Alex Dimakis
1 year
One of the big problems in AI is that the systems often hallucinate. What does that mean exactly, and how do we mitigate this problem, especially for RAG systems? 1. Hallucinations and Factuality: Factuality refers to the quality of being based on generally accepted facts. For
2
9
40
@kanishkamisra
Kanishka Misra 🌊
1 year
🧐🔡🤖 Can LMs/NNs inform CogSci? This question has been (re)visited by many people across decades. @najoungkim and I contribute to this debate by using NN-based LMs to generate novel experimental hypotheses which can then be tested with humans!
2
14
83
@AlexGDimakis
Alex Dimakis
1 year
Excited to launch the first model from our startup: Bespoke Labs. Bespoke-Minicheck-7B is a grounded factuality checker: super lightweight and fast. Outperforms all big foundation models including Claude 3.5 Sonnet, Mistral Large 2 and GPT-4o, and it's only 7B. Also, I want to
@gregd_nlp
Greg Durrett
1 year
🤔 Want to know if your LLMs are factual? You need LLM fact-checkers. 📣 Announcing the LLM-AggreFact leaderboard to rank LLM fact-checkers. 📣 Want the best model? Check out @bespokelabsai's Bespoke-Minicheck-7B model, which is the current SOTA fact-checker and is cheap and
9
36
166
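For context on what a grounded factuality check does, here is a generic sketch that scores whether a source document entails a claim using an off-the-shelf NLI model. This is only an illustration of the task; it is not the Bespoke-Minicheck-7B interface or the LLM-AggreFact evaluation code.

```python
# Generic sketch of grounded fact-checking as document-claim entailment.
# Not the Bespoke-Minicheck-7B API; any NLI-style checkpoint works for the idea.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def support_probability(document: str, claim: str) -> float:
    """Probability that the document entails (supports) the claim."""
    inputs = tokenizer(document, claim, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    # Read the entailment index from the model config instead of hardcoding it.
    entail_idx = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]
    return probs[entail_idx].item()

doc = "The Eiffel Tower, completed in 1889, is located in Paris."
print(support_probability(doc, "The Eiffel Tower is in Paris."))   # expected: high
print(support_probability(doc, "The Eiffel Tower is in Berlin."))  # expected: low
```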