Sharon Y. Li

@SharonYixuanLi

Followers
11K
Following
2K
Media
130
Statuses
870

Associate Professor @WisconsinCS. Making AI reliable for the open world.

Madison, WI
Joined March 2019
@SharonYixuanLi
Sharon Y. Li
1 hour
Human preference data is noisy: inconsistent labels, annotator bias, etc. No matter how fancy the post-training algorithm is, bad data can sink your model. 🔥 @Samuel861025 and I are thrilled to release PrefCleanBench — a systematic benchmark for evaluating data cleaning
1
2
10
@SharonYixuanLi
Sharon Y. Li
24 hours
Took on the challenge of putting together three different keynote talks for the upcoming #ICCV2025 workshops...and here are the titles: 🔍 Explainability Meets Reliability in Large Vision-Language Models — eXCV Workshop ( https://t.co/SVlawIzK6v) October 19, 10:15–10:45 Honolulu
excv-workshop.github.io
eXCV Workshop at ICCV 2025
1
0
29
@xuefeng_du
Sean Xuefeng Du
22 days
📣 Announcing two calls for postdocs and research assistants / interns in my lab at NTU Singapore! 1. The NTU AI-for-X Postdoctoral Fellowship is accepting applications from postdocs who will be jointly supervised by AI faculty and a project mentor in their own research field (X) at NTU. It
1
7
34
@SharonYixuanLi
Sharon Y. Li
2 days
Thanks @DanHendrycks for leading this. Check out our latest preprint on a definition of AGI.
@DanHendrycks
Dan Hendrycks
2 days
The term “AGI” is currently a vague, moving goalpost. To ground the discussion, we propose a comprehensive, testable definition of AGI. Using it, we can quantify progress: GPT-4 (2023) was 27% of the way to AGI. GPT-5 (2025) is 58%. Here’s how we define and measure it: 🧵
0
2
18
@SharonYixuanLi
Sharon Y. Li
5 days
Check out our recent work led by @LeitianT with the @metaai team on using hybrid RL for mathematical reasoning tasks. 🔥Hybrid RL offers a promising way to go beyond purely verifiable rewards — combining the reliability of verifier signals with the richness of learned feedback.
@jaseweston
Jason Weston
5 days
Hybrid Reinforcement (HERO): When Reward Is Sparse, It’s Better to Be Dense 🦸‍♂️ 💪 📝: https://t.co/VAXtSC4GGp - HERO bridges 0–1 verifiable rewards and dense reward models into one 'hybrid' RL method - Tackles the brittleness of binary signals and the noise of pure reward
2
14
151
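The hybrid idea above — blending a 0–1 verifiable reward with a dense reward-model score — can be sketched minimally as follows. This is an illustration only; the function name, the linear blend, and the weight `alpha` are assumptions, not HERO's actual formulation.

```python
# Minimal sketch of a hybrid reward: a binary verifier signal blended with a
# dense reward-model score in [0, 1]. The weighting scheme is an assumption.

def hybrid_reward(verifier_pass: bool, rm_score: float, alpha: float = 0.5) -> float:
    """Blend a 0/1 verifiable reward with a dense reward-model score."""
    verifiable = 1.0 if verifier_pass else 0.0
    return alpha * verifiable + (1.0 - alpha) * rm_score

# A correct answer with a modest RM score still earns most of the reward,
# while a wrong answer can receive partial credit for a near-miss.
print(hybrid_reward(True, 0.4))   # → 0.7
print(hybrid_reward(False, 0.8))  # → 0.4
```

The point of the blend is exactly what the tweet describes: the verifier anchors the reward against noise, while the dense score keeps the signal informative when the verifier alone would be sparse.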
@SharonYixuanLi
Sharon Y. Li
6 days
We hear increasing discussion about aligning LLMs with “diverse human values.” But what’s the actual price of pluralism? 🧮 In our #NeurIPS2025 paper (with @shawnim00), we move this debate from the philosophical to the measurable — presenting the first theoretical scaling law
7
33
282
@SharonYixuanLi
Sharon Y. Li
7 days
Your LVLM says: “There’s a cat on the table.” But… there’s no cat in the image. Not even a whisker. This is object hallucination — one of the most persistent reliability failures in multi-modal language models. Our new #NeurIPS2025 paper introduces GLSim, a simple but
3
44
228
@SharonYixuanLi
Sharon Y. Li
13 days
Collecting large human preference data is expensive—the biggest bottleneck in reward modeling. In our #NeurIPS2025 paper, we introduce latent-space synthesis for preference data, which is 18× faster and uses a network that’s 16,000× smaller (0.5M vs 8B parameters) than
4
57
318
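The size ratio quoted above checks out as stated: a 0.5M-parameter synthesis network against an 8B-parameter model is a 16,000× reduction.

```python
# Sanity check of the parameter-count ratio quoted in the tweet:
# 8B-parameter model vs 0.5M-parameter latent-space synthesis network.
ratio = 8_000_000_000 / 500_000
print(ratio)  # → 16000.0
```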
@SharonYixuanLi
Sharon Y. Li
15 days
I will be giving a talk at the UPenn CIS Seminar next Tuesday, October 7. More info below: https://t.co/RdnTrxfrjk Thanks @weijie444 for hosting!
3
14
124
@SharonYixuanLi
Sharon Y. Li
15 days
Excited to share our #NeurIPS2025 paper: Visual Instruction Bottleneck Tuning (Vittle) Multimodal LLMs do great in-distribution, but often break in the wild. Scaling data or models helps, but it’s costly. 💡 Our work is inspired by the Information Bottleneck (IB) principle,
2
34
243
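For context, the classical Information Bottleneck objective (in its general form; the tweet does not state whether Vittle uses exactly this formulation) learns a representation $Z$ of input $X$ that stays predictive of target $Y$ while compressing away everything else:

```latex
% General IB objective: keep information about the target Y,
% penalize information retained about the raw input X.
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(X; Z)
```

The trade-off parameter $\beta$ controls how aggressively the representation is compressed, which is the intuition behind making multimodal LLMs less brittle out of distribution.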
@SharonYixuanLi
Sharon Y. Li
21 days
Multi-Agent Debate (MAD) has been hyped as a collaborative reasoning paradigm — but let me drop the bomb: majority voting, without any debate, often performs on par with MAD. This is what we formally prove in our #NeurIPS2025 Spotlight paper: “Debate or Vote: Which Yields
11
70
455
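The debate-free baseline the tweet compares against — plain majority voting over independent model answers — can be sketched in a few lines. This is an illustration of the baseline, not the paper's experimental setup.

```python
# Minimal majority-voting baseline: sample several independent answers
# and return the most common one, with no inter-agent debate.
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer among independent samples."""
    return Counter(answers).most_common(1)[0][0]

print(majority_vote(["42", "42", "41", "42", "40"]))  # → 42
```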
@SharonYixuanLi
Sharon Y. Li
22 days
Everyday human conversation can be filled with intent that goes unspoken, feelings implied but never named. How can AI ever really understand that? ✨ We’re excited to share our new work MetaMind — just accepted to #NeurIPS2025 as a Spotlight paper! A thread 👇 1️⃣ Human
11
60
347