Sharon Y. Li

@SharonYixuanLi

Followers
11K
Following
2K
Media
130
Statuses
870

Associate Professor @WisconsinCS. Making AI reliable for the open world.

Madison, WI
Joined March 2019
@SharonYixuanLi
Sharon Y. Li
1 hour
Human preference data is noisy: inconsistent labels, annotator bias, etc. No matter how fancy the post-training algorithm is, bad data can sink your model. 🔥 @Samuel861025 and I are thrilled to release PrefCleanBench — a systematic benchmark for evaluating data cleaning
1
2
10
@SharonYixuanLi
Sharon Y. Li
24 hours
Took on the challenge of putting together three different keynote talks for the upcoming #ICCV2025 workshops...and here are the titles: 🔍 Explainability Meets Reliability in Large Vision-Language Models — eXCV Workshop ( https://t.co/SVlawIzK6v) October 19, 10:15–10:45 Honolulu
excv-workshop.github.io
eXCV Workshop at ICCV 2025
1
0
29
@xuefeng_du
Sean Xuefeng Du
22 days
📣 Announcing two calls for postdocs and research assistants / interns in my lab at NTU Singapore! 1. The NTU AI-for-X Postdoctoral Fellowship is accepting applications from postdocs who will be jointly supervised by AI faculty and a project mentor in their own research field (X) at NTU. It
1
7
34
@SharonYixuanLi
Sharon Y. Li
2 days
Thanks @DanHendrycks for leading this. Check out our latest preprint on a definition of AGI.
@DanHendrycks
Dan Hendrycks
2 days
The term “AGI” is currently a vague, moving goalpost. To ground the discussion, we propose a comprehensive, testable definition of AGI. Using it, we can quantify progress: GPT-4 (2023) was 27% of the way to AGI. GPT-5 (2025) is 58%. Here’s how we define and measure it: 🧵
0
2
18
@SharonYixuanLi
Sharon Y. Li
5 days
Check out our recent work led by @LeitianT with the @metaai team on using hybrid RL for mathematical reasoning tasks. 🔥Hybrid RL offers a promising way to go beyond purely verifiable rewards — combining the reliability of verifier signals with the richness of learned feedback.
@jaseweston
Jason Weston
5 days
Hybrid Reinforcement (HERO): When Reward Is Sparse, It’s Better to Be Dense 🦸‍♂️ 💪 📝: https://t.co/VAXtSC4GGp - HERO bridges 0–1 verifiable rewards and dense reward models into one 'hybrid' RL method - Tackles the brittleness of binary signals and the noise of pure reward
2
14
151
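The hybrid idea above — blending a 0–1 verifiable reward with a dense reward-model score — can be sketched minimally as follows. This is an illustration only; the function name, the linear blend, and the weight `alpha` are assumptions, not HERO's actual formulation.

```python
# Minimal sketch of a hybrid reward: a binary verifier signal blended with a
# dense reward-model score in [0, 1]. The weighting scheme is an assumption.

def hybrid_reward(verifier_pass: bool, rm_score: float, alpha: float = 0.5) -> float:
    """Blend a 0/1 verifiable reward with a dense reward-model score."""
    verifiable = 1.0 if verifier_pass else 0.0
    return alpha * verifiable + (1.0 - alpha) * rm_score

# A correct answer with a modest RM score still earns most of the reward,
# while a wrong answer can receive partial credit for a near-miss.
print(hybrid_reward(True, 0.4))   # → 0.7
print(hybrid_reward(False, 0.8))  # → 0.4
```

The point of the blend is exactly what the tweet describes: the verifier anchors the reward against noise, while the dense score keeps the signal informative when the verifier alone would be sparse.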
@SharonYixuanLi
Sharon Y. Li
6 days
We hear increasing discussion about aligning LLMs with “diverse human values.” But what’s the actual price of pluralism? 🧮 In our #NeurIPS2025 paper (with @shawnim00), we move this debate from the philosophical to the measurable — presenting the first theoretical scaling law
7
33
282
@SharonYixuanLi
Sharon Y. Li
7 days
Your LVLM says: “There’s a cat on the table.” But… there’s no cat in the image. Not even a whisker. This is object hallucination — one of the most persistent reliability failures in multi-modal language models. Our new #NeurIPS2025 paper introduces GLSim, a simple but
3
44
228
@SharonYixuanLi
Sharon Y. Li
13 days
Collecting large human preference data is expensive—the biggest bottleneck in reward modeling. In our #NeurIPS2025 paper, we introduce latent-space synthesis for preference data, which is 18× faster and uses a network that’s 16,000× smaller (0.5M vs 8B parameters) than
4
57
318
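The size ratio quoted above checks out as stated: a 0.5M-parameter synthesis network against an 8B-parameter model is a 16,000× reduction.

```python
# Sanity check of the parameter-count ratio quoted in the tweet:
# 8B-parameter model vs 0.5M-parameter latent-space synthesis network.
ratio = 8_000_000_000 / 500_000
print(ratio)  # → 16000.0
```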
@SharonYixuanLi
Sharon Y. Li
15 days
I will be giving a talk at the UPenn CIS Seminar next Tuesday, October 7. More info below: https://t.co/RdnTrxfrjk Thanks @weijie444 for hosting!
3
14
124
@SharonYixuanLi
Sharon Y. Li
15 days
Excited to share our #NeurIPS2025 paper: Visual Instruction Bottleneck Tuning (Vittle) Multimodal LLMs do great in-distribution, but often break in the wild. Scaling data or models helps, but it’s costly. 💡 Our work is inspired by the Information Bottleneck (IB) principle,
2
34
243
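For context, the classical Information Bottleneck objective (in its general form; the tweet does not state whether Vittle uses exactly this formulation) learns a representation $Z$ of input $X$ that stays predictive of target $Y$ while compressing away everything else:

```latex
% General IB objective: keep information about the target Y,
% penalize information retained about the raw input X.
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(X; Z)
```

The trade-off parameter $\beta$ controls how aggressively the representation is compressed, which is the intuition behind making multimodal LLMs less brittle out of distribution.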
@SharonYixuanLi
Sharon Y. Li
21 days
Multi-Agent Debate (MAD) has been hyped as a collaborative reasoning paradigm — but let me drop the bomb: majority voting, without any debate, often performs on par with MAD. This is what we formally prove in our #NeurIPS2025 Spotlight paper: “Debate or Vote: Which Yields
11
70
455
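The debate-free baseline the tweet compares against — plain majority voting over independent model answers — can be sketched in a few lines. This is an illustration of the baseline, not the paper's experimental setup.

```python
# Minimal majority-voting baseline: sample several independent answers
# and return the most common one, with no inter-agent debate.
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer among independent samples."""
    return Counter(answers).most_common(1)[0][0]

print(majority_vote(["42", "42", "41", "42", "40"]))  # → 42
```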
@SharonYixuanLi
Sharon Y. Li
22 days
Everyday human conversation can be filled with intent that goes unspoken, feelings implied but never named. How can AI ever really understand that? ✨ We’re excited to share our new work MetaMind — just accepted to #NeurIPS2025 as a Spotlight paper! A thread 👇 1️⃣ Human
11
60
347