Yike Wang @Neurips25 (@yikewang_)
Followers: 457 · Following: 226 · Media: 6 · Statuses: 97
PhD student @uwcse @uwnlp | BA, MS @berkeley_ai
Joined June 2022
LLMs are helpful for scientific research — but will they continue to be helpful? Introducing 🔍ScienceMeter: current knowledge update methods enable 86% preservation of prior scientific knowledge, 72% acquisition of new knowledge, and 38%+ projection of future knowledge ( https://t.co/zDjjl5GBaZ).
11 · 57 · 244
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
51 · 327 · 2K
OpenAI's blog ( https://t.co/Mu05PFfPXg) points out that today’s language models hallucinate because training and evaluation reward guessing instead of admitting uncertainty. This raises a natural question: can we reduce hallucination without hurting utility?🤔 On-policy RL with
25 · 123 · 669
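The incentive argument in the blog post can be made concrete with a toy expected-score calculation. This is a sketch under assumed scoring rules, not the post's benchmark or the announced RL reward:

```python
# Why binary grading rewards guessing: a toy expected-score calculation.
# The scoring schemes below are hypothetical illustrations.

def expected_score(p_correct, r_correct, r_wrong, r_abstain, guess):
    """Expected score of guessing vs. abstaining under a scoring rule."""
    if guess:
        return p_correct * r_correct + (1 - p_correct) * r_wrong
    return r_abstain

p = 0.3  # model's chance of being right if it guesses

# Binary grading (correct=1, wrong=0, abstain=0): guessing always wins,
# so training on this signal encourages confident hallucination.
assert expected_score(p, 1, 0, 0, guess=True) > expected_score(p, 1, 0, 0, guess=False)

# Penalized grading (wrong=-1, abstain=0): abstaining is optimal when p < 0.5,
# so admitting uncertainty becomes the reward-maximizing behavior.
assert expected_score(p, 1, -1, 0, guess=True) < expected_score(p, 1, -1, 0, guess=False)
```

Under binary grading the expected score of guessing is `p`, which is never below the 0 earned by abstaining; adding a wrong-answer penalty flips that ordering whenever the model's confidence is low enough.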
RL is bounded by finite data😣? Introducing RLVE: RL with Adaptive Verifiable Environments. We scale RL with data procedurally generated from 400 environments that dynamically adapt to the trained model. 💡Find supervision signals right at the LM capability frontier and scale them. 🔗in🧵
12 · 113 · 468
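The "dynamically adapting" idea can be sketched as a difficulty controller that keeps task success near a target rate. The controller below is a guess at the general mechanic, not the actual RLVE adaptation rule:

```python
# Minimal sketch of an environment that adapts its difficulty so tasks stay
# near the policy's capability frontier. Hyperparameters are illustrative.

class AdaptiveEnv:
    def __init__(self, target_success=0.5, step=1, window=20):
        self.difficulty = 1
        self.target = target_success
        self.step = step
        self.window = window
        self.history = []

    def record(self, solved: bool):
        self.history.append(solved)
        if len(self.history) >= self.window:   # adapt every `window` episodes
            rate = sum(self.history) / len(self.history)
            if rate > self.target:
                self.difficulty += self.step                     # too easy -> harder
            elif rate < self.target:
                self.difficulty = max(1, self.difficulty - self.step)  # too hard -> easier
            self.history.clear()

env = AdaptiveEnv()
# Simulate a policy that can solve everything up to difficulty 5.
for _ in range(200):
    env.record(env.difficulty <= 5)
print(env.difficulty)  # settles at the frontier (~5-6)
```

The environment ratchets difficulty up while the policy succeeds and backs off when it fails, so the supervision signal stays informative instead of saturating at all-easy or all-impossible tasks.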
Today's AI agents are optimized to complete tasks in one shot. But real-world tasks are iterative, with evolving goals that need collaboration with users. We introduce collaborative effort scaling to evaluate how well agents work with people—not just complete tasks 🧵
6 · 52 · 266
🤖💬AI agents can be easily persuaded (like Anthropic’s Claudius often giving discounts). 🤔Previous studies of persuasion have focused exclusively on the text-only modality. We wonder: are AI agents more susceptible when presented with multimodal content? Introducing MMPersuade, a
11 · 26 · 130
can we finally use natural language to optimize for deeper notions of what users want from their recommender systems?
4 · 13 · 54
“Responses are not monolithic: they switch across diverse skills which favor different model checkpoints in the training pipeline, thus we introduce model-guided collaborative inference to optimally use models with diverse skills for different segments of response generation.”
🔍Aligned LMs are better at reasoning/safety, but lose out on skills like calibration and generation diversity, where pretrained models are better. 🤝How about: Don't Throw Away your Pretrained Model, and use multiple stages of the LLM training pipeline in collaboration!
0 · 3 · 26
🤖➡️📉 Post-training made LLMs better at chat and reasoning—but worse at distributional alignment, diversity, and sometimes even steering(!) We measure this with our new resource (Spectrum Suite) and introduce Spectrum Tuning (method) to bring them back into our models! 🌈 1/🧵
5 · 49 · 193
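One way to see what "worse at distributional alignment" means: compare the model's output distribution over a small option set against a target population distribution. The metric and numbers below are illustrative, not Spectrum Suite's actual protocol:

```python
from collections import Counter

# Sketch of a distributional-alignment measurement via total variation
# distance. The target distribution and samples are made up for illustration.

def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

target = {"yes": 0.6, "no": 0.4}        # e.g. a survey population's split
samples = ["yes"] * 95 + ["no"] * 5     # a mode-collapsed post-trained model
counts = Counter(samples)
model_dist = {k: v / len(samples) for k, v in counts.items()}

print(total_variation(model_dist, target))  # 0.35 -> far from the population
```

A perfectly aligned sampler would score 0; the collapsed model above concentrates almost all mass on one option, which is exactly the post-training failure mode the thread describes.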
Introducing the Model Collaboration Tour 🤖🤝 Compositional intelligence. Collaborative development. Decentralized AI. By the Many. The methods. The vision. The hot takes. The comedy. LA folks, join us this week! Get your tickets by asking me to give a talk @ your lab/school!
4 · 19 · 34
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to
237 · 1K · 8K
check out this great work! 👾
How do we navigate a growing collection of post-trained LLMs? In Delta Activations: A Representation for Finetuned LLMs, we propose a compact embedding that encodes the post-training signal. Try the interactive model navigator 👉 https://t.co/I7mKccXfzr
0 · 1 · 5
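The core idea, an embedding built from how a finetuned model's activations shift relative to its base, can be sketched in a few lines. The probe set, layer stand-in, and pooling here are illustrative guesses, not the paper's recipe:

```python
import numpy as np

# Sketch of a delta-activation embedding: represent a finetuned checkpoint by
# the mean shift of its hidden activations (vs. the base model) on fixed probes.

rng = np.random.default_rng(0)
probe_inputs = rng.normal(size=(8, 16))  # 8 probe prompts, hidden dim 16

def activations(weights, x):
    """Stand-in for 'hidden states at some layer' of a real model."""
    return np.tanh(x @ weights)

d_model = 16
base_w = rng.normal(size=(d_model, d_model))
finetuned_w = base_w + 0.1 * rng.normal(size=(d_model, d_model))  # post-training shift

# One compact vector per finetuned model: mean activation delta over probes.
delta = (activations(finetuned_w, probe_inputs)
         - activations(base_w, probe_inputs)).mean(axis=0)
print(delta.shape)  # (16,)
```

Because every checkpoint is embedded against the same base and probe set, the resulting vectors live in one space and can be compared or clustered to navigate a model collection.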
🧩New blog: From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones Do LLMs learn new skills through RL, or just activate existing patterns? Answer: RL teaches the powerful meta-skill of composition when properly incentivized. 🔗: https://t.co/4Ud8qsYrOT
13 · 92 · 428
👀Have you asked an LLM to provide a more detailed answer after inspecting its initial output? Users often provide such implicit feedback during interaction. ✨We study implicit user feedback found in LMSYS and WildChat. #EMNLP2025
2 · 26 · 82
next chapter of particle swarm optimization: 🔖Data Swarms, evaluation data generation through co-evolution between data generators and models. check this out 🥂
👀 How to find more difficult/novel/salient evaluation data? ✨ Let the data generators find it for you! Introducing Data Swarms, multiple data generator LMs collaboratively search in the weight space to optimize quantitative desiderata of evaluation.
0 · 3 · 37
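For readers unfamiliar with the classic algorithm the quote tweet nods to, here is textbook particle swarm optimization on a toy objective. Data Swarms searches in model weight space for evaluation-data desiderata; this sketch only shows the underlying swarm mechanic, minimizing a scalar function:

```python
import random

# Classic PSO: particles track a velocity pulled toward their personal best
# and the swarm's global best. Hyperparameters are standard textbook values.

def pso(f, dim=2, n_particles=10, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = random.Random(0)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]            # each particle's best position so far
    gbest = min(pbest, key=f)[:]          # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i][:]
                if f(xs[i]) < f(gbest):
                    gbest = xs[i][:]
    return gbest

sphere = lambda x: sum(v * v for v in x)
best = pso(sphere)
print(sphere(best))  # near 0
```

Swapping the particles for data-generator LM weights and the objective for evaluation desiderata gives the flavor of the "co-evolution in weight space" framing, though the paper's actual search procedure may differ.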
Two caveats with self-alignment: ⚠️ A single model struggles to reliably judge its own generation. ⚠️ A single model struggles to reliably generate diverse responses to learn from. 👉 Introducing Sparta Alignment, where multiple LMs collectively align through ⚔️ combat.
2 · 13 · 35
PhD in Computer Science, University of California San Diego 🎓 My research focused on uncertainty and safety in AI systems, including 🤷‍♀️ letting models say "I don't know" under uncertainty 🔎 understanding and reducing hallucinations 🔁 methods for answering "how much will
30 · 21 · 625
Today we're releasing Community Alignment - the largest open-source dataset of human preferences for LLMs, containing ~200k comparisons from >3000 annotators in 5 countries / languages! There was a lot of research that went into this... 🧵
12 · 70 · 331
🚀 Training an image generation model and picking sides between autoregressive (AR) and diffusion? Why not both? Check out MADFormer with half of the model layers for AR and half for diffusion. AR gives a fast guess for the next patch prediction while diffusion helps refine the
4 · 12 · 40
🤔 How do we train AI models that surpass their teachers? 🚨 In #COLM2025: ✨Delta learning ✨makes LLM post-training cheap and easy – with only weak data, we beat open 8B SOTA 🤯 The secret? Learn from the *differences* in weak data pairs! 📜 https://t.co/dw1QeQackx 🧵 below
7 · 53 · 166
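"Learn from the differences in weak data pairs" suggests a pairwise objective where only the quality gap between two responses matters, not their absolute quality. Below is a DPO-style logistic loss on score differences; it is an illustrative stand-in, not the paper's exact delta-learning objective:

```python
import math

# Toy pairwise 'learn from the delta' loss: even when both responses come
# from weak models, their relative ordering still provides training signal.

def pairwise_loss(score_better, score_worse, beta=1.0):
    """-log sigmoid(beta * (s_better - s_worse)): small when the model
    already ranks the better weak response above the worse one."""
    margin = beta * (score_better - score_worse)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss depends only on the *difference*, not on absolute quality,
# which is why pairs of cheap, weak responses can still teach a strong model:
low_pair = pairwise_loss(0.2, -0.3)   # two weak responses, gap of 0.5
high_pair = pairwise_loss(2.2, 1.7)   # two strong responses, same gap
assert abs(low_pair - high_pair) < 1e-9
```

Correctly ranked pairs with a large margin drive the loss toward zero, while inverted pairs are penalized, so all the supervision lives in the delta between the two weak responses.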