Yike Wang @Neurips25

@yikewang_

457 Followers · 226 Following · 6 Media · 97 Statuses

PhD student @uwcse @uwnlp | BA, MS @berkeley_ai

Joined June 2022
@yikewang_
Yike Wang @Neurips25
6 months
LLMs are helpful for scientific research — but will they continue to be helpful? Introducing 🔍ScienceMeter: current knowledge update methods enable 86% preservation of prior scientific knowledge, 72% acquisition of new, and 38%+ projection of future ( https://t.co/zDjjl5GBaZ).
@allen_ai
Ai2
10 days
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
@tomchen0
Tong Chen @ NeurIPS
17 days
OpenAI's blog ( https://t.co/Mu05PFfPXg) points out that today’s language models hallucinate because training and evaluation reward guessing instead of admitting uncertainty. This raises a natural question: can we reduce hallucination without hurting utility?🤔 On-policy RL with …
@ZhiyuanZeng_
Zhiyuan Zeng ✈️ NeurIPS 25🏖️
19 days
RL is bounded by finite data😣? Introducing RLVE: RL with Adaptive Verifiable Environments We scale RL with data procedurally generated from 400 envs dynamically adapting to the trained model 💡find supervision signals right at the LM capability frontier + scale them 🔗in🧵
@shannonzshen
Shannon Shen
1 month
Today's AI agents are optimized to complete tasks in one shot. But real-world tasks are iterative, with evolving goals that need collaboration with users. We introduce collaborative effort scaling to evaluate how well agents work with people—not just complete tasks 🧵
@HaoyiQiu
Haoyi Qiu 🏄🏻‍♀️ NeurIPS
1 month
🤖💬AI agents can be easily persuaded (like Anthropic’s Claudius often giving discounts). 🤔Previous studies of persuasion have focused exclusively on the text-only modality. We wonder: are AI agents more susceptible when presented with multimodal content? Introducing MMPersuade, a …
@SmithaMilli
smitha milli
1 month
can we finally use natural language to optimize for deeper notions of what users want from their recommender systems?
@yikewang_
Yike Wang @Neurips25
1 month
“Responses are not monolithic: they switch across diverse skills which favor different model checkpoints in the training pipeline, thus we introduce model-guided collaborative inference to optimally use models with diverse skills for different segments of response generation.”
@shangbinfeng
Shangbin Feng
1 month
🔍Aligned LMs are better at reasoning/safety, but lose out on skills like calibration and generation diversity, where pretrained models are better. 🤝How about Don't Throw Away your Pretrained Model, and use multiple model stages of LLM training pipelines in collaboration!
@ma_tay_
Taylor Sorensen
2 months
🤖➡️📉 Post-training made LLMs better at chat and reasoning—but worse at distributional alignment, diversity, and sometimes even steering(!) We measure this with our new resource (Spectrum Suite) and introduce Spectrum Tuning (method) to bring them back into our models! 🌈 1/🧵
@shangbinfeng
Shangbin Feng
2 months
Introducing the Model Collaboration Tour 🤖🤝 Compositional intelligence. Collaborative development. Decentralized AI. By the Many. The methods. The vision. The hot takes. The comedy. LA folks, join us this week! Get your tickets by asking me to give a talk @ your lab/school!
@thinkymachines
Thinking Machines
3 months
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to …
@yikewang_
Yike Wang @Neurips25
3 months
check out this great work! 👾
@oscar_zhiqiu_xu
Zhiqiu (Oscar) Xu
3 months
How do we navigate a growing collection of post-trained LLMs? In Delta Activations: A Representation for Finetuned LLMs, we propose a compact embedding that encodes the post-training signal. Try the interactive model navigator 👉 https://t.co/I7mKccXfzr
@lifan__yuan
Lifan Yuan at neurips
3 months
🧩New blog: From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones Do LLMs learn new skills through RL, or just activate existing patterns? Answer: RL teaches the powerful meta-skill of composition when properly incentivized. 🔗: https://t.co/4Ud8qsYrOT
@YuhanLiu_nlp
Yuhan Liu
3 months
👀Have you asked LLM to provide a more detailed answer after inspecting its initial output? Users often provide such implicit feedback during interaction. ✨We study implicit user feedback found in LMSYS and WildChat. #EMNLP2025
@yikewang_
Yike Wang @Neurips25
3 months
next chapter of particle swarm optimization: 🔖Data Swarms, evaluation data generation through co-evolution between data generators and models. check this out 🥂
@shangbinfeng
Shangbin Feng
3 months
👀 How to find more difficult/novel/salient evaluation data? ✨ Let the data generators find it for you! Introducing Data Swarms, multiple data generator LMs collaboratively search in the weight space to optimize quantitative desiderata of evaluation.
@shangbinfeng
Shangbin Feng
3 months
Two caveats with self-alignment: ⚠️ A single model struggles to reliably judge its own generation. ⚠️ A single model struggles to reliably generate diverse responses to learn from. 👉 Introducing Sparta Alignment, where multiple LMs collectively align through ⚔️ combat.
@allen_ai
Ai2
4 months
With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
@HaileyJoren
Hailey Joren
5 months
PhD in Computer Science, University of California San Diego 🎓 My research focused on uncertainty and safety in AI systems, including 🤷‍♀️letting models say "I don't know" under uncertainty 🔎understanding and reducing hallucinations 🔁 methods for answering "how much will …
@SmithaMilli
smitha milli
5 months
Today we're releasing Community Alignment - the largest open-source dataset of human preferences for LLMs, containing ~200k comparisons from >3000 annotators in 5 countries / languages! There was a lot of research that went into this... 🧵
@Cumquaaa
Junhao Chen
5 months
🚀 Training an image generation model and picking sides between autoregressive (AR) and diffusion? Why not both? Check out MADFormer with half of the model layers for AR and half for diffusion. AR gives a fast guess for the next patch prediction while diffusion helps refine the …
@scottgeng00
Scott Geng
5 months
🤔 How do we train AI models that surpass their teachers? 🚨 In #COLM2025: ✨Delta learning ✨makes LLM post-training cheap and easy – with only weak data, we beat open 8B SOTA 🤯 The secret? Learn from the *differences* in weak data pairs! 📜 https://t.co/dw1QeQackx 🧵 below
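The "learn from the *differences* in weak data pairs" idea can be illustrated with a toy pairwise logistic (Bradley-Terry style) objective. This is my own minimal sketch, not the paper's delta learning method: the point it caricatures is that the shared weak base of each pair cancels in the difference, so only the relative-quality direction carries gradient.

```python
import math
import random

def pairwise_logistic_train(pairs, dim, lr=0.1, epochs=200):
    """Learn w so that score(chosen) > score(rejected) for each pair.

    Per-pair loss: -log(sigmoid(w . (x_chosen - x_rejected))).
    Only the *difference* between the two weak examples carries signal.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            delta = [c - r for c, r in zip(chosen, rejected)]
            s = sum(wi * di for wi, di in zip(w, delta))
            g = 1.0 / (1.0 + math.exp(-s)) - 1.0  # d(loss)/ds = sigmoid(s) - 1
            for d in range(dim):
                w[d] -= lr * g * delta[d]
    return w

# Both sides of each pair share a noisy weak base; "chosen" is only
# nudged along a hidden quality direction (feature 0), and the shared
# base cancels exactly in the delta.
rng = random.Random(1)
pairs = []
for _ in range(50):
    base = [rng.gauss(0, 1) for _ in range(3)]
    chosen = [base[0] + 1.0, base[1], base[2]]
    pairs.append((chosen, base))

w = pairwise_logistic_train(pairs, dim=3)
print(w[0] > 0 and w[1] == 0.0 and w[2] == 0.0)  # prints True
```

Even though every individual example is "weak" (noisy), the learned weight vector points cleanly along the quality axis, which is the intuition behind training on deltas of weak pairs.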