
Rulin Shao (@RulinShao)
PhD @UWNLP, visiting researcher @Meta.
Joined April 2022
Followers: 4K · Following: 921 · Media: 25 · Statuses: 298
RT @VictoriaWGraf: Worried about overfitting to IFEval? 🤔 Use ✨IFBench✨, our new, challenging instruction-following benchmark! Loved workin…
0 replies · 10 reposts · 0 likes
RT @Benjamin_eecs: We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable…
0 replies · 48 reposts · 0 likes
RT @ChengleiSi: Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research…
0 replies · 160 reposts · 0 likes
RT @thao_nguyen26: Web data, the “fossil fuel of AI”, is being exhausted. What’s next? 🤔 We propose Recycling the Web to break the data wall…
0 replies · 59 reposts · 0 likes
It reminds me of the cognitive behaviors that have been found to help reasoning (backtracking, subgoal setting, verification, etc.): they all seem to fit this parallel generation pattern better than linearly chaining them. Looking forward to trying it out!
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: 🧵 1/n
0 replies · 0 reposts · 33 likes
Honored to be part of organizing the LM4Sci workshop at #COLM2025! 🔬🤖 We invite submissions that demonstrate innovative approaches to scientific reasoning and discovery. Submit by June 23! 🚀
🚨 Call for Papers: LM4Sci @COLM_conf 2025 🚨 Excited to announce the Large Language Modeling for Scientific Discovery (LM4Sci) workshop at COLM 2025 in Montreal, Canada!
Submission Deadline: June 23
Notification: July 24
Workshop: October 10, 2025
0 replies · 6 reposts · 30 likes
RT @ziruirayliu: 🔥 Excited to share our new work on reproducibility challenges in reasoning models caused by numerical precision. Ever run t…
0 replies · 22 reposts · 0 likes
🎉 Our Spurious Rewards paper is now on arXiv! We added experiments on:
- More prompts/steps/models/analysis
- Spurious Prompts!
Surprisingly, we obtained 19.4% gains when replacing prompts with LaTeX placeholder text (\lipsum) 😶🌫️. Check out our 2nd blog:
🤯 We cracked RLVR with… Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵 Blogpost:
4 replies · 40 reposts · 219 likes
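For concreteness on what "spurious rewards" means in the thread above: standard RLVR gives reward 1 only when the sampled answer matches ground truth, while the random and incorrect variants ignore or invert correctness. A minimal sketch (the function names and signatures are illustrative, not taken from the paper's code):

```python
import random

def ground_truth_reward(answer: str, gold: str) -> float:
    # Standard RLVR signal: reward 1 only for a correct final answer.
    return 1.0 if answer.strip() == gold.strip() else 0.0

def random_reward(answer: str, gold: str) -> float:
    # Spurious variant: a coin flip, independent of correctness.
    return 1.0 if random.random() < 0.5 else 0.0

def incorrect_reward(answer: str, gold: str) -> float:
    # Spurious variant: rewards only the wrong answers.
    return 1.0 - ground_truth_reward(answer, gold)
```

Plugged into an otherwise unchanged RLVR training loop, these are the three reward signals whose MATH-500 gains are compared above.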
RT @HanGuo97: One key takeaway from recent work on test-time compute: even a small weight update can make a big difference. So, what happen…
0 replies · 8 reposts · 0 likes
Qwen3-0.6B x Wikipedia datastore is now supported in massive-serve! Serve a local API in one line: `massive-serve serve --domain_name dpr_wiki_qwen3_0.6b_ivfpq`. Use `dpr_wiki_qwen3_0.6b` for a flat index. Examples of sending single/batch queries (see the sketch after the quoted post below):
🚀 Proud to introduce the Qwen3-Embedding and Qwen3-Reranker Series – setting new standards in multilingual text embedding and relevance ranking!
✨ Highlights:
✅ Available in 0.6B / 4B / 8B versions
✅ Supports 119 languages
✅ State-of-the-art performance on MMTEB, MTEB, …
2 replies · 7 reposts · 62 likes
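The single/batch query examples referenced above were attached as screenshots; here is a rough sketch of what such requests could look like. The port, the /search route, and the payload field names (query, queries, n_docs) are assumptions for illustration, not the documented massive-serve API:

```python
import requests

BASE_URL = "http://localhost:30888"  # assumed port; check the server's startup log

# Single query (route and payload fields are assumptions)
single = requests.post(
    f"{BASE_URL}/search",
    json={"query": "Who wrote the transformer paper?", "n_docs": 3},
)
print(single.json())

# Batch of queries (assumed batched payload shape)
batch = requests.post(
    f"{BASE_URL}/search",
    json={"queries": ["What is RLVR?", "What is IFBench?"], "n_docs": 3},
)
print(batch.json())
```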
RT @HanGuo97: We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? I…
0 replies · 191 reposts · 0 likes
RT @jxmnop: new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param**. We compute capacity by measuring t…
0 replies · 385 reposts · 0 likes
RT @lschmidt3: Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset m…
0 replies · 213 reposts · 0 likes
RT @HannaHajishirzi: Check out who the 2025 ACM Dissertation Award honorees are this year: our very own @sewon__min and @sharma_ashish_2!…
0 replies · 4 reposts · 0 likes
RT @AkariAsai: ‘Bold,’ ‘positive’ and ‘unparalleled’: Allen School Ph.D. graduates Ashish Sharma and Sewon Min recognized with ACM Doctoral…
0 replies · 16 reposts · 0 likes
RT @StellaLisy: Excited to share more about Spurious Rewards! Also keep an eye out for some new experiments and arXiv coming soon 👀🔜
0 replies · 11 reposts · 0 likes
RT @jihan_yao: We introduce MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation. ✅ Reliable: 94.3% agre…
0 replies · 17 reposts · 0 likes