Rulin Shao

@RulinShao

Followers: 4K
Following: 921
Media: 25
Statuses: 298

PhD @UWNLP, visiting researcher @Meta.

Joined April 2022
@RulinShao
Rulin Shao
2 months
Meet ReasonIR-8B✨the first retriever specifically trained for reasoning tasks! Our challenging synthetic training data unlocks SOTA scores on reasoning IR and RAG benchmarks. ReasonIR-8B ranks 1st on BRIGHT and outperforms search engine and retriever baselines on MMLU and GPQA🔥
5
62
342
@RulinShao
Rulin Shao
18 hours
RT @VictoriaWGraf: Worried about overfitting to IFEval? 🤔 Use ✨IFBench✨, our new, challenging instruction-following benchmark! Loved workin…
0
10
0
@RulinShao
Rulin Shao
2 days
RT @qi2peng2: Seven years ago, I co-led a paper called HotpotQA that has motivated and facilitated many #AI #Agents research works since. T…
0
43
0
@RulinShao
Rulin Shao
2 days
RT @Benjamin_eecs: We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable…
0
48
0
@RulinShao
Rulin Shao
3 days
RT @ChengleiSi: Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research…
0
160
0
@RulinShao
Rulin Shao
10 days
RT @thao_nguyen26: Web data, the “fossil fuel of AI”, is being exhausted. What’s next? 🤔 We propose Recycling the Web to break the data wall…
0
59
0
@RulinShao
Rulin Shao
18 days
It reminds me of the cognitive behaviors that have been found to help reasoning—backtracking, subgoal setting, verification, etc.—they all seem to fit this parallel generation pattern better than linearly chaining them. Looking forward to trying it out!
@InfiniAILab
Infini-AI-Lab
18 days
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: 🧵 1/n
0
0
33
@RulinShao
Rulin Shao
19 days
Honored to be part of organizing the LM4Sci workshop at #COLM2025! 🔬🤖 We invite submissions that demonstrate innovative approaches to scientific reasoning and discovery. Submit by June 23! 🚀
@lm4sci
LM4SCI @ COLM2025
21 days
🚨 Call for Papers: LM4Sci @COLM_conf 2025 🚨 Excited to announce the Large Language Modeling for Scientific Discovery (LM4Sci) workshop at COLM 2025 in Montreal, Canada! Submission Deadline: June 23. Notification: July 24. Workshop: October 10, 2025.
0
6
30
@RulinShao
Rulin Shao
21 days
RT @ziruirayliu: 🔥 Excited to share our new work on reproducibility challenges in reasoning models caused by numerical precision. Ever run t…
0
22
0
@RulinShao
Rulin Shao
21 days
Arxiv: Clearly a lot more work is needed to understand what’s really happening with RL and prompting. We hope that our experiments with spurious rewards and spurious prompts, as well as the released code, data, checkpoints, etc., will help with this! 🔍
1
1
15
@RulinShao
Rulin Shao
21 days
🎉 Our Spurious Rewards paper is now available on arXiv! We added experiments on: more prompts/steps/models/analysis, and Spurious Prompts! Surprisingly, we obtained 19.4% gains when replacing prompts with LaTeX placeholder text (\lipsum) 😶‍🌫️. Check out our 2nd blog:
@StellaLisy
Stella Li
1 month
🤯 We cracked RLVR with… Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by: random rewards +21%, incorrect rewards +25%, (FYI) ground-truth rewards +28.8%. How could this even work⁉️ Here's why: 🧵 Blogpost:
4
40
219
@RulinShao
Rulin Shao
22 days
RT @HanGuo97: One key takeaway from recent work on test-time compute: even a small weight update can make a big difference. So, what happen…
0
8
0
@RulinShao
Rulin Shao
28 days
Tip: remember to install massive-serve first: pip install -U massive-serve. The first-time run is slow because it downloads the datastore and the model from HF. You may need to update your transformers version to support Qwen. 😃
0
0
6
@RulinShao
Rulin Shao
28 days
Qwen3-0.6B x Wikipedia datastore is now supported in massive-serve! Serve a local API in one line: massive-serve serve --domain_name dpr_wiki_qwen3_0.6b_ivfpq. Use `dpr_wiki_qwen3_0.6b` for a flat index. Examples of sending single/batch queries:
@Alibaba_Qwen
Qwen
29 days
🚀 Proud to introduce the Qwen3-Embedding and Qwen3-Reranker Series – setting new standards in multilingual text embedding and relevance ranking! ✨ Highlights: ✅ Available in 0.6B / 4B / 8B versions ✅ Supports 119 languages ✅ State-of-the-Art performance on MMTEB, MTEB,
2
7
62
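The two massive-serve posts above boil down to a short shell workflow. A minimal sketch, using only the commands quoted in the tweets (`pip install -U massive-serve`, `massive-serve serve --domain_name …`); exact flag names and domain identifiers come from the posts and may change between package versions:

```shell
# Install or upgrade massive-serve (the first run is slow: it pulls
# the datastore and the embedding model from Hugging Face)
pip install -U massive-serve

# Serve the Qwen3-0.6B x Wikipedia datastore as a local API (IVFPQ index)
massive-serve serve --domain_name dpr_wiki_qwen3_0.6b_ivfpq

# Flat-index variant of the same datastore
massive-serve serve --domain_name dpr_wiki_qwen3_0.6b
```

If Qwen models fail to load, upgrading transformers (per the follow-up tip) is the first thing to try.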
@RulinShao
Rulin Shao
29 days
RT @HanGuo97: We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? I…
0
191
0
@RulinShao
Rulin Shao
29 days
RT @jxmnop: new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param** we compute capacity by measuring t…
0
385
0
@RulinShao
Rulin Shao
29 days
RT @lschmidt3: Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset m…
0
213
0
@RulinShao
Rulin Shao
30 days
RT @HannaHajishirzi: Check out who the 2025 ACM Dissertation Award honorees are this year — our very own @sewon__min and @sharma_ashish_2!…
0
4
0
@RulinShao
Rulin Shao
30 days
RT @AkariAsai: ‘Bold,’ ‘positive’ and ‘unparalleled’: Allen School Ph.D. graduates Ashish Sharma and Sewon Min recognized with ACM Doctoral…
0
16
0
@RulinShao
Rulin Shao
30 days
RT @StellaLisy: Excited to share more about Spurious Rewards! Also keep an eye out for some new experiments and arxiv coming soon 👀🔜
0
11
0
@RulinShao
Rulin Shao
1 month
RT @jihan_yao: We introduce MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation. ✅ Reliable: 94.3% agre…
0
17
0