
Rulin Shao (@RulinShao)
PhD @UWNLP, visiting researcher @Meta.
Joined April 2022
Followers: 4K · Following: 921 · Media: 25 · Statuses: 298
RT @VictoriaWGraf: Worried about overfitting to IFEval? 🤔 Use ✨IFBench✨, our new, challenging instruction-following benchmark! Loved workin…
0 replies · 10 reposts · 0 likes
RT @Benjamin_eecs: We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable…
0 replies · 48 reposts · 0 likes
RT @ChengleiSi: Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research…
0 replies · 160 reposts · 0 likes
RT @thao_nguyen26: Web data, the “fossil fuel of AI”, is being exhausted. What’s next? 🤔 We propose Recycling the Web to break the data wall…
0 replies · 59 reposts · 0 likes
It reminds me of the cognitive behaviors that have been found to help reasoning (backtracking, subgoal setting, verification, etc.): they all seem to fit this parallel generation pattern better than linearly chaining them. Looking forward to trying it out!
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: 🧵 1/n
0 replies · 0 reposts · 33 likes
Honored to be part of organizing the LM4Sci workshop at #COLM2025! 🔬🤖 We invite submissions that demonstrate innovative approaches to scientific reasoning and discovery. Submit by June 23! 🚀
🚨 Call for Papers: LM4Sci @COLM_conf 2025 🚨 Excited to announce the Large Language Modeling for Scientific Discovery (LM4Sci) workshop at COLM 2025 in Montreal, Canada!
Submission Deadline: June 23
Notification: July 24
Workshop: October 10, 2025
0 replies · 6 reposts · 30 likes
RT @ziruirayliu: 🔥 Excited to share our new work on reproducibility challenges in reasoning models caused by numerical precision. Ever run t…
0 replies · 22 reposts · 0 likes
🎉 Our Spurious Rewards paper is now on arXiv! We added experiments on:
- More prompts/steps/models/analysis
- Spurious Prompts!
Surprisingly, we obtained 19.4% gains when replacing prompts with LaTeX placeholder text (\lipsum) 😶🌫️. Check out our 2nd blog:
🤯 We cracked RLVR with… Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵 Blogpost:
4 replies · 40 reposts · 219 likes
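For concreteness on what "spurious rewards" means in the thread above: standard RLVR gives reward 1 only when the sampled answer matches ground truth, while the random and incorrect variants ignore or invert correctness. A minimal sketch (the function names and signatures are illustrative, not taken from the paper's code):

```python
import random

def ground_truth_reward(answer: str, gold: str) -> float:
    # Standard RLVR signal: reward 1 only for a correct final answer.
    return 1.0 if answer.strip() == gold.strip() else 0.0

def random_reward(answer: str, gold: str) -> float:
    # Spurious variant: a coin flip, independent of correctness.
    return 1.0 if random.random() < 0.5 else 0.0

def incorrect_reward(answer: str, gold: str) -> float:
    # Spurious variant: rewards only the wrong answers.
    return 1.0 - ground_truth_reward(answer, gold)
```

Plugged into an otherwise unchanged RLVR training loop, these are the three reward signals whose MATH-500 gains are compared above.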
RT @HanGuo97: One key takeaway from recent work on test-time compute: even a small weight update can make a big difference. So, what happen…
0 replies · 8 reposts · 0 likes
Qwen3-0.6B x Wikipedia datastore is now supported in massive-serve! Serve a local API in one line: `massive-serve serve --domain_name dpr_wiki_qwen3_0.6b_ivfpq`. Use `dpr_wiki_qwen3_0.6b` for a flat index. Examples of sending single/batch queries (see the sketch after the quoted post below):
🚀 Proud to introduce the Qwen3-Embedding and Qwen3-Reranker Series – setting new standards in multilingual text embedding and relevance ranking!
✨ Highlights:
✅ Available in 0.6B / 4B / 8B versions
✅ Supports 119 languages
✅ State-of-the-art performance on MMTEB, MTEB, …
2 replies · 7 reposts · 62 likes
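The single/batch query examples referenced above were attached as screenshots; here is a rough sketch of what such requests could look like. The port, the /search route, and the payload field names (query, queries, n_docs) are assumptions for illustration, not the documented massive-serve API:

```python
import requests

BASE_URL = "http://localhost:30888"  # assumed port; check the server's startup log

# Single query (route and payload fields are assumptions)
single = requests.post(
    f"{BASE_URL}/search",
    json={"query": "Who wrote the transformer paper?", "n_docs": 3},
)
print(single.json())

# Batch of queries (assumed batched payload shape)
batch = requests.post(
    f"{BASE_URL}/search",
    json={"queries": ["What is RLVR?", "What is IFBench?"], "n_docs": 3},
)
print(batch.json())
```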
RT @HanGuo97: We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? I…
0 replies · 191 reposts · 0 likes
RT @jxmnop: new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param**. We compute capacity by measuring t…
0 replies · 385 reposts · 0 likes
RT @lschmidt3: Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset m…
0 replies · 213 reposts · 0 likes
RT @HannaHajishirzi: Check out who the 2025 ACM Dissertation Award honorees are this year: our very own @sewon__min and @sharma_ashish_2!…
0 replies · 4 reposts · 0 likes
RT @AkariAsai: ‘Bold,’ ‘positive’ and ‘unparalleled’: Allen School Ph.D. graduates Ashish Sharma and Sewon Min recognized with ACM Doctoral…
0 replies · 16 reposts · 0 likes
RT @StellaLisy: Excited to share more about Spurious Rewards! Also keep an eye out for some new experiments and arXiv coming soon 👀🔜
0 replies · 11 reposts · 0 likes
RT @jihan_yao: We introduce MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation. ✅ Reliable: 94.3% agre…
0 replies · 17 reposts · 0 likes