
Jason Liu
@JasonLiu106968
Joined August 2025
Excited to share our #RL_for_LLM paper: "Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning". We conducted a comprehensive analysis of RL techniques in the LLM domain! 🥳 Surprisingly, we found that using only 2 techniques can unlock the learning capability of LLMs. 😮
This thread gives an impressive, in-depth explanation of an interesting finding in our "Tricks or Traps" paper: why group-level and batch-level normalization behave differently under various reward scales. I learned a lot from the two experts!
RL community query: Why would reward scale make batch vs group advantage normalization behave so differently in RL for LLM reasoning? 🧐🔍 I'm reading "Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning" (arXiv:2508.08221v1). They report big swings when switching.
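For readers unfamiliar with the two terms in the question, here is a minimal sketch of the distinction, assuming the common definitions: group-level normalization standardizes each reward against the other responses sampled for the same prompt, while batch-level normalization standardizes against every response in the batch. The function names, the `eps` constant, and the toy rewards are illustrative assumptions, not the paper's or the ROLL framework's code, and the sketch does not reproduce the reward-scale effect the thread discusses.

```python
from statistics import mean, pstdev


def group_normalize(groups, eps=1e-6):
    """Standardize each reward within its own prompt group (hypothetical helper)."""
    out = []
    for rewards in groups:
        mu, sigma = mean(rewards), pstdev(rewards)
        out.append([(r - mu) / (sigma + eps) for r in rewards])
    return out


def batch_normalize(groups, eps=1e-6):
    """Standardize each reward against all responses in the batch (hypothetical helper)."""
    flat = [r for rewards in groups for r in rewards]
    mu, sigma = mean(flat), pstdev(flat)
    return [[(r - mu) / (sigma + eps) for r in rewards] for rewards in groups]


# Toy batch: two prompts, four sampled responses each, binary rewards.
# The first prompt is "easy": every response is correct.
batch = [[1, 1, 1, 1], [1, 0, 0, 0]]

group_adv = group_normalize(batch)
batch_adv = batch_normalize(batch)

print(group_adv[0])  # all zeros: no variation inside the group
print(batch_adv[0])  # all positive: above the batch mean of 0.625
```

One visible difference in this toy case: a group whose responses all receive the same reward contributes zero advantage under group-level normalization, but a nonzero advantage under batch-level normalization.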
Final remark: the full paper is linked below. We aim to provide clear, practical guidance on choosing RL techniques; call it "Deep RL that Matters" in the era of large models. The ROLL team is continuously improving our frameworks to better serve the RL4LLM community. 💪
arxiv.org
Reinforcement learning for LLM reasoning has rapidly emerged as a prominent research area, marked by a significant surge in related studies on both algorithmic innovations and practical...