Zengzhi Wang (@SinclairWang1)

Followers: 2K · Following: 7K · Media: 97 · Statuses: 2K

PhDing @sjtu1896 #NLProc. Working on Data Engineering for LLMs: MathPile (2023), 🫐 ProX (2024), 💎 MegaMath (2025), 🐙 OctoThinker (2025)

Joined November 2020
Zengzhi Wang (@SinclairWang1) · 11 days
What Makes a Base Language Model Suitable for RL?

Rumors in the community say RL (i.e., RLVR) on LLMs is full of "mysteries":
(1) Is the magic only happening on Qwen + Math?
(2) Does the "aha moment" only spark during math reasoning?
(3) Is evaluation hiding some tricky traps?
Zengzhi Wang (@SinclairWang1) · 1 day
Couldn't agree more!
Hynek Kydlíček (@HKydlicek) · 2 days
I don't think we need an American DeepSeek Project, we need an Open-Data DeepSeek. And no, we didn't get one yet, despite what you might think, so let me explain. The biggest contributor to the gap between closed-source and open-source AI is, in my opinion, data accessibility and…
Zengzhi Wang (@SinclairWang1) · 1 day
RT @dirctd_by_beens: blog - read 'octothinker' last week and it's so cool. great work by @SinclairWang1 @FaZhou_99…
Zengzhi Wang (@SinclairWang1) · 2 days
RT @soldni: @EMostaque @natolambert processing all of CommonCrawl is about $20-50k [0], plus maybe 10-50k H100 if you wanna do GPU classifi…
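For scale, here is a back-of-envelope version of that estimate using only the tweet's own figures. The ~$2-3/GPU-hour H100 rental price is my assumption (not from the tweet), and I read the truncated "10-50k H100" as GPU-hours:

```python
# Back-of-envelope cost for processing CommonCrawl, using the tweet's figures.
# Assumption (not from the tweet): H100 rentals at roughly $2-3 per GPU-hour.
cpu_cost_usd = (20_000, 50_000)   # "$20-50k" CPU-side processing, per the tweet
gpu_hours    = (10_000, 50_000)   # "10-50k H100", read as GPU-hours (tweet truncated)
usd_per_hour = (2.0, 3.0)         # assumed H100 rental price range

gpu_cost_usd = (gpu_hours[0] * usd_per_hour[0], gpu_hours[1] * usd_per_hour[1])
total_usd = (cpu_cost_usd[0] + gpu_cost_usd[0], cpu_cost_usd[1] + gpu_cost_usd[1])
print(f"GPU classification: ${gpu_cost_usd[0]:,.0f} - ${gpu_cost_usd[1]:,.0f}")
print(f"Total:              ${total_usd[0]:,.0f} - ${total_usd[1]:,.0f}")
```

Under those assumptions the end-to-end range works out to roughly $40k-200k.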
Zengzhi Wang (@SinclairWang1) · 4 days
RT @sivil_taram: Training end-to-end multi-turn tool-use agents has proven incredibly challenging 😤 Just as noted in the recent Kevin blog:…
Zengzhi Wang (@SinclairWang1) · 5 days
1. Solid data engineering on multimodal data.
2. Insightful details on the RL part, including but not limited to the design of the answer-extraction and reward system, the use of curriculum sampling, and details on improving effectiveness and stability.
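The answer-extraction and reward design is the part most RLVR pipelines live or die on. A minimal sketch of the general pattern (my illustration, not the paper's actual implementation): pull a final answer out of the model's output, then emit a binary verifiable reward.

```python
import re

def extract_answer(response: str) -> str | None:
    """Pull the final answer out of a model response.

    Looks for a \\boxed{...} span (common in math outputs), falling back to
    the last number in the text. Purely illustrative, not the paper's rule.
    """
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if boxed:
        return boxed[-1].strip()
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else None

def verifiable_reward(response: str, gold: str) -> float:
    """Binary reward: 1.0 iff the extracted answer matches the reference."""
    pred = extract_answer(response)
    return 1.0 if pred is not None and pred == gold.strip() else 0.0

# Usage:
print(verifiable_reward("... so the result is \\boxed{42}.", "42"))  # 1.0
print(verifiable_reward("I think it's 41.", "42"))                   # 0.0
```

Binary, rule-checkable rewards like this are what "verifiable" usually means in RLVR; the extraction rule is where most of the tricky engineering hides.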
Zengzhi Wang (@SinclairWang1) · 5 days
RT @aaron_defazio: Why do gradients increase near the end of training? Read the paper to find out! We also propose a simple fix to AdamW t…
Zengzhi Wang (@SinclairWang1) · 7 days
Just finished reading it quickly. It was truly impressive.
Zengzhi Wang (@SinclairWang1) · 9 days
RT @alphabatcher: @SinclairWang1 yeah, mid-training weight carrying is a huge factor.
Zengzhi Wang (@SinclairWang1) · 9 days
RT @code_star: Amazing work (once again). Better midtraining makes models better for RL. Once again the power of good data strikes again…
Zengzhi Wang (@SinclairWang1) · 9 days
RT @stefan_fee: What foundation models do we REALLY need for the RL era? And what pre-training data? Excited to share our work: OctoThinke…
Zengzhi Wang (@SinclairWang1) · 9 days
RT @nlpxuhui: Really appreciate this kind of work! Kudos to the team.
Zengzhi Wang (@SinclairWang1) · 9 days
RT @gneubig: @SinclairWang1 Great work! This sort of systematic analysis is quite important.
Zengzhi Wang (@SinclairWang1) · 10 days
RT @gui_penedo: We have finally released the 📝paper for 🥂FineWeb2, our large multilingual pre-training dataset. Along with general (and ex…
Zengzhi Wang (@SinclairWang1) · 10 days
I believe our work gives a preliminary definition for mid-training. Feel free to cite it along with these listed references.
[Quoted tweet: "What Makes a Base Language Model Suitable for RL?" (the pinned thread above, 11 days)]
Zengzhi Wang (@SinclairWang1) · 10 days
These points are well put, covering the conclusions and observations of many recent papers 😃:
(1) Is the magic only happening on Qwen + Math?
(2) Does the "aha moment" only spark during math reasoning?
(3) Is evaluation hiding some tricky traps?
(4) Is…
Zengzhi Wang (@SinclairWang1) · 10 days
It's ready; feel free to download MegaMath-Web-Pro-Max right now!
Zengzhi Wang (@SinclairWang1) · 11 days
Say hi to 🔮MegaMath-Pro-Max. High-quality corpora are vital for mid-training, but what about the math domain? Let me tell you the recipe behind it.

1. Curation Pipeline
Step 1: uniformly and randomly sample millions of documents from the MegaMath-Web corpus, stratified by…
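The tweet cuts off at the stratification key, so here is a minimal sketch of Step 1 under an assumed key: grouping documents by a hypothetical "domain" field (the real pipeline's key is elided in the truncated tweet).

```python
import random
from collections import defaultdict

def stratified_uniform_sample(docs, key, n_per_stratum, seed=0):
    """Uniformly sample up to n_per_stratum documents from each stratum.

    `docs` is an iterable of dicts; `key` names the field to stratify by.
    The field itself is an assumption here -- the tweet is truncated.
    """
    rng = random.Random(seed)  # seeded for reproducible curation runs
    strata = defaultdict(list)
    for doc in docs:
        strata[doc[key]].append(doc)
    sample = []
    for group in strata.values():
        rng.shuffle(group)                  # uniform within the stratum
        sample.extend(group[:n_per_stratum])
    return sample

# Usage with toy documents:
docs = [{"domain": d, "text": f"doc-{i}"} for i, d in
        enumerate(["arxiv", "web", "web", "forum", "arxiv", "web"])]
print(len(stratified_uniform_sample(docs, key="domain", n_per_stratum=1)))  # 3
```

Stratifying before sampling keeps rare strata represented instead of letting the largest source dominate the sample.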
Zengzhi Wang (@SinclairWang1) · 10 days
It's ready; please download the data right now!
[Quoted tweet: "What Makes a Base Language Model Suitable for RL?" (the pinned thread above, 11 days)]
Zengzhi Wang (@SinclairWang1) · 10 days
RT @joemelko: More evidence of the importance of high-quality mid-(pre-)training data to create a base for RL. Cool paper.
Zengzhi Wang (@SinclairWang1) · 10 days
RT @MichelIvan92347: Details on the new corpora for mid-training* 👇 *The OctoThinker paper is worth reading, as already stated here.