
Zengzhi Wang (@SinclairWang1)
2K Followers · 7K Following · 97 Media · 2K Statuses
PhDing @sjtu1896 #NLProc Working on Data Engineering for LLMs: MathPile (2023), 🫐 ProX (2024), 💎 MegaMath (2025), 🐙 OctoThinker (2025)
Joined November 2020
Cannot agree more!
I don't think we need an American DeepSeek Project, we need an Open-Data DeepSeek. And no we didn't get one yet, despite what you might think, so let me explain. The biggest contributor to the gap between closed-source and open-source AI is, in my opinion, data accessibility and
RT @dirctd_by_beens: blog - read 'octothinker' last week and it's so cool. great work by @SinclairWang1 @FaZhou_99…
RT @soldni: @EMostaque @natolambert processing all of CommonCrawl is about $20-50k [0], plus maybe 10-50k H100 if you wanna do GPU classifi…
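A quick back-of-envelope check on the numbers quoted above, as a minimal Python sketch. Only the $20-50k CPU-processing range and the 10-50k H100-hour range come from the tweet; the hourly H100 rental rate is an assumption for illustration.

```python
# Back-of-envelope cost estimate for processing CommonCrawl,
# using the ranges quoted in the tweet above.

CPU_PROCESSING_USD = (20_000, 50_000)  # from the tweet: "$20-50k"
H100_HOURS = (10_000, 50_000)          # from the tweet: "10-50k H100"
H100_USD_PER_HOUR = 2.50               # assumption: typical rental price

def total_cost(cpu_usd: float, gpu_hours: float) -> float:
    """CPU pipeline cost plus GPU-classifier cost."""
    return cpu_usd + gpu_hours * H100_USD_PER_HOUR

low = total_cost(CPU_PROCESSING_USD[0], H100_HOURS[0])
high = total_cost(CPU_PROCESSING_USD[1], H100_HOURS[1])
print(f"Estimated total: ${low:,.0f} - ${high:,.0f}")
# Estimated total: $45,000 - $175,000
```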
RT @sivil_taram: Training end-to-end multi-turn tool-use agents has proven incredibly challenging 😤 Just as noted in the recent Kevin blog:…
RT @aaron_defazio: Why do gradients increase near the end of training? Read the paper to find out! We also propose a simple fix to AdamW t…
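For context: the tweet is truncated, so the paper's actual fix is not shown here. Below is only a minimal sketch of the standard decoupled AdamW update (Loshchilov & Hutter), i.e. the baseline such a fix would modify.

```python
import math

def adamw_step(p, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One standard AdamW update for a scalar parameter.
    Weight decay is decoupled: applied directly to the parameter,
    not folded into the gradient as in L2-regularized Adam."""
    m = beta1 * m + (1 - beta1) * grad         # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment EMA
    m_hat = m / (1 - beta1 ** t)               # bias correction (t >= 1)
    v_hat = v / (1 - beta2 ** t)
    p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p)
    return p, m, v
```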
RT @code_star: Amazing work (once again). Better midtraining makes models better for RL. Once again the power of good data strikes again…
RT @stefan_fee: What foundation models do we REALLY need for the RL era? And what pre-training data? Excited to share our work: OctoThinke…
RT @gui_penedo: We have finally released the 📝paper for 🥂FineWeb2, our large multilingual pre-training dataset. Along with general (and ex…
I believe our work gives a preliminary definition of mid-training. Feel free to cite it along with the listed references.
What Makes a Base Language Model Suitable for RL?
Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”:
(1) Is the magic only happening on Qwen + Math?
(2) Does the "aha moment" only spark during math reasoning?
(3) Is evaluation hiding some tricky traps?
Just released: feel free to download MegaMath-Web-Pro-Max right now!
Say hi to 🔮MegaMath-Pro-Max. High-quality corpora are vital for mid-training. What about the math domain? Let me tell you the recipe behind it.
1. Curation Pipeline
Step 1: uniformly and randomly sample millions of documents from the MegaMath-Web corpus, stratified by
Just released, please download the data right now!
What Makes a Base Language Model Suitable for RL?
Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”:
(1) Is the magic only happening on Qwen + Math?
(2) Does the "aha moment" only spark during math reasoning?
(3) Is evaluation hiding some tricky traps?
RT @joemelko: More evidence of the importance of high quality mid(pre) training data to create a base for rl. Cool paper.
RT @MichelIvan92347: Details on the new corpora for mid-training* 👇
*The OctoThinker paper is worth reading as already stated here.