Wei Ping

@_weiping

Followers
2K
Following
599
Media
14
Statuses
298

Distinguished research scientist @nvidia | LLM post-training, reasoning, multimodality | generative models for audio. Views are my own.

San Francisco, CA
Joined June 2020
@_weiping
Wei Ping
2 months
Introducing AceReason-Nemotron: Advancing math and code reasoning through reinforcement learning (RL). We propose conducting RL on math-only prompts first, then on code-only prompts. Our key findings include: - Math-only RL significantly boosts both math and code benchmarks! …
2
24
151
@_weiping
Wei Ping
27 days
RT @lucas110550: Our released evaluation toolkit can reproduce our AceReason-Nemotron models numbers (see below): AceReason-Nemotron-1.0-7…
0
4
0
@_weiping
Wei Ping
27 days
RT @ychenNLP: 📢 We conduct a systematic study to demystify the synergy between SFT and RL for reasoning models. The result? We trained a 7B…
0
43
0
@_weiping
Wei Ping
27 days
RT @zihan_johan_liu: With stronger SFT backbone, AceReason-Nemotron-1.1-7B significantly outperforms its predecessor and sets a record-high…
0
8
0
@_weiping
Wei Ping
27 days
RT @MohammadShoeybi: Check out our detailed study on advancing math and code reasoning using SFT and RL.
0
3
0
@_weiping
Wei Ping
28 days
Introducing AceReason-Nemotron 1.1. Our previous release, AceReason-Nemotron-1.0, introduced a stage-wise RL recipe that was applied sequentially to math-only and code-only prompts, demonstrating both high efficiency and strong effectiveness. Here, we systematically investigate
1
16
68
@_weiping
Wei Ping
1 month
RT @GavinNewsom: If they can handcuff a U.S. Senator for asking a question, imagine what they will do to you.
0
63K
0
@_weiping
Wei Ping
1 month
RT @mli0603: Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality, judging videos as real or fake! Check out the reso…
0
32
0
@_weiping
Wei Ping
1 month
RT @kuchaev: New reasoning Nemotron-H models are now publicly available. These models are based on hybrid architecture! 47B and 8B in BF…
0
25
0
@_weiping
Wei Ping
1 month
Pass@1024 results of our RL model (AceReason-Nemotron-7B) and its starting SFT model (DeepSeek-R1-Distill-Qwen-7B) on LiveCodeBench-v6, which features a large answer space and high-quality test cases that are difficult to solve through 'guessing', even with extensive sampling.
@_weiping
Wei Ping
2 months
Introducing AceReason-Nemotron: Advancing math and code reasoning through reinforcement learning (RL). We propose conducting RL on math-only prompts first, then on code-only prompts. Our key findings include: - Math-only RL significantly boosts both math and code benchmarks! …
2
9
54
@_weiping
Wei Ping
2 months
👍👍
@deepseek_ai
DeepSeek
2 months
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: 🔌 No change to API usage, docs here: 🔗
0
0
1
@_weiping
Wei Ping
2 months
RT @_akhaliq: Nvidia just dropped AceReason-Nemotron on Hugging Face. Advancing Math and Code Reasoning through Reinforcement Learning http…
0
34
0
@_weiping
Wei Ping
2 months
RT @zihan_johan_liu: Check out our AceReason-Nemotron-14B. 🤗 We start with RL training using math-only prompts, the…
0
2
0
@_weiping
Wei Ping
2 months
RT @StringChaos: Lots of good analysis in here!
0
3
0
@_weiping
Wei Ping
2 months
RT @ychenNLP: with just math-RL, AceReason-Nemotron-14B surpass DeepCoder-14B on LiveCodeBench v5. we then did code-RL and found training…
0
9
0
@_weiping
Wei Ping
2 months
RT @NVIDIAAIDev: 📣 Introducing AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning (RL). Starting from the…
0
36
0
@_weiping
Wei Ping
2 months
RT @kuchaev: Llama-Nemotron-v1 technical report is now available on arxiv
0
65
0
@_weiping
Wei Ping
2 months
RT @Alibaba_Qwen: Introducing Qwen3! We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 den…
0
2K
0
@_weiping
Wei Ping
3 months
RT @zihan_johan_liu: Introducing AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning from DeepSe…
0
4
0
@_weiping
Wei Ping
3 months
RT @ychenNLP: Had a lot of fun to scale up RL to improve math reasoning! Excited to introduce AceMath-RL-Nemotron-7B with a scalable train…
0
7
0