
Wei Ping
@_weiping
Followers: 2K · Following: 599 · Media: 14 · Statuses: 298
Distinguished research scientist @nvidia | LLM post-training, reasoning, multimodality | generative models for audio. Views are my own.
San Francisco, CA
Joined June 2020
RT @lucas110550: Our released evaluation toolkit can reproduce our AceReason-Nemotron models' numbers (see below): AceReason-Nemotron-1.0-7…
RT @zihan_johan_liu: With a stronger SFT backbone, AceReason-Nemotron-1.1-7B significantly outperforms its predecessor and sets a record-high…
RT @MohammadShoeybi: Check out our detailed study on advancing math and code reasoning using SFT and RL.
RT @GavinNewsom: If they can handcuff a U.S. Senator for asking a question, imagine what they will do to you.
Pass@1024 results of our RL model (AceReason-Nemotron-7B) and its starting SFT model (DeepSeek-R1-Distill-Qwen-7B) on LiveCodeBench-v6, which features a large answer space and high-quality test cases that are difficult to solve through 'guessing', even with extensive sampling.
Introducing AceReason-Nemotron: Advancing math and code reasoning through reinforcement learning (RL). We propose conducting RL on math-only prompts first, then on code-only prompts. Our key findings include:
- Math-only RL significantly boosts both math and code benchmarks!
RT @zihan_johan_liu: Check out our AceReason-Nemotron-14B. 🤗 We start with RL training using math-only prompts, the…
RT @NVIDIAAIDev: 📣 Introducing AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning (RL). Starting from the…
RT @Alibaba_Qwen: Introducing Qwen3! We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 den…
RT @zihan_johan_liu: Introducing AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning from DeepSe…