
Weizhu Chen
@WeizhuChen
Happy to see @ypwang61 and the team did some interesting work here.
We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks!
📍RLVR with one training example can boost:
- Qwen2.5-Math-1.5B: 36.0% → 73.6%
- Qwen2.5-Math-7B: 51.0% → 79.2%
on MATH500.
📄 Paper:
RT @JeffDean: @moderncpp7 @clu_cheng @NeurIPSConf @drfeifei @jhyuxm @edchi I didn't see the talk, but the images I've seen of the slide see…
A big shout-out to our amazing interns @zebgou and @Zhenghaolin1, and to my colleagues @yynlpanda, @XiaoLiuNLP, and @ShenYelong. They did most of the work in this paper.
Excited to share our paper "Not All Tokens Are What You Need for Pretraining" received the #NeurIPS2024 Best Paper Runner-up award. I’ll be attending the conference from Wed to Sat. Feel free to reach out if you'd like to connect or attend our talk.
Check out the 5 new Phi-3 models we published this morning: much more powerful SLMs with long context. New: Phi-3-vision is a 4.2B parameter multimodal model with language and vision capabilities. New: Phi-3-small is a 7B parameter language model, available in two context lengths.
Phi-3 small & medium are now available under the MIT license! 🚀 @Microsoft has just launched Phi-3 small (7B) and medium (14B) 🤯 The Phi-3 small model claims to outperform @AIatMeta's Llama 3 and @MistralAI, and the Phi-3 medium model GPT-3.5 and @cohere Command R+. 🤔 TL;DR: