Weizhu Chen Profile
Weizhu Chen

@WeizhuChen

Followers: 3K
Following: 548
Media: 18
Statuses: 165

Microsoft

Kirkland WA
Joined April 2008
@WeizhuChen
Weizhu Chen
24 days
Synthesizing challenging problems on which the current model performs poorly is an important area in RL. Another thing that interests me is self-evolving learning via synthesizing questions/problems from which the model can learn continuously. You may check our work here:
@WeizhuChen
Weizhu Chen
2 months
Glad to see the team used a 3.8B model (Phi-4-mini-reasoning) to achieve 94.6 on MATH-500 and 57.5 on AIME-24. arxiv: hf: Azure:
@WeizhuChen
Weizhu Chen
2 months
Happy to see @ypwang61 and the team did some interesting work here.
@ypwang61
Yiping Wang
2 months
We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks!
📍RLVR with one training example can boost:
- Qwen2.5-Math-1.5B: 36.0% → 73.6%
- Qwen2.5-Math-7B: 51.0% → 79.2%
on MATH500.
📄 Paper:
@WeizhuChen
Weizhu Chen
4 months
Check out our tech report on Phi-4-mini and multimodality.
@_akhaliq
AK
4 months
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
@WeizhuChen
Weizhu Chen
4 months
We released Phi-4-mini (3.8B base LLM), a new SLM excelling in language, vision, and audio through a mixture of LoRAs, uniting three modalities in one model. I am so impressed with its new audio capability. I hope you can play with it and share your feedback with us. We also
@WeizhuChen
Weizhu Chen
7 months
+1 on this.
@WeizhuChen
Weizhu Chen
7 months
RT @JeffDean: @moderncpp7 @clu_cheng @NeurIPSConf @drfeifei @jhyuxm @edchi I didn't see the talk, but the images I've seen of the slide see….
@WeizhuChen
Weizhu Chen
7 months
A big shout-out to our amazing interns @zebgou and @Zhenghaolin1, and my colleagues @yynlpanda @XiaoLiuNLP @ShenYelong. They did most of the work in this paper.
@WeizhuChen
Weizhu Chen
7 months
Excited to share our paper "Not All Tokens Are What You Need for Pretraining" received the #NeurIPS2024 Best Paper Runner-up award. I’ll be attending the conference from Wed to Sat. Feel free to reach out if you'd like to connect or attend our talk.
@WeizhuChen
Weizhu Chen
11 months
All 3 models are now live on Azure AI Studio too:
Phi-3.5-mini-instruct:
Phi-3.5-MoE-instruct:
Phi-3.5-vision-instruct:
@WeizhuChen
Weizhu Chen
11 months
We released Phi-3.5: mini + MoE + vision.
A better mini model with multilingual support:
A new MoE model:
A new vision model supporting multiple images:
@WeizhuChen
Weizhu Chen
1 year
We updated Phi-3-mini in our June release, with enhancements in instruction following, reasoning on MMLU (70.9) / GPQA (30.6), and better long-context handling. Share your feedback on the new models with us.
@WeizhuChen
Weizhu Chen
1 year
Check out the 5 new Phi-3 models we published this morning, much more powerful SLMs with long context. New: Phi-3-vision is a 4.2B-parameter multimodal model with language and vision capabilities. New: Phi-3-small is a 7B-parameter language model, available in two context lengths.
@_philschmid
Philipp Schmid
1 year
Phi-3 small & medium are now available under the MIT license! 🚀 @Microsoft has just launched Phi-3 small (7B) and medium (14B) 🤯 The Phi-3 small model claims to outperform @AIatMeta's Llama 3 and @MistralAI, and the Phi-3 medium model GPT-3.5 and @cohere Command R+. 🤔 TL;DR:
@WeizhuChen
Weizhu Chen
1 year
The GenAI team just moved to Building 99. I love being back in this building. If you are in B99, let's grab a drink.
@WeizhuChen
Weizhu Chen
1 year
I would like to invite you to try Phi-3-mini: You can also download the weights from HF, with more model weights on the way. Besides what was described in the technical report, one specific thing I want to mention is the 128K context support. It takes us a.