Pengyu Zhao Profile
Pengyu Zhao

@zpysky1125

Followers
1K
Following
152
Media
0
Statuses
88

LLM Lead @MiniMax__AI MiniMax Agent: https://t.co/WYkuer8tSV

Beijing
Joined February 2024
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? On behalf of pre-training lead Haohai Sun. ( https://t.co/WH4xOD9KrT) I. Introduction As the lead of MiniMax-M2 pretrain, I've been getting many queries from the community on "Why did you turn back the clock
@MiniMax__AI
MiniMax (official)
4 days
We’re open-sourcing MiniMax M2 — Agent & Code Native, at 8% Claude Sonnet price, ~2x faster ⚡ Global FREE for a limited time via MiniMax Agent & API - Advanced Coding Capability: Engineered for end-to-end developer workflows. Strong capability on a wide-range of applications
20
114
771
@Hailuo_AI
Hailuo AI (MiniMax)
16 hours
🎵 Introducing MiniMax Music 2.0 — Your AI Composer, Singer & Producer 🎤 Lifelike vocals across styles & emotions 🎶 Pop, jazz, blues, rock, folk — even duets & a cappella 🪄 Full 5-min compositions with multi-instrument control ✨ Professional-level sound quality 🎵 Precise
34
67
1K
@zpysky1125
Pengyu Zhao
18 hours
Great! We'll keep improving MiniMax M2 and making open-source models better!
@aicodeking
AICodeKing
18 hours
MiniMax M2 + Claude Code on KingBench Agentic Evaluations: It now scores #2 on my Agentic Evaluations beating GLM-4.6 by a wide margin. It seems to work much better with Claude Code's Tools. Really great model and it's my daily driver now. I haven't tested GLM with CC yet.
4
2
57
@SkylerMiao7
Skyler Miao
19 hours
Thanks @altryne for inviting me, especially given the poor network on my side. It's my first time on a podcast. Really happy to talk with you guys. It's so cool! Hope you all love our MiniMax models: M2, Hailuo 2.3, Speech 2.6!
@wandb
Weights & Biases
19 hours
Currently interviewing @SkylerMiao7, Head of Engineering of @MiniMax__AI on their latest release of M2. https://t.co/n8N9Sq6DlF
1
2
49
@zpysky1125
Pengyu Zhao
23 hours
MiniMax M2 Tech Blogs on Huggingface: 1. https://t.co/JKlaRFgNUH 2. https://t.co/hnp7mhvdDu 3.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
4
26
164
@ivanfioravanti
Ivan Fioravanti ᯅ
5 days
Minimax M2 + Cursor! The model is fast and powerful! I did some video editing to speed it up 2x for this post, but the experience is pretty good!
13
10
132
@scaling01
Lisan al Gaib
2 days
I still think the gap between open-source and proprietary models is getting smaller, and closed labs will have to compete more and more on pricing. MiniMax-M2 is now the 5th best model on AI Index and it's super cheap. The only drawback is that the open-source models are still
@MiniMax__AI
MiniMax (official)
4 days
We’re open-sourcing MiniMax M2 — Agent & Code Native, at 8% Claude Sonnet price, ~2x faster ⚡ (quoted above)
26
38
381
@kilocode
Kilo Code
2 days
Usage of MiniMax M2 is skyrocketing in Kilo Code! Looking at early results, error rates seem on par with the other open-source models on the market. This can often be a sticking point, so it's great to see the hard work of the MiniMax team paying off.
4
4
86
@verdent_ai
Verdent
2 days
Verdent now supports @MiniMax__AI! With it, you'll get advanced coding capabilities, high agent performance, and efficient parameter activation, making coding smarter, faster, and more cost-effective. Try MiniMax-M2 on Verdent for VS Code for FREE during the free trial period 👇
4
19
64
@ivanfioravanti
Ivan Fioravanti ᯅ
2 days
MiniMax-M2-4bit MLX benchmark results, Apple M3 Ultra, 512GB (context: prompt t/s - gen t/s - RAM):
2k: 563 - 48 t/s - 130.0GB
4k: 676 - 44 t/s - 130.5GB
8k: 670 - 41 t/s - 131.5GB
16k: 587 - 34 t/s - 133.4GB
32k: 453 - 23 t/s - 137.4GB
64k: 308 - 13 t/s - 145.4GB
128k: 177 - 6 t/s - 161.6GB
10
6
88
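The benchmark figures above show generation speed falling roughly by half for each context doubling past 16k while RAM grows slowly. A small script to make that trend explicit, with the numbers copied verbatim from the tweet:

```python
# Numbers transcribed from the MLX benchmark tweet above:
# context length -> (prompt t/s, generation t/s, RAM in GB)
results = {
    2_000: (563, 48, 130.0),
    4_000: (676, 44, 130.5),
    8_000: (670, 41, 131.5),
    16_000: (587, 34, 133.4),
    32_000: (453, 23, 137.4),
    64_000: (308, 13, 145.4),
    128_000: (177, 6, 161.6),
}

# Fraction of generation throughput retained at each doubling of context
ctxs = sorted(results)
for prev, cur in zip(ctxs, ctxs[1:]):
    ratio = results[cur][1] / results[prev][1]
    print(f"{prev // 1000}k -> {cur // 1000}k: {ratio:.0%} of gen t/s retained")
```

The retained fraction drops from ~93% per doubling at short contexts to under 50% by 128k, consistent with attention cost growing with context length.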
@bookwormengr
GDP
2 days
Making decisions with imperfect information at the frontier AI labs. Please follow @zpysky1125, lead researcher at Minimax AI, creators of M2, the current leading OSS model and, to my knowledge, the first OSS interleaved-thinking model. The blog below by @zpysky1125 is a beautiful read
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
1
2
13
@fseixas
Fabio Seixas
1 day
MiniMax M2 is getting people talking. Here's today's LLM lesson, from MiniMax's LLM tech lead.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
0
1
1
@srush_nlp
Sasha Rush
2 days
@YisongMiao I wish. I think I meant to tweet this version.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
3
1
12
@p_nawrot
Piotr Nawrot
2 days
> From the perspective of a year ago, a hybrid of Lightning Attention and Full Attention looked just as good as pure full attention. > Did we find a free lunch? Not quite. > The price became clear at larger scales: the model showed obvious weaknesses in complex, multi-hop
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
2
20
100
@SkylerMiao7
Skyler Miao
2 days
Hey guys, we keep experiencing a huge traffic increase, which is temporarily making the service unstable. Adding resources now!
2
2
58
@giffmana
Lucas Beyer (bl16)
2 days
> There’s no free lunch. > When you reduce the complexity of attention, you pay a price. > The question is, where? This is *exactly* how I typically end my Transformer tutorial. This slide is already 4 years old, I've never updated it, but it still holds:
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
35
61
895
@TaNGSoFT
𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞
2 days
thank you for bringing this to us! This is a rare, heartfelt account of the blood and tears of AI engineering, worth reading carefully. It reminds me of Hofstadter's almost sacred awe of human intelligence in GEB. As I recall, only the Manus piece on agent context management, and the interviews with Shunyu Yao and Zhilin Yang, are comparable.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
1
2
16
@SkylerMiao7
Skyler Miao
2 days
Great job! Now you can enjoy M2 on Vercel seamlessly! Hope you all love it! Vercel is such an efficient and brilliant team; loved coding and troubleshooting with you guys!
@vercel_dev
Vercel Developers
3 days
MiniMax M2 is now on Vercel AI Gateway.
• Free until Nov 7 2025
• Set model to minimax/minimax-m2
• Open-source model for agentic use
Integrated in collaboration with @MiniMax__AI's engineering team to boost reliability & performance. https://t.co/8IiFPKdqwN
0
1
12
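The tweet's setup reduces to pointing an OpenAI-compatible client at the gateway with the model id minimax/minimax-m2. A minimal sketch of the request body, assuming a standard chat-completions schema; the helper name is illustrative, and only the model id comes from the tweet:

```python
# Sketch: assembling an OpenAI-compatible chat-completions payload for
# MiniMax M2 behind a gateway. The model id "minimax/minimax-m2" is from
# the tweet above; the rest of the shape is the standard chat schema.

def build_chat_request(messages, model="minimax/minimax-m2", stream=False):
    """Assemble a chat-completions request body as a plain dict."""
    return {"model": model, "messages": messages, "stream": stream}

body = build_chat_request(
    [{"role": "user", "content": "Summarize the M2 tech blog."}]
)
print(body["model"])  # minimax/minimax-m2
```

This dict would be POSTed as JSON to the gateway's chat-completions endpoint; the exact endpoint URL and auth header depend on your Vercel project setup.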
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 2: What makes good reasoning data
@JinZhu8614
Jin Zhu
3 days
In the past, community discussions on improving reasoning abilities often focused on optimizing RL algorithms or constructing verifiable data in domains like Math and Code. In the M2 project, we conducted more "general" explorations. As a member of the Reasoning team, I'd like to
0
2
27
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 1: Aligning to What? Rethinking Agent Generalization in MiniMax M2
@olive_jy_song
Olive Song
3 days
It's been fantastic to see the community dive into our new MiniMax M2 model, with many celebrating its impressive skills in agentic tasks. Here I'm reposting my brilliant colleague Junheng's blog 🙌 "Aligning to What? Rethinking Agent Generalization in MiniMax M2"
1
3
18