Pengyu Zhao Profile
Pengyu Zhao

@zpysky1125

Followers
1K
Following
152
Media
0
Statuses
88

LLM Lead @MiniMax__AI MiniMax Agent: https://t.co/WYkuer8tSV

Beijing
Joined February 2024
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? On behalf of pre-training lead Haohai Sun. ( https://t.co/WH4xOD9KrT) I. Introduction As the lead of MiniMax-M2 pretrain, I've been getting many queries from the community on "Why did you turn back the clock
@MiniMax__AI
MiniMax (official)
4 days
We’re open-sourcing MiniMax M2 — Agent & Code Native, at 8% Claude Sonnet price, ~2x faster ⚡ Global FREE for a limited time via MiniMax Agent & API - Advanced Coding Capability: Engineered for end-to-end developer workflows. Strong capability on a wide-range of applications
20
114
771
@Hailuo_AI
Hailuo AI (MiniMax)
16 hours
🎵 Introducing MiniMax Music 2.0 — Your AI Composer, Singer & Producer 🎤 Lifelike vocals across styles & emotions 🎶 Pop, jazz, blues, rock, folk — even duets & a cappella 🪄 Full 5-min compositions with multi-instrument control ✨ Professional-level sound quality 🎵 Precise
34
67
1K
@zpysky1125
Pengyu Zhao
18 hours
Great! We'll keep improving MiniMax M2 and making open-source models better!
@aicodeking
AICodeKing
18 hours
MiniMax M2 + Claude Code on KingBench Agentic Evaluations: It now scores #2 on my Agentic Evaluations beating GLM-4.6 by a wide margin. It seems to work much better with Claude Code's Tools. Really great model and it's my daily driver now. I haven't tested GLM with CC yet.
4
2
57
@SkylerMiao7
Skyler Miao
19 hours
Thanks @altryne for inviting me, especially given the poor network on my side. It's my first time on a podcast. Really happy to talk with you guys. It's so cool! Hope you all love our MiniMax models: M2, Hailuo 2.3, Speech 2.6!
@wandb
Weights & Biases
19 hours
Currently interviewing @SkylerMiao7, Head of Engineering of @MiniMax__AI on their latest release of M2. https://t.co/n8N9Sq6DlF
1
2
49
@zpysky1125
Pengyu Zhao
23 hours
MiniMax M2 Tech Blogs on Huggingface: 1. https://t.co/JKlaRFgNUH 2. https://t.co/hnp7mhvdDu 3.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
4
26
164
@ivanfioravanti
Ivan Fioravanti ᯅ
5 days
Minimax M2 + Cursor! The model is fast and powerful! I did some video editing to speed it up 2x for this post, but the experience is pretty good!
13
10
132
@scaling01
Lisan al Gaib
2 days
I still think the gap between open-source and proprietary models is getting smaller, and closed labs will have to compete more and more on pricing. MiniMax-M2 is now the 5th best model on AI Index and it's super cheap. The only drawback is that the open-source models are still
@MiniMax__AI
MiniMax (official)
4 days
We’re open-sourcing MiniMax M2 — Agent & Code Native, at 8% Claude Sonnet price, ~2x faster ⚡ (quoted above)
26
38
381
@kilocode
Kilo Code
2 days
Usage of MiniMax M2 is skyrocketing in Kilo Code! Looking at early results, error rates seem on par with the other open-source models on the market. This can often be a sticking point, so it's great to see the hard work of the MiniMax team paying off.
4
4
86
@verdent_ai
Verdent
2 days
Verdent now supports @MiniMax__AI! With it, you'll get advanced coding capabilities, high agent performance, and efficient parameter activation, making coding smarter, faster, and more cost-effective. Try MiniMax-M2 on Verdent for VS Code for FREE during the free trial period 👇
4
19
64
@ivanfioravanti
Ivan Fioravanti ᯅ
2 days
MiniMax-M2-4bit MLX benchmark results, Apple M3 Ultra, 512GB (context: prompt t/s - gen t/s - RAM):
2k: 563 - 48 t/s - 130.0GB
4k: 676 - 44 t/s - 130.5GB
8k: 670 - 41 t/s - 131.5GB
16k: 587 - 34 t/s - 133.4GB
32k: 453 - 23 t/s - 137.4GB
64k: 308 - 13 t/s - 145.4GB
128k: 177 - 6 t/s - 161.6GB
10
6
88
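The benchmark figures above show generation speed falling roughly by half for each context doubling past 16k while RAM grows slowly. A small script to make that trend explicit, with the numbers copied verbatim from the tweet:

```python
# Numbers transcribed from the MLX benchmark tweet above:
# context length -> (prompt t/s, generation t/s, RAM in GB)
results = {
    2_000: (563, 48, 130.0),
    4_000: (676, 44, 130.5),
    8_000: (670, 41, 131.5),
    16_000: (587, 34, 133.4),
    32_000: (453, 23, 137.4),
    64_000: (308, 13, 145.4),
    128_000: (177, 6, 161.6),
}

# Fraction of generation throughput retained at each doubling of context
ctxs = sorted(results)
for prev, cur in zip(ctxs, ctxs[1:]):
    ratio = results[cur][1] / results[prev][1]
    print(f"{prev // 1000}k -> {cur // 1000}k: {ratio:.0%} of gen t/s retained")
```

The retained fraction drops from ~93% per doubling at short contexts to under 50% by 128k, consistent with attention cost growing with context length.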
@bookwormengr
GDP
2 days
Making decisions with imperfect information at the frontier AI labs. Please follow @zpysky1125, lead researcher at Minimax AI, creators of M2, the current leading OSS model and, to my knowledge, the first OSS interleaved-thinking model. The blog below by @zpysky1125 is a beautiful read
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
1
2
13
@fseixas
Fabio Seixas
1 day
MiniMax M2 is getting people talking. Here's today's LLM lesson, from MiniMax's LLM tech lead.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
0
1
1
@srush_nlp
Sasha Rush
2 days
@YisongMiao I wish. I think I meant to tweet this version.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
3
1
12
@p_nawrot
Piotr Nawrot
2 days
> From the perspective of a year ago, a hybrid of Lightning Attention and Full Attention looked just as good as pure full attention. > Did we find a free lunch? Not quite. > The price became clear at larger scales: the model showed obvious weaknesses in complex, multi-hop
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
2
20
100
@SkylerMiao7
Skyler Miao
2 days
Hey guys, we keep experiencing a huge traffic increase, which is temporarily making the service unstable. Adding resources now!
2
2
58
@giffmana
Lucas Beyer (bl16)
2 days
> There’s no free lunch. > When you reduce the complexity of attention, you pay a price. > The question is, where? This is *exactly* how I typically end my Transformer tutorial. This slide is already 4 years old, I've never updated it, but it still holds:
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
35
61
895
@TaNGSoFT
𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞
2 days
thank you for bringing this to us! This is a rare, heartfelt account of the blood and tears of AI engineering, worth reading carefully. It reminds me of Hofstadter's almost sacred awe of human intelligence in GEB. As I recall, only the Manus piece on agent context management, and the interviews with Shunyu Yao and Zhilin Yang, are comparable.
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? (quoted above)
1
2
16
@SkylerMiao7
Skyler Miao
2 days
Great job! Now you can enjoy M2 on Vercel seamlessly! Hope you all love it! Vercel is such an efficient and brilliant team; loved coding and troubleshooting with you guys!
@vercel_dev
Vercel Developers
3 days
MiniMax M2 is now on Vercel AI Gateway.
• Free until Nov 7 2025
• Set model to minimax/minimax-m2
• Open-source model for agentic use
Integrated in collaboration with @MiniMax__AI's engineering team to boost reliability & performance. https://t.co/8IiFPKdqwN
0
1
12
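The tweet's setup reduces to pointing an OpenAI-compatible client at the gateway with the model id minimax/minimax-m2. A minimal sketch of the request body, assuming a standard chat-completions schema; the helper name is illustrative, and only the model id comes from the tweet:

```python
# Sketch: assembling an OpenAI-compatible chat-completions payload for
# MiniMax M2 behind a gateway. The model id "minimax/minimax-m2" is from
# the tweet above; the rest of the shape is the standard chat schema.

def build_chat_request(messages, model="minimax/minimax-m2", stream=False):
    """Assemble a chat-completions request body as a plain dict."""
    return {"model": model, "messages": messages, "stream": stream}

body = build_chat_request(
    [{"role": "user", "content": "Summarize the M2 tech blog."}]
)
print(body["model"])  # minimax/minimax-m2
```

This dict would be POSTed as JSON to the gateway's chat-completions endpoint; the exact endpoint URL and auth header depend on your Vercel project setup.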
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 2: What makes good reasoning data
@JinZhu8614
Jin Zhu
3 days
In the past, community discussions on improving reasoning abilities often focused on optimizing RL algorithms or constructing verifiable data in domains like Math and Code. In the M2 project, we conducted more "general" explorations. As a member of the Reasoning team, I'd like to
0
2
27
@zpysky1125
Pengyu Zhao
2 days
MiniMax M2 Tech Blog 1: Aligning to What? Rethinking Agent Generalization in MiniMax M2
@olive_jy_song
Olive Song
3 days
It's been fantastic to see the community dive into our new MiniMax M2 model, with many celebrating its impressive skills in agentic tasks. Here I'm reposting my brilliant colleague Junheng's blog 🙌 "Aligning to What? Rethinking Agent Generalization in MiniMax M2"
1
3
18