Jasper
@zjasper
Followers
15K
Following
7K
Media
421
Statuses
5K
Co-founder and CEO @Hyperbolic_Labs. ex-@avax & ex-@citsecurities. Finished Math PhD in 2yrs @UCBerkeley. Math Olympiad Gold Medalist. Highest honor @PKU1898
California, USA
Joined November 2018
AI is great at hitting explicit goals, but often at the cost of the hidden ones. Terence Tao just wrote about this. He points out: AI is the ultimate executor of Goodhart’s law, i.e. when a measure becomes the target, it stops measuring what we care about. Take a call center.
65
120
954
Check out our latest report on math evaluations 😉
If you care about systematic professional evaluation of mathematical capabilities of AI models, you can always find the latest at the following sites, managed by the GAUSS team: Blog: https://t.co/6hG5sKIZ8j Full report: https://t.co/Ff2WyIsKaQ Github:
2
0
8
Evaluation of mathematical capabilities by the latest LLM models, compared with evaluation by humans.
[2/n] Benchmarking results: DeepSeek-Math-V2 wins on accuracy and mean absolute error (MAE); GPT-5 wins on Pearson correlation; Gemini-3-Pro is within the top 3 on three metrics. See more in: Blog: https://t.co/1KXUxRUpex Full report: https://t.co/NvktvxMI87 Github:
1
6
23
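The thread's headline metrics (accuracy aside) can be sketched in a few lines of plain Python. The grade lists below are hypothetical, not the report's data; `mae` and `pearson` are illustrative helpers, not the GAUSS team's code.

```python
# Sketch of two LLM-as-a-judge benchmark metrics: mean absolute error (MAE)
# and Pearson correlation between model grades and human reference grades.
# Grade values are made up for illustration (e.g. 0-7 olympiad points).
from statistics import mean

def mae(model, human):
    """Mean absolute error between model and human grades."""
    return mean(abs(m - h) for m, h in zip(model, human))

def pearson(model, human):
    """Pearson correlation coefficient between model and human grades."""
    mm, mh = mean(model), mean(human)
    cov = sum((m - mm) * (h - mh) for m, h in zip(model, human))
    sd_m = sum((m - mm) ** 2 for m in model) ** 0.5
    sd_h = sum((h - mh) ** 2 for h in human) ** 0.5
    return cov / (sd_m * sd_h)

human = [0, 7, 2, 0, 5, 7]   # hypothetical human grades
model = [0, 7, 3, 1, 5, 6]   # hypothetical LLM-judge grades
print(f"MAE = {mae(model, human):.2f}, Pearson r = {pearson(model, human):.2f}")
# -> MAE = 0.50, Pearson r = 0.98
```

A judge can win on one metric and lose on another: MAE punishes any absolute disagreement, while Pearson only cares whether the judge ranks answers the way humans do, which is why the report tracks both.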
Now you can set up organizations for your team on @hyperbolic_labs!
Hyperbolic Organizations are now live. 👇🏻 A unified, secure way for teams to build AI together without shared credentials, scattered billing, or unclear usage. Organizations centralize access, governance, and spend across all AI workflows.
1
0
5
[6/n] Work done with a great team: @tianzhec, @jiaxin, @liao_zhen53785, Qiuyu Ren, Tahsin Saffat, @ZitongYang0, and @YiMaTweets
@hyperbolic_labs's H200 GPU node made it happen: fast & bug-free deployment of DeepSeek-Math-V2.
1
0
6
[5/n] Finding 3: LLMs grade more diversely than humans. DeepSeek-Math-V2 aligns exceptionally well with human graders on this metric (both give a lot of 0 grades 🤣), but all other models grade more diversely, with a 1.6–2.0 Entropy Ratio and 1.3–1.9 Relative Variance.
1
0
5
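The two diversity metrics in Finding 3 can be sketched under assumed definitions: Entropy Ratio as H(model grades) / H(human grades) and Relative Variance as Var(model) / Var(human). The report's exact formulas may differ, and the grade lists are illustrative.

```python
# Sketch of grade-diversity metrics, assuming:
#   Entropy Ratio     = H(model grade distribution) / H(human grade distribution)
#   Relative Variance = Var(model grades) / Var(human grades)
# A strict human grader hands out mostly 0s; a "diverse" LLM judge spreads
# its grades across the scale. Both ratios come out > 1 for such a judge.
from collections import Counter
from math import log2
from statistics import pvariance

def entropy(grades):
    """Shannon entropy (bits) of the empirical grade distribution."""
    counts = Counter(grades)
    n = len(grades)
    return -sum((c / n) * log2(c / n) for c in counts.values())

human = [0, 0, 0, 0, 7, 0, 0, 0]   # strict, low-diversity human grades
model = [1, 0, 6, 7, 2, 7, 0, 5]   # more spread-out LLM-judge grades

print("Entropy Ratio:", entropy(model) / entropy(human))
print("Relative Variance:", pvariance(model) / pvariance(human))
```

A ratio near 1 on both metrics is what "aligns with human grading diversity" means here; values well above 1 flag a judge that smears partial credit across the scale.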
[4/n] Finding 2: Grading precision correlates with the problem. Overall, good (GPT-5), medium (DeepSeek-Chat-V3.1), and bad (Qwen3-235B-A22B-Thinking) models share a similar precision trend: all are relatively good on P2, P3, and P5 but bad on P4.
1
0
3
[3/n] Finding 1: Most LLMs are lenient, but DeepSeek-Math-V2 is strict. LLMs (on average) grade less accurately (higher MAE) on the subset with no valid reasoning in the answers. Three example models tend to give higher scores than humans, according to the confusion matrices.
1
0
4
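The leniency check behind Finding 1 can be sketched as a human-vs-model confusion matrix over grades: mass above the diagonal (model grade > human grade) means a lenient judge. The grades below are made up for illustration.

```python
# Sketch of a leniency check via a grade confusion matrix.
# Each cell (h, m) counts submissions the human graded h and the model graded m;
# m > h cells indicate leniency, m < h cells indicate strictness.
from collections import Counter

human = [0, 0, 2, 7, 0, 5]
model = [1, 2, 2, 7, 1, 6]   # a lenient judge: often grades above the human

confusion = Counter(zip(human, model))   # (human_grade, model_grade) -> count
lenient = sum(c for (h, m), c in confusion.items() if m > h)
strict  = sum(c for (h, m), c in confusion.items() if m < h)
print(f"lenient cells: {lenient}, strict cells: {strict}")
# -> lenient cells: 4, strict cells: 0
```

For a strict judge like DeepSeek-Math-V2, the imbalance flips: the below-diagonal count dominates.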
[2/n] Benchmarking results: DeepSeek-Math-V2 wins on accuracy and mean absolute error (MAE); GPT-5 wins on Pearson correlation; Gemini-3-Pro is within the top 3 on three metrics. See more in: Blog: https://t.co/1KXUxRUpex Full report: https://t.co/NvktvxMI87 Github:
1
0
7
Is @deepseek_ai the new king of math grading? By benchmarking LLM-as-a-judge on USAMO 2025, we find that: 🟠 DeepSeek-Math-V2 achieves the highest accuracy and most closely aligns with human graders when the submitted answer shows no meaningful progress. 🔵 Gemini-3-Pro
5
6
105
Congrats on the launch and excited to support!
Introducing Lux, the most powerful and fastest Computer Use model, built by OpenAGI Foundation @agiopen_org. Lux outperforms Google Gemini CUA, OpenAI Operator, and Anthropic Claude on a benchmark with 300 real-world tasks. Try our developer-friendly SDK to build powerful,
1
0
7
@hyperbolic_labs Here are deepseek-math-v2's solutions to CMO 2025. h/t @TianzheC
https://t.co/ZkESnPjpN3
github.com · Gauss-Math/DeepSeek-math-v2-results
1
0
0
We got deepseek-math-v2 running on an 8×H200 node on @hyperbolic_labs' on-demand GPU platform. Feel free to reply with any math problems you'd like solved and I'll share the answers. An exciting time to own the brain of one of the best mathematicians!
As far as I know, there isn't any chatbot or API that gives you access to an IMO 2025 gold-medalist model. Not only does this change today, but you get to download the weights with the Apache 2.0 open-source release of @deepseek_ai Math-V2 on @huggingface! Imagine owning the
30
16
240
DeepSeek dropped their latest AI research again on a holiday. DeepSeek-Math-V2 is the first open AI model that can win gold at IMO 2025 and beat Gemini on IMO-ProofBench. They're using a generator–verifier architecture that feels like GANs in their early days. - first train a
We just shared some thoughts and results on self-verifiable mathematical reasoning. The released model, DeepSeekMath-V2, is strong on IMO-ProofBench and competitions like IMO 2025 (5/6 problems) and Putnam 2024 (a near-perfect score of 118/120). Github: https://t.co/4dMEqWxXfU
8
19
185
Excited to be a launch partner for @nvidia Nemotron Nano 2 VL! Looking forward to seeing the creative use cases built on top of this model 🔥
Excited to announce NVIDIA’s Nemotron Models (@nvidia) on Hyperbolic! A powerful new family of open models, datasets, and techniques designed to help teams build high-accuracy, specialized agentic AI.
1
0
15
Ready to Compete for Compute? ♠️💻 Join us Oct 28th in SF for Poker Night [Compute Edition], hosted by @hyperbolic_ai × @join_ef × @_ai_collective. No buy-ins. No stakes. Just (free) compute and fun! 🎟 Apply to join → https://t.co/lPSjBmGCyV
3
3
18
Only in SF: Had breakfast with @matistanis, cofounder & CEO of ElevenLabs, and learned how they scaled the team to 300+. He shared his weekly breakdown: • 25% hiring • 25–50% sales (and generating product insights) • 25% misc And every Saturday, he personally tests new product
6
6
125
🚀 Introducing rLLM v0.2 - train arbitrary agentic programs with RL, with minimal code changes. Most RL training systems adopt the agent-environment abstraction. But what about complex workflows? Think solver-critique pairs collaborating, or planner agents orchestrating multiple
3
29
138
Andrej Karpathy released nanochat, ~8K lines of minimal code that do pretrain + midtrain + SFT + RL + inference + ChatGPT-like webUI. It trains a 560M LLM in ~4 hrs on 8×H100. I trained and hosted it on Hyperbolic GPUs ($48). First prompt reminded me how funny tiny LLMs are.
66
139
3K