Hao Zhang @haozhangml X Profile

Hao Zhang

@haozhangml

Followers

5K

Following

1K

Media

7

Statuses

660

Asst. Prof. @HDSIUCSD and @ucsd_cse running @haoailab. Cofounder and runs @lmsysorg. 20% with @Snowflake

San Francisco

Joined July 2021

Hao Zhang

@haozhangml

4 months

Beyond thrilled 🚀 to see my lab's work DistServe (OSDI'24) featured in Jensen Huang's keynote at Nvidia GTC! This marks our third major breakthrough in LLM inference after PagedAttention (vLLM) and Lookahead Decoding, pushing the frontier yet again! Since we post the.

vLLM

@vllm_project

4 months

Spotted @vllm_project during Jensen's Keynote @nvidia #GTC

11

24

145

Hao Zhang

@haozhangml

6 days

Pokémon Red has recently emerged as an evaluation benchmark, adopted by several top AI labs. But is it really a good benchmark for evaluating LLM capabilities or guiding LLM research? We wrote this blog to dive into the challenges, surface the opportunities, and introduce.

Hao AI Lab

@haoailab

6 days

🔥 Pokémon Red is becoming a go-to benchmark for testing advanced AIs such as Gemini. But is Pokémon Red really a good eval? We study this problem and identify three issues: 1️⃣ Navigation tasks are too hard. 2️⃣ Combat control is too simple. 3️⃣ Raising a strong Pokémon team is

0

4

21

Hao Zhang

@haozhangml

19 days

RT @BeidiChen: Say hello to Multiverse — the Everything Everywhere All At Once of generative modeling. 💥 Lossless, adaptive, and gloriousl…

0

21

0

Hao Zhang

@haozhangml

20 days

RT @SemiAnalysis_: Great work by the sglang team at @lmsysorg showing the performance gains enabled by: - High rank EP optimization - Disaggre…

0

6

0

Hao Zhang

@haozhangml

23 days

Curious how o3-pro performs beyond math & code? We just threw Tetris at it. ❌ Most models: game over after a few moves. ✅ o3-pro: still stacking, basically endless. Big jump in spatial planning. (also much better than other models on the more challenging Sokoban). See the.

Hao AI Lab

@haoailab

23 days

[Lmgame Bench] o3-pro: A Milestone in LLM Gaming! 🕹️ The leap from o3 to o3-pro is bigger than you might have thought. We tested o3-pro on Tetris and Sokoban: it achieved SOTA on both and outperformed its previous self by a big margin. 🔍 🧱 Tetris Update. o3-pro: ✅ 8+ lines

2

0

7

Hao Zhang

@haozhangml

29 days

Latest benchmarking results of claude-4 on games 👇👇

Hao AI Lab

@haoailab

29 days

[Lmgame Bench] 🎮 New Benchmark Results: Claude-Sonnet-4 and Claude-Opus-4. You asked—we delivered. We tested both models on 5 classic games: 2048, Candy Crush, Sokoban, Tetris, and Ace Attorney. Claude-Opus-4 stands out in Sokoban and Ace Attorney, outperforming Claude-Sonnet-4.

0

9

Hao Zhang

@haozhangml

30 days

👍👆☝️.

Infini-AI-Lab

@InfiniAILab

30 days

🥳 Happy to share our new work, Kinetics: Rethinking Test-Time Scaling Laws. 🤔 How to effectively build a powerful reasoning agent? Existing compute-optimal scaling laws suggest 64K thinking tokens + 1.7B model > 32B model. But it only shows half of the picture! 🚨 The O(N²)

0

1

4

Hao Zhang

@haozhangml

1 month

Wondering how the latest open-weight Qwen3 and DeepSeek-R1-0528 perform on games? Check this thread out. Also, stay tuned for a new release of our game benchmark soon. 🧑‍🍳👩‍🍳👨‍🍳

Hao AI Lab

@haoailab

1 month

🔧🤖 A new wave of open-source LLMs like DeepSeek-R1-0528 and Qwen3-235B-A22B is leveling up with stronger agentic performance. We test them in head-to-head gameplay: the upgraded DeepSeek-R1-0528 outsmarts strong reasoning models like o4-mini across several games and it nearly

0

9

Hao Zhang

@haozhangml

1 month

Always inspiring to watch @istoica05 predicting the future! 😃👍

Ameer Haj-Ali

@aha_ml

1 month

🧵 Just spent an hour with Ion Stoica @istoica05 (Berkeley prof, Databricks/Anyscale co-founder) discussing the future of AI. His insights on execution, China's AI advantage, and what young founders should build next are 🔥. Thread with the best takes 👇

0

4

Hao Zhang

@haozhangml

1 month

Check out Shift Parallelism, which we developed at Snowflake!

Aurick Qiao

@AurickQ

1 month

Excited to open-source Shift Parallelism, developed at @Snowflake AI Research for LLM inference! With it, Arctic Inference + @vllm_project delivers: 🚀 3.4x faster e2e latency & 1.06x higher throughput; 🚀 1.7x faster generation & 2.25x lower response time; 🚀 16x higher throughput

0

1

11

Hao Zhang

@haozhangml

1 month

RT @PY_Z001: I will be giving a talk in @GPU_MODE tomorrow (May 31 12pm PST) about FastVideo/STA/VSA. Come if you're interested! https://…

0

21

0

Hao Zhang

@haozhangml

1 month

Big release from Snowflake this week on text2sql, reasoning, and inference systems!

VentureBeat

@VentureBeat

1 month

How Snowflake's open-source text-to-SQL and Arctic inference models solve enterprise AI's two biggest deployment headaches

0

4

Hao Zhang

@haozhangml

1 month

RT @Snowflake: Solving real enterprise AI pain points! Our AI Research just shared two impactful new open-source efforts: ➡️ Arctic-Text2S…

0

8

0

Hao Zhang

@haozhangml

1 month

RT @lm_zheng: Blackwell-specific optimizations are cooking! 🚀

0

5

0

Hao Zhang

@haozhangml

1 month

RT @mbzuai: An exceptional morning at #IFMLaunch! From @EricXing's vision for world models to @YejinChoinka's insights on "bending scaling…

0

5

0

Hao Zhang

@haozhangml

2 months

Looking forward to seeing Chatbot Arena move to the next chapter!

lmarena.ai

@lmarena_ai

2 months

📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by @a16z and UC Investments (@UofCalifornia), we're proud to have the support of those that believe in both the science and the mission. We’re

2

0

21

Hao Zhang

@haozhangml

2 months

RT @tqchenml: #MLSys2025 make sure to attend 10:30am keynote @istoica05 An AI stack: from scaling AI workloads to evaluating LLMs. Checkou…

0

15

0

Hao Zhang

@haozhangml

2 months

FastVideo v1 is here! 🎬 Our FastVideo team has been working hard and cooking up something new ☕️☕️: a unified, programmable API for video generation that simplifies model authoring and integrates various DiT-related optimizations. We hope to make video generation as seamless.

Hao AI Lab

@haoailab

2 months

Announcing FastVideo V1, a unified framework for accelerating video generation. FastVideo V1 offers:
- A simple, consistent Python API
- State-of-the-art model performance optimizations
- Optimized implementations of popular models
Blog:

0

2

30

Hao Zhang

@haozhangml

2 months

Was casually chatting with a few buddies at snow the other day and realized that @Snowflake might just have the best text2sql team and capabilities on the planet NOW? 😎😀🔥 ✅ #1 on BIRD (single-model, an extremely competitive benchmark), with our own post-trained.

Zhewei Yao

@yao_zhewei

2 months

🚀 Big news! Our collab w/ Snowflake, UCSD & UMD topped the BIRD leaderboard — beating prior SOTA by 2.8% in Text-to-SQL reasoning! RL was tough, but worth it. 📢 Best model coming soon. #AI #LLM #TextToSQL #ReinforcementLearning #Snowflake #UCSD #UMD #NLP #BIRDLeaderboard

0

12

Hao Zhang

@haozhangml

2 months

RT @PyTorch: PyTorch Foundation has expanded into an umbrella foundation. @vllm_project and @DeepSpeedAI have been accepted as hosted proje…

0

46

0