Hao Zhang Profile
Hao Zhang

@haozhangml

Followers
5K
Following
1K
Media
7
Statuses
660

Asst. Prof. @HDSIUCSD and @ucsd_cse running @haoailab. Cofounder and runs @lmsysorg. 20% with @Snowflake

San Francisco
Joined July 2021
@haozhangml
Hao Zhang
4 months
Beyond thrilled 🚀 to see my lab's work DistServe (OSDI'24) just got featured in Jensen Huang's keynote at Nvidia GTC! This marks our third major breakthrough in LLM inference after PagedAttention (vLLM) and Lookahead Decoding, pushing the frontier yet again! Since we post the.
@vllm_project
vLLM
4 months
Spotted @vllm_project during Jensen's Keynote @nvidia #GTC
11
24
145
@haozhangml
Hao Zhang
6 days
Pokémon Red has recently emerged as an evaluation benchmark, adopted by several top AI labs. But is it really a good benchmark for evaluating LLM capabilities or guiding LLM research? We wrote this blog to dive into the challenges, surface the opportunities, and introduce.
@haoailab
Hao AI Lab
6 days
🔥 Pokémon Red is becoming a go-to benchmark for testing advanced AIs such as Gemini. But is Pokémon Red really a good eval? We study this problem and identify three issues: 1️⃣ Navigation tasks are too hard. 2️⃣ Combat control is too simple. 3️⃣ Raising a strong Pokémon team is
0
4
21
@haozhangml
Hao Zhang
19 days
RT @BeidiChen: Say hello to Multiverse, the Everything Everywhere All At Once of generative modeling. 💥 Lossless, adaptive, and gloriousl…
0
21
0
@haozhangml
Hao Zhang
20 days
RT @SemiAnalysis_: Great work to sglang team at @lmsysorg showing the performance gains enabled by:
- High rank EP optimization
- Disaggre…
0
6
0
@haozhangml
Hao Zhang
23 days
Curious how o3-pro performs beyond math & code? We just threw Tetris at it. ❌ Most models: game over after a few moves. ✅ o3-pro: still stacking, basically endless. Big jump in spatial planning. (also much better than other models on the more challenging Sokoban). See the.
@haoailab
Hao AI Lab
23 days
[Lmgame Bench] o3-pro: A Milestone in LLM Gaming! 🕹️ The leap from o3 to o3-pro is bigger than you might have thought. We tested o3-pro on Tetris and Sokoban: it achieved SOTA on both and outperformed its previous self by a big margin. 🔍
🧱 Tetris Update
o3-pro: ✅ 8+ lines
2
0
7
@haozhangml
Hao Zhang
29 days
Latest benchmarking results of claude-4 on games 👇👇.
@haoailab
Hao AI Lab
29 days
[Lmgame Bench] 🎮 New Benchmark Results: Claude-Sonnet-4 and Claude-Opus-4. You asked—we delivered. We tested both models on 5 classic games: 2048, Candy Crush, Sokoban, Tetris, and Ace Attorney. Claude-Opus-4 stands out in Sokoban and Ace Attorney, outperforming Claude-Sonnet-4.
0
0
9
@haozhangml
Hao Zhang
30 days
👍👆☝️.
@InfiniAILab
Infini-AI-Lab
30 days
🥳 Happy to share our new work – Kinetics: Rethinking Test-Time Scaling Laws. 🤔 How to effectively build a powerful reasoning agent? Existing compute-optimal scaling laws suggest 64K thinking tokens + 1.7B model > 32B model. But it only shows half of the picture! 🚨 The O(N²)
0
1
4
@haozhangml
Hao Zhang
1 month
Wondering how the latest open-weight Qwen3 and DeepSeek-R1-0528 perform on games? Check this thread out. Also, stay tuned for a new release of our game benchmark soon. 🧑‍🍳👩‍🍳👨‍🍳
@haoailab
Hao AI Lab
1 month
🔧🤖 A new wave of open-source LLMs like DeepSeek-R1-0528 and Qwen3-235B-A22B is leveling up with stronger agentic performance. We test them in head-to-head gameplay: the upgraded DeepSeek-R1-0528 outsmarts strong reasoning models like o4-mini across several games and it nearly
0
0
9
@haozhangml
Hao Zhang
1 month
always inspiring to watch @istoica05 predicting the future! 😃👍.
@aha_ml
Ameer Haj-Ali
1 month
🧵 Just spent an hour with Ion Stoica @istoica05 (Berkeley prof, Databricks/Anyscale co-founder) discussing the future of AI. His insights on execution, China's AI advantage, and what young founders should build next are 🔥. Thread with the best takes 👇
0
0
4
@haozhangml
Hao Zhang
1 month
Check out Shift Parallelism, which we developed at Snowflake!
@AurickQ
Aurick Qiao
1 month
Excited to open-source Shift Parallelism, developed at @Snowflake AI Research for LLM inference! With it, Arctic Inference + @vllm_project delivers:
🚀 3.4x faster e2e latency & 1.06x higher throughput
🚀 1.7x faster generation & 2.25x lower response time
🚀 16x higher throughput
0
1
11
@haozhangml
Hao Zhang
1 month
RT @PY_Z001: I will be giving a talk in @GPU_MODE tomorrow (May 31 12pm PST) about FastVideo/STA/VSA. Come if you're interested! https://…
0
21
0
@haozhangml
Hao Zhang
1 month
Big release from Snowflake this week on text2sql, reasoning, and inference systems!
@VentureBeat
VentureBeat
1 month
How Snowflake's open-source text-to-SQL and Arctic inference models solve enterprise AI's two biggest deployment headaches
0
0
4
@haozhangml
Hao Zhang
1 month
RT @Snowflake: Solving real enterprise AI pain points! Our AI Research just shared two impactful new open-source efforts: ➡️ Arctic-Text2S…
0
8
0
@haozhangml
Hao Zhang
1 month
RT @lm_zheng: Blackwell-specific optimizations are cooking! 🚀.
0
5
0
@haozhangml
Hao Zhang
1 month
RT @mbzuai: An exceptional morning at #IFMLaunch! From @EricXing's vision for world models to @YejinChoinka's insights on "bending scaling…
0
5
0
@haozhangml
Hao Zhang
2 months
Looking forward to seeing Chatbot Arena move on to the next chapter!
@lmarena_ai
lmarena.ai
2 months
📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by @a16z and UC Investments (@UofCalifornia), we're proud to have the support of those that believe in both the science and the mission. We’re
2
0
21
@haozhangml
Hao Zhang
2 months
RT @tqchenml: #MLSys2025 make sure to attend 10:30am keynote @istoica05 An AI stack: from scaling AI workloads to evaluating LLMs. Checkou….
0
15
0
@haozhangml
Hao Zhang
2 months
FastVideo v1 is here! 🎬 Our FastVideo team has been working hard and cooking up something new ☕️☕️: a unified, programmable API for video generation that simplifies model authoring and integrates various DiT-related optimizations. We hope to make video generation as seamless.
@haoailab
Hao AI Lab
2 months
Announcing FastVideo V1, a unified framework for accelerating video generation. FastVideo V1 offers:
- A simple, consistent Python API
- State-of-the-art model performance optimizations
- Optimized implementations of popular models
Blog:
0
2
30
@haozhangml
Hao Zhang
2 months
Was casually chatting with a few buddies at snow the other day and realized that @Snowflake might just have the best text2sql team and capabilities on the planet NOW? 😎😀🔥. ✅ #1 on BIRD (single-model, an extremely competitive benchmark) — with our own post-trained.
@yao_zhewei
Zhewei Yao
2 months
🚀 Big news! Our collab w/ Snowflake, UCSD & UMD topped the BIRD leaderboard — beating prior SOTA by 2.8% in Text-to-SQL reasoning! RL was tough, but worth it. 📢 Best model coming soon. #AI #LLM #TextToSQL #ReinforcementLearning #Snowflake #UCSD #UMD #NLP #BIRDLeaderboard
0
0
12
@haozhangml
Hao Zhang
2 months
RT @PyTorch: PyTorch Foundation has expanded into an umbrella foundation. @vllm_project and @DeepSpeedAI have been accepted as hosted proje….
0
46
0