Hao AI Lab

@haoailab

Followers 4K · Following 641 · Media 142 · Statuses 392

Hao AI Lab at UCSD. Our mission is to democratize large machine learning models, algorithms, and their underlying systems.

Joined March 2024
@haoailab
Hao AI Lab
8 days
🔥 New Blog: “Disaggregated Inference: 18 Months Later” 18 months in LLM inference feels like a new Moore’s Law cycle – but this time not just 2x per year: 💸 Serving cost ↓10–100x 🚀 Throughput ↑10x ⚡ Latency ↓5x A big reason? Disaggregated Inference. From DistServe, our
hao-ai-lab.github.io
Eighteen months ago, our lab introduced DistServe with a simple bet: split LLM inference into prefill and decode, and scale them independently on separate compute pools. Today, almost every product...
5
47
170
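The DistServe item above describes the core idea of disaggregated inference: prefill (prompt processing, compute-bound) and decode (token-by-token generation, memory-bound) run on separate compute pools, with the KV cache handed off between them. A toy sketch of that split, purely illustrative and not DistServe's actual implementation (all class and field names here are made up):

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int
    kv_cache: list = field(default_factory=list)
    output: list = field(default_factory=list)

class PrefillWorker:
    """Compute-bound: processes the whole prompt in one batched pass."""
    def run(self, req: Request) -> Request:
        # Stand-in for attention over the prompt: one "KV entry" per token.
        req.kv_cache = [f"kv({tok})" for tok in req.prompt.split()]
        return req

class DecodeWorker:
    """Memory-bound: generates one token per step, reusing the cache."""
    def run(self, req: Request) -> Request:
        for step in range(req.max_new_tokens):
            tok = f"tok{step}"          # stand-in for sampling
            req.output.append(tok)
            req.kv_cache.append(f"kv({tok})")
        return req

def serve(req: Request) -> Request:
    # In a real system this hand-off crosses machines over the network;
    # here it is just a function call.
    return DecodeWorker().run(PrefillWorker().run(req))

req = serve(Request(prompt="the quick brown fox", max_new_tokens=3))
print(len(req.kv_cache))  # 4 prompt entries + 3 decode entries = 7
```

Because the two phases scale independently, a deployment can add decode workers without over-provisioning prefill capacity, which is where the cost and latency wins come from.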
@Lanxiang_Hu
Lanxiang Hu
4 days
Exciting time to work on parallel generation in our lab: strong quality and faster generation, fully leveraging modern hardware. 🚀
@haozhangml
Hao Zhang
4 days
Excited to partner with SGLang: FastVideo + SGLang = the future open ecosystem for diffusion. 🥳🫡 ----------- A few extra cents: since I started as faculty at UCSD, our lab has been investing in diffusion for both video and text, in both algorithms and systems. - Text-side, we
0
1
6
@lm_zheng
Lianmin Zheng
4 days
Hao has been pioneering efficient architecture research for many years. Always eager to see the innovations from him and his group!
@haozhangml
Hao Zhang
4 days
Excited to partner with SGLang: FastVideo + SGLang = the future open ecosystem for diffusion. 🥳🫡 ----------- A few extra cents: since I started as faculty at UCSD, our lab has been investing in diffusion for both video and text, in both algorithms and systems. - Text-side, we
0
3
83
@haozhangml
Hao Zhang
4 days
Excited to partner with SGLang: FastVideo + SGLang = the future open ecosystem for diffusion. 🥳🫡 ----------- A few extra cents: since I started as faculty at UCSD, our lab has been investing in diffusion for both video and text, in both algorithms and systems. - Text-side, we
hao-ai-lab.github.io
TL;DR: LLMs have been traditionally regarded as sequential decoders, decoding one token after another. In this blog, we show pretrained LLMs can be easily taught to operate as efficient parallel...
@haoailab
Hao AI Lab
4 days
Excited to partner with SGL (@lmsysorg). FastVideo + SGL = the future open-source ecosystem for diffusion! 🥳🥳
3
8
100
@haoailab
Hao AI Lab
4 days
Excited to partner with SGL (@lmsysorg). FastVideo + SGL = the future open-source ecosystem for diffusion! 🥳🥳
@lmsysorg
LMSYS Org
4 days
🚀 Introducing SGLang Diffusion — bringing SGLang’s high-performance serving to diffusion models. ⚡️ Up to 5.9× faster inference 🧩 Supports major open-source models: Wan, Hunyuan, Qwen-Image, Qwen-Image-Edit, Flux 🧰 Easy to use via OpenAI-compatible API, CLI & Python API
1
6
25
@haoailab
Hao AI Lab
12 days
♠️♥️ Day 3 — Final Showdown! Our last day of the LLM Texas Hold’em tournament is live 🎥 📊 Current TrueSkill 2 Top 3: Grok-4-0709 > Gemini-2.5-Pro > GPT-5 (2025-08-07) Same prompt every day — around 20 hands/day; we will post the final TrueSkill2 ranking after today’s games!
0
2
8
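The tournament items above rank models with TrueSkill 2. As a rough illustration of how such ratings move, here is a minimal two-player update in the *original* TrueSkill formulation (win/loss only, no draw margin); TrueSkill 2, which the tournament actually uses, extends this with additional factors, and the real games are multi-player. Defaults follow the common mu=25, sigma=25/3 convention:

```python
import math

MU0, SIGMA0 = 25.0, 25.0 / 3.0
BETA = SIGMA0 / 2.0  # performance noise

def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def update(winner, loser):
    """winner/loser are (mu, sigma) pairs; returns the updated pairs."""
    (mu_w, s_w), (mu_l, s_l) = winner, loser
    c = math.sqrt(2 * BETA**2 + s_w**2 + s_l**2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # mean-shift factor
    w = v * (v + t)         # variance-shrink factor
    new_w = (mu_w + s_w**2 / c * v,
             s_w * math.sqrt(max(1 - s_w**2 / c**2 * w, 1e-9)))
    new_l = (mu_l - s_l**2 / c * v,
             s_l * math.sqrt(max(1 - s_l**2 / c**2 * w, 1e-9)))
    return new_w, new_l

a = b = (MU0, SIGMA0)
a, b = update(a, b)  # a beats b once
print(round(a[0], 2), round(b[0], 2))
```

An upset win over a higher-rated opponent moves the ratings more than an expected win, and each game also shrinks the uncertainty sigma, which is why rankings over only ~20 hands a day are noisy until the full 60 rounds are in.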
@haoailab
Hao AI Lab
13 days
[Lmgame Bench] Day 2 Recap ♠️♥️ Chip Standings + Rank Changes 🎲 Each day includes ~20 rounds, so rank shifts may reflect short-term variance rather than stable strategy change. Final TrueSkill2 after full 60 rounds will tell more. 📊Ranks 1️⃣ Gemini-2.5-Pro 359 ⬆️ (+5) 2️⃣
1
3
8
@haoailab
Hao AI Lab
14 days
♠️♥️ Texas Hold’em LLM tournament Day 2 is live! 🆕 New layout: each model’s thought now shown on the right side. Here’s Day 1 chip results 🪙 — final TrueSkill2 rankings will be posted after the tournament ends. 1️⃣ GPT-5 — 336 2️⃣ Grok-4 — 305 3️⃣ Kimi-K2 — 304 4️⃣
0
3
12
@haoailab
Hao AI Lab
15 days
♠️♥️ The cards are on the table. Day 1 of our 3-day Texas Hold’em LLM tournament is live! 😍 🤖 6 models. 300 chips each. No strategy prompts, only pure reasoning. 🎥 Watch now → https://t.co/5WJ8iVVEHz #AI #TexasHoldem #LmgameBench
5
8
20
@haoailab
Hao AI Lab
17 days
📊 Tournament Format • Round-robin matches across 3 days • 300 chips reset daily • TrueSkill2 ranking system • Fold/call/raise ratios → style profiling We’ll see how different LLMs behave when only the game rules guide their reasoning — no hidden heuristics, just raw
1
0
3
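The format item above mentions turning fold/call/raise ratios into style profiles. A small sketch of that idea, with a made-up log format and a deliberately crude labeling heuristic (the thresholds and labels are illustrative, not the tournament's actual method):

```python
from collections import Counter

ACTIONS = ("fold", "call", "raise", "check")

def style_profile(action_log):
    """Map a model's action log to per-action ratios."""
    counts = Counter(a for a in action_log if a in ACTIONS)
    total = sum(counts.values()) or 1
    return {a: counts[a] / total for a in ACTIONS}

def label(profile):
    # Crude heuristic: aggressive if it raises a lot,
    # tight if it folds a lot, passive otherwise.
    if profile["raise"] > 0.4:
        return "aggressive"
    if profile["fold"] > 0.4:
        return "tight"
    return "passive"

log = ["raise", "raise", "call", "fold", "raise", "check", "raise", "call"]
p = style_profile(log)
print(label(p))  # -> "aggressive" (4 of 8 actions are raises)
```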
@haoailab
Hao AI Lab
17 days
🤖 Setup We adapt the PettingZoo Texas Hold’em environment into our evaluation harness. Each model sees the hole cards, board state, stacks, pot & legal actions → chooses from {fold, call, raise, check}. 🎯 Models competing: • GPT-5 (2025-08-07) • DeepSeek-V3-2-Exp •
1
0
3
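The setup item above says each model sees the game state and must choose from the legal subset of {fold, call, raise, check}. One practical piece of such a harness is mapping the model's free-text reply onto a legal action. A sketch of that parsing step, assuming the environment exposes a legal-action mask as PettingZoo's Texas Hold'em env does (the action-id mapping and fallback policy here are illustrative, not the lab's actual harness):

```python
ACTION_IDS = {"call": 0, "raise": 1, "fold": 2, "check": 3}

def parse_action(model_text: str, legal_mask):
    """Pick the first legal action mentioned in the model's reply;
    fall back to a safe legal action if nothing legal is mentioned."""
    text = model_text.lower()
    for name, idx in ACTION_IDS.items():
        if name in text and legal_mask[idx]:
            return name
    for fallback in ("check", "call", "fold"):
        if legal_mask[ACTION_IDS[fallback]]:
            return fallback
    raise ValueError("no legal action available")

# "check" appears first in the reply but is illegal under this mask,
# so the model's "raise" is used instead.
mask = [1, 1, 1, 0]  # call, raise, fold legal; check not
print(parse_action("I will check... actually, raise.", mask))
```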
@haoailab
Hao AI Lab
17 days
[Lmgame Bench] ♠️♥️ Can LLMs bluff, fold, and bet like real poker players—with no strategic help? From Oct 28 – 30 (Tue–Thu, 10 AM – 4 PM PT), we’re hosting a 6-model live multi-agent Texas Hold’em tournament on Twitch 🎥 🕹️ https://t.co/5WJ8iVVEHz Each model starts with 300
1
5
12
@haozhangml
Hao Zhang
25 days
Strongly disagree with the original post, and agree that Berkeley, Stanford, and UCSD actually do offer many good courses that are cutting-edge and timely. For example, this winter I offered this machine learning systems course https://t.co/mlhUais8wk at UCSD (all materials
@minilek
Jelani Nelson
26 days
At @Berkeley_EECS we always work to keep our curriculum fresh. Our intro ML course CS 189 just got a drastic makeover this semester (thanks @profjoeyg @NargesNorouzi!) and now includes ~12 lectures on e.g. Adam, PyTorch, various NN architectures, LLMs, and more (see
18
95
1K
@vllm_project
vLLM
30 days
🚀 vLLM just hit 60K GitHub stars! 🎉 From a small research idea to powering LLM inference everywhere — across NVIDIA, AMD, Intel, Apple, TPUs, and more — vLLM now supports almost all major text-generation models and native RL pipelines like TRL, Unsloth, Verl, and OpenRLHF.
11
49
491
@haoailab
Hao AI Lab
1 month
Tunix × GRL: One-Line Multi-Turn RL on JAX+TPU 📷 We’re collaborating closely with Google’s Tunix team—JAX-native LLM post-training on TPU. Using Tunix’s lightweight RL framework, we shipped a first-hand multi-turn RL training example in GRL. It runs in one line. GRL:
github.com
Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning - lmgame-org/GRL
1
9
31
@haoailab
Hao AI Lab
2 months
Heartfelt gratitude to @nvidia TensorRT team member and our incredible @haoailab team members @Junda_Chen_ @FuYichao123 @fuzheyu2, @Humaira__18 @zhongdongm79676 @XuJerry15689. Your dedication made this milestone possible!
1
0
2
@haoailab
Hao AI Lab
2 months
🚀 🚀 Dynasor is featured in @NVIDIA TensorRT-LLM’s new inference-time compute framework, Scaffolding! Dynasor helps cut token usage by up to 29% with no accuracy loss! 🔍 NV Blog: https://t.co/S06S7dr4T4 Dynasor was also just accepted at #NeurIPS2025!
github.com
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor...
1
6
25
@haoailab
Hao AI Lab
2 months
[6/N] 🔥 Takeaway: token-level SD was just the beginning. Step-level Lookahead Reasoning opens the door to even faster, more scalable, and more powerful LLM reasoning. 👉 Blog: https://t.co/MOQh96i0mX 👉 Code: https://t.co/vfmIlpoq9P 👉 Paper:
arxiv.org
Reasoning models excel by generating long chain-of-thoughts, but decoding the resulting thousands of tokens is slow. Token-level speculative decoding (SD) helps, but its benefit is capped, because...
0
1
6
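The Lookahead Reasoning thread above builds on token-level speculative decoding (SD): a cheap draft model proposes several tokens, and the target model verifies them, accepting the longest agreeing prefix. A toy deterministic sketch of that accept/verify loop (illustrative only: both "models" are stand-in functions, and real SD verifies all draft tokens in a single batched target pass with probabilistic acceptance):

```python
def target_next(prefix):        # slow, authoritative model (stand-in)
    return str(len(prefix) % 3)

def draft_next(prefix):         # fast draft model, right most of the time
    return str(len(prefix) % 3) if len(prefix) % 5 else "x"

def speculative_decode(prompt, n_tokens, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1) draft proposes k tokens autoregressively (cheap)
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2) target verifies; accept the longest agreeing prefix,
        #    then the target itself emits one corrected token
        accepted = []
        for tok in draft:
            if tok == target_next(out + accepted):
                accepted.append(tok)
            else:
                break
        accepted.append(target_next(out + accepted))
        out.extend(accepted)
    return "".join(out[len(prompt):])[:n_tokens]

# Output is identical to decoding with the target alone, just in
# fewer target steps when the draft is usually right.
print(speculative_decode("ab", 6))  # -> "201201"
```

The acceptance check guarantees the output matches target-only decoding, which is also why token-level SD's speedup is capped by draft accuracy; the thread's point is that verifying whole reasoning steps instead of single tokens lifts that cap.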