Zhihao Jia Profile
Zhihao Jia

@JiaZhihao

Followers: 3K
Following: 596
Media: 22
Statuses: 208

Assistant professor of Computer Science at Carnegie Mellon University. Research on systems and machine learning.

Joined August 2012
@JiaZhihao
Zhihao Jia
4 months
One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But writing megakernels by hand is extremely hard. 🚀Introducing Mirage Persistent Kernel (MPK), a compiler that automatically transforms LLMs into optimized
14
127
773
@stuart_sul
Stuart Sul
20 days
(1/6) We’re happy to share that ThunderKittens now supports writing multi-GPU kernels, with the same programming model and full compatibility with PyTorch + torchrun. We’re also releasing collective ops and fused multi-GPU GEMM kernels, up to 2.6x faster than PyTorch + NCCL.
5
41
356
@matei_zaharia
Matei Zaharia
1 month
I gave a keynote at @VLDBconf about why we think it's time to rethink OLTP databases with Lakebase, which combines the cloud-native Postgres design of @neondatabase with Lakehouse. Reasons include the cloud, changing demands on DWs to be more real-time, and AI agents. Slides ⬇️
2
17
77
@tqchenml
Tianqi Chen
2 months
The new semester is here at CMU, excited to co-teach with @Tim_Dettmers to offer our fun course again on "Build Your Mini-PyTorch (needle) from scratch, then build neural networks on top" (Deep Learning Systems). Check out https://t.co/EChWdUMVud to learn more
dlsyscourse.org
Algorithms and Implementation
11
55
510
@andy_pavlo
Andy Pavlo (@andypavlo.bsky.social)
2 months
Today is the new semester for @CMUDB's Intro to Database Systems! We're going harder into material than ever before. Projects are more challenging but you can use LLMs to help. We also have 10min talks each Wed from leading DB companies. Follow on YouTube:
15445.courses.cs.cmu.edu
You want to know whether this is the premier course at Carnegie Mellon University on the design and implementation of database management systems? Well, it is. This course rips through data models...
12
120
934
@JiaZhihao
Zhihao Jia
2 months
🚀Excited to share the #MLSys Call for Papers! For the first time, we’re also welcoming submissions to the Industrial Track. Research and industrial track deadline: Oct 30, 2025 Reviews available: Jan 12, 2026 Author responses: Jan 16, 2026 Notifications: Jan 25, 2026
@_Minjia_Zhang_
Minjia Zhang
2 months
Calling industry researchers: MLSys 2026 launches its first Industrial Track! 🚀 We're excited to announce the inaugural Call for Industrial Track Papers at MLSys 2026! 🎉 👉 https://t.co/vHDTSbtJvJ This is a unique opportunity for industry researchers and practitioners to
0
5
15
@LijieyYang
Lijie(Derrick) Yang
2 months
[1/N] 🚀 Excited to introduce my first work at @Princeton: LessIsMore – a training-free sparse attention method tailored for efficient reasoning in LRMs, achieving lossless accuracy with high sparsity up to 87.5% and 1.1x avg decoding speedup compared to Full Attention on
8
10
53
@JiaZhihao
Zhihao Jia
2 months
Surprising but true: with the right design, sparse attention can outperform full attention—while attending to far fewer tokens. 🚀
@LijieyYang
Lijie(Derrick) Yang
2 months
[1/N] 🚀 Excited to introduce my first work at @Princeton: LessIsMore – a training-free sparse attention method tailored for efficient reasoning in LRMs, achieving lossless accuracy with high sparsity up to 87.5% and 1.1x avg decoding speedup compared to Full Attention on
1
5
70
@JiaZhihao
Zhihao Jia
2 months
It’s a pleasure to contribute to this exciting direction at the intersection of GenAI and databases. Thanks for having me, @pateljm.
@pateljm
Jignesh Patel
2 months
https://t.co/zldL82DlKg Excited to announce a unique partnership between @Informatica and @CMUDB to tackle current and future challenges in data exploration, understanding, cleaning, and analytics. Thanks @JiaZhihao for playing a pivotal role in this partnership. Looking forward
1
1
13
@pateljm
Jignesh Patel
2 months
https://t.co/zldL82DlKg Excited to announce a unique partnership between @Informatica and @CMUDB to tackle current and future challenges in data exploration, understanding, cleaning, and analytics. Thanks @JiaZhihao for playing a pivotal role in this partnership. Looking forward
na.magazine.intelligentcio.com
This month’s cover story, which can be found on p14, features Sean Kenny, Senior Vice President and Chief Information Officer, Carnival Cruise Line. He tells us how the company is partnering with DXC...
1
3
18
@songhan_mit
Song Han
3 months
NVILA is available in SGLang👏🏻
@lmsysorg
LMSYS Org
3 months
🚀Summer Fest Day 4: Turbocharging Vision-Language Models with SGLang + NVILA 4.4× throughput, 2.2× faster response time! We've integrated NVILA into SGLang, enabling high-performance, scalable serving of vision-language models. This unlocks a 4.4× TPS boost and significantly
1
11
36
@WentaoGuo7
Wentao Guo
3 months
🦆🚀QuACK🦆🚀: new SOL mem-bound kernel library without a single line of CUDA C++ all straight in Python thanks to CuTe-DSL. On H100 with 3TB/s, it performs 33%-50% faster than highly optimized libraries like PyTorch's torch.compile and Liger. 🤯 With @tedzadouri and @tri_dao
13
75
339
@reyna_abhyankar
Reyna Abhyankar
3 months
Computer-Use Agents (CUAs) are improving every day but take up to tens of minutes to complete simple tasks. We built OSWorld-Human, a benchmark that measures efficiency, a first step towards practical CUAs. Check out our blog post!
@yiying__zhang
Yiying Zhang
3 months
Computer-use AI agents (CUAs) are powerful, but way too slow. A 2-minute human task can take a CUA over 20 minutes! At Wuklab, we're building faster CUAs. Recently, we created OSWorld-Human, a new benchmark to close the speed gap between humans and machines. Read our full blog
0
2
4
@FrancisYan_
Francis Y. Yan
3 months
🚀 [OSDI ’25, Tue 11:10am] How do you “divide and conquer” large-scale resource allocation problems like GPU cluster scheduling or WAN traffic engineering? Our answer: “decouple and decompose” the underlying optimization using DeDe. (1/3)
4
5
51
@NovaSkyAI
NovaSky
4 months
✨Release: We upgraded SkyRL into a highly-modular, performant RL framework for training LLMs. We prioritized modularity—easily prototype new algorithms, environments, and training logic with minimal overhead. 🧵👇 Blog: https://t.co/jDvM95F0Bq Code: https://t.co/CWlKue79JH
2
46
204
@anjiangw
Anjiang Wei
4 months
We introduce CodeARC, a new benchmark for evaluating LLMs’ inductive reasoning. Agents must synthesize functions from I/O examples—no natural language, just reasoning. 📄 https://t.co/j5fFbgLjQJ 💻 https://t.co/6B7Ig1M1pG 🌐 https://t.co/AQIaeVCmT7 #LLM #Reasoning #LLM4Code #ARC
3
32
94
@JeffDean
Jeff Dean
4 months
Mark your calendars for #MLSys2026 in May 2026 in Seattle. Submission deadline for papers is Oct 30 this year.
@JiaZhihao
Zhihao Jia
4 months
📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to
7
19
109
@tqchenml
Tianqi Chen
4 months
#MLSys2026 will be led by the general chair @luisceze and PC chairs @JiaZhihao and @achowdhery. The conference will be held in Bellevue on Seattle's east side. Consider submitting and bringing your latest works in AI and systems—more details at https://t.co/zFMHTxTXzp.
@JiaZhihao
Zhihao Jia
4 months
📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to
0
14
64
@JiaZhihao
Zhihao Jia
4 months
📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to
2
31
107