Zhihao Jia @JiaZhihao X Profile

Zhihao Jia

@JiaZhihao

Followers: 3K

Following: 596

Media: 22

Statuses: 208

Assistant professor of Computer Science at Carnegie Mellon University. Research on systems and machine learning.

https://t.co/614YcN5ENn

Joined August 2012

Zhihao Jia

@JiaZhihao

4 months

One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But writing megakernels by hand is extremely hard. 🚀Introducing Mirage Persistent Kernel (MPK), a compiler that automatically transforms LLMs into optimized

14

127

773

Stuart Sul

@stuart_sul

20 days

(1/6) We’re happy to share that ThunderKittens now supports writing multi-GPU kernels, with the same programming model and full compatibility with PyTorch + torchrun. We’re also releasing collective ops and fused multi-GPU GEMM kernels, up to 2.6x faster than PyTorch + NCCL.

5

41

356

Tianqi Chen

@tqchenml

24 days

Check out how speculative decoding and #XGrammar can work together to get efficient and accurate structured outputs.

github.com

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR...

1

18

123

Matei Zaharia

@matei_zaharia

1 month

I gave a keynote at @VLDBconf about why we think it's time to rethink OLTP databases with Lakebase, which combines the cloud-native Postgres design of @neondatabase with Lakehouse. Reasons include the cloud, changing demands on DWs to be more real-time, and AI agents. Slides ⬇️

2

17

77

Tianqi Chen

@tqchenml

2 months

The new semester is here at CMU, excited to co-teach with @Tim_Dettmers, to offer our fun course again on "Build Your Mini-PyTorch (needle) from scratch, then build neural networks on top". (Deep Learning Systems) Check out https://t.co/EChWdUMVud to learn more

dlsyscourse.org

Algorithms and Implementation

11

55

510

Andy Pavlo (@andypavlo.bsky.social)

@andy_pavlo

2 months

Today is the new semester for @CMUDB's Intro to Database Systems! We're going harder into material than ever before. Projects are more challenging but you can use LLMs to help. We also have 10min talks each Wed from leading DB companies. Follow on YouTube:

15445.courses.cs.cmu.edu

You want to know whether this is the premier course at Carnegie Mellon University on the design and implementation of database management systems? Well, it is. This course rips through data models...

12

120

934

Zhihao Jia

@JiaZhihao

2 months

🚀Excited to share the #MLSys Call for Papers! For the first time, we’re also welcoming submissions to the Industrial Track. Research and industrial track deadline: Oct 30, 2025 Reviews available: Jan 12, 2026 Author responses: Jan 16, 2026 Notifications: Jan 25, 2026

Minjia Zhang

@_Minjia_Zhang_

2 months

Calling industry researchers: MLSys 2026 launches its first Industrial Track! 🚀 We're excited to announce the inaugural Call for Industrial Track Papers at MLSys 2026! 🎉 👉 https://t.co/vHDTSbtJvJ This is a unique opportunity for industry researchers and practitioners to

0

5

15

Lijie(Derrick) Yang

@LijieyYang

2 months

[1/N] 🚀 Excited to introduce my first work at @Princeton: LessIsMore – a training-free sparse attention method tailored for efficient reasoning in LRMs, achieving lossless accuracy with high sparsity up to 87.5% and 1.1x avg decoding speedup compared to Full Attention on

8

10

53

Zhihao Jia

@JiaZhihao

2 months

Surprising but true: with the right design, sparse attention can outperform full attention—while attending to far fewer tokens. 🚀

Lijie(Derrick) Yang

@LijieyYang

2 months

[1/N] 🚀 Excited to introduce my first work at @Princeton: LessIsMore – a training-free sparse attention method tailored for efficient reasoning in LRMs, achieving lossless accuracy with high sparsity up to 87.5% and 1.1x avg decoding speedup compared to Full Attention on

1

5

70

Zhihao Jia

@JiaZhihao

2 months

It’s a pleasure to contribute to this exciting direction at the intersection of GenAI and databases. Thanks for having me @pateljm.

Jignesh Patel

@pateljm

2 months

https://t.co/zldL82DlKg Excited to announce a unique partnership between @Informatica and @CMUDB to tackle current and future challenges in data exploration, understanding, cleaning, and analytics. Thanks @JiaZhihao for playing a pivotal role in this partnership. Looking forward

1

13

Jignesh Patel

@pateljm

2 months

https://t.co/zldL82DlKg Excited to announce a unique partnership between @Informatica and @CMUDB to tackle current and future challenges in data exploration, understanding, cleaning, and analytics. Thanks @JiaZhihao for playing a pivotal role in this partnership. Looking forward

na.magazine.intelligentcio.com

This month’s cover story, which can be found on p14, features Sean Kenny, Senior Vice President and Chief Information Officer, Carnival Cruise Line. He tells us how the company is partnering with DXC...

1

3

18

Song Han

@songhan_mit

3 months

NVILA is available in SGLang👏🏻

LMSYS Org

@lmsysorg

3 months

🚀Summer Fest Day 4: Turbocharging Vision-Language Models with SGLang + NVILA 4.4× throughput, 2.2× faster response time! We've integrated NVILA into SGLang, enabling high-performance, scalable serving of vision-language models. This unlocks a 4.4× TPS boost and significantly

1

11

36

Wentao Guo

@WentaoGuo7

3 months

🦆🚀QuACK🦆🚀: new SOL mem-bound kernel library without a single line of CUDA C++ all straight in Python thanks to CuTe-DSL. On H100 with 3TB/s, it performs 33%-50% faster than highly optimized libraries like PyTorch's torch.compile and Liger. 🤯 With @tedzadouri and @tri_dao

13

75

339

Reyna Abhyankar

@reyna_abhyankar

3 months

Computer-Use Agents (CUAs) are improving every day but take up to tens of minutes to complete simple tasks. We built OSWorld-Human, a benchmark that measures efficiency - a first step towards practical CUAs. Check out our blog post!

Yiying Zhang

@yiying__zhang

3 months

Computer-use AI agents (CUAs) are powerful, but way too slow. A 2-minute human task can take a CUA over 20 minutes! At Wuklab, we're building faster CUAs. Recently, we created OSWorld-Human, a new benchmark to close the speed gap between humans and machines. Read our full blog

0

2

4

Francis Y. Yan

@FrancisYan_

3 months

🚀 [OSDI ’25, Tue 11:10am] How do you “divide and conquer” large-scale resource allocation problems like GPU cluster scheduling or WAN traffic engineering? Our answer: “decouple and decompose” the underlying optimization using DeDe. (1/3)

4

5

51

NovaSky

@NovaSkyAI

4 months

✨Release: We upgraded SkyRL into a highly-modular, performant RL framework for training LLMs. We prioritized modularity—easily prototype new algorithms, environments, and training logic with minimal overhead. 🧵👇 Blog: https://t.co/jDvM95F0Bq Code: https://t.co/CWlKue79JH

2

46

204

Anjiang Wei

@anjiangw

4 months

We introduce CodeARC, a new benchmark for evaluating LLMs’ inductive reasoning. Agents must synthesize functions from I/O examples—no natural language, just reasoning. 📄 https://t.co/j5fFbgLjQJ 💻 https://t.co/6B7Ig1M1pG 🌐 https://t.co/AQIaeVCmT7 #LLM #Reasoning #LLM4Code #ARC

3

32

94

Jeff Dean

@JeffDean

4 months

Mark your calendars for #MLSys2026 in May, 2026 in Seattle. Submission deadline for papers is Oct 30 this year.

Zhihao Jia

@JiaZhihao

4 months

📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to

7

19

109

Tianqi Chen

@tqchenml

4 months

#MLSys2026 will be led by the general chair @luisceze and PC chairs @JiaZhihao and @achowdhery. The conference will be held in Bellevue on Seattle's east side. Consider submitting and bringing your latest works in AI and systems—more details at https://t.co/zFMHTxTXzp.

Zhihao Jia

@JiaZhihao

4 months

📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to

0

14

64

Zhihao Jia

@JiaZhihao

4 months

📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at https://t.co/14SJxiY21g. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to

2

31

107
