Tianqi Chen Profile
Tianqi Chen

@tqchenml

Followers
18K
Following
3K
Media
59
Statuses
1K

AssistProf @CarnegieMellon. Chief Technologist @OctoML. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNWTX, @TheASF. Views are my own

CMU
Joined May 2015
@tqchenml
Tianqi Chen
1 year
Excited to share what we have been working on over the past year: MLCEngine, a universal LLM deployment engine that brings the power of server optimizations and local deployment into a single framework. Check out the platform support šŸ‘‡ and the blog post; more in a 🧵
7
57
249
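For context on the announcement above: MLC LLM exposes MLCEngine through an OpenAI-style Python API. Below is a minimal sketch based on the project's public quickstart; the model ID is an assumption, and any model from the MLC catalog would work in its place.

```python
# Minimal MLCEngine sketch, following MLC LLM's public quickstart.
# The model ID below is an assumption; substitute any MLC-packaged model.
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# OpenAI-style chat completion, streamed token by token.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is MLCEngine?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)

engine.terminate()
```

The same engine code path is what backs the server and local deployments the tweet describes; only the compilation target changes per platform.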
@tqchenml
Tianqi Chen
9 days
RT @BanghuaZ: Excited to share that I’m joining NVIDIA as a Principal Research Scientist!. We’ll be joining forces on efforts in model post….
0
106
0
@tqchenml
Tianqi Chen
11 days
RT @JeffDean: Mark your calendars for #MLSys2026 in May 2026 in Seattle. Submission deadline for papers is Oct 30 this year.
0
15
0
@tqchenml
Tianqi Chen
11 days
#MLSys2026 will be led by general chair @luisceze and PC chairs @JiaZhihao and @achowdhery. The conference will be held in Bellevue, on Seattle's east side. Consider submitting your latest work in AI and systems; more details at
@JiaZhihao
Zhihao Jia
11 days
šŸ“¢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to
0
12
57
@tqchenml
Tianqi Chen
11 days
RT @JiaZhihao: šŸ“¢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at We….
0
30
0
@tqchenml
Tianqi Chen
12 days
RT @JokerEph: I’ve been starting to collaborate with the folks who are building FlashInfer: nice project and pretty amazing set of people!….
0
3
0
@tqchenml
Tianqi Chen
15 days
RT @chrisdonahuey: Excited to announce šŸŽµMagenta RealTime, the first open weights music generation model capable of real-time audio generati….
0
80
0
@tqchenml
Tianqi Chen
17 days
RT @JiaZhihao: One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But….
0
120
0
@tqchenml
Tianqi Chen
20 days
RT @Xinyu2ML: šŸš€ Super excited to share Multiverse!. šŸƒ It’s been a long journey exploring the space between model design and hardware effici….
0
18
0
@tqchenml
Tianqi Chen
20 days
RT @BeidiChen: Say hello to Multiverse — the Everything Everywhere All At Once of generative modeling. šŸ’„ Lossless, adaptive, and gloriousl….
0
21
0
@tqchenml
Tianqi Chen
20 days
Check out our work on parallel reasoning 🧠; We build an AI-assisted curator that identifies parallel paths in sequential reasoning traces, then tune models into native parallel thinkers that run efficiently with prefix sharing and batching. Really excited about this general direction.
@InfiniAILab
Infini-AI-Lab
20 days
šŸ”„ We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. šŸš€ Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: 🧵 1/n
1
15
98
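To make "prefix sharing" concrete: parallel branches all condition on the same prompt, so the engine can encode that shared prefix once and batch only the branch-specific continuations. The toy sketch below uses hypothetical names (it is not the Multiverse codebase) and simply counts token encodes to show the amortization.

```python
# Toy illustration of prefix sharing across parallel reasoning branches.
# All names are hypothetical; a real engine shares the prefix's KV cache
# on the GPU and runs the branches as one batch.
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Stand-in for a transformer KV cache: tracks which tokens were encoded."""
    tokens: list = field(default_factory=list)
    encode_calls: int = 0

    def extend(self, new_tokens: list) -> None:
        self.encode_calls += len(new_tokens)  # proxy for compute cost
        self.tokens.extend(new_tokens)

def run_with_prefix_sharing(prefix: list, branches: list) -> int:
    """Encode the shared prefix once, then pay only per-branch tokens."""
    shared = KVCache()
    shared.extend(prefix)                      # paid once for all branches
    cost = shared.encode_calls
    for branch in branches:                    # batched together in practice
        fork = KVCache(tokens=list(shared.tokens))  # reuse prefix, no re-encode
        fork.extend(branch)
        cost += fork.encode_calls
    return cost

prefix = ["solve", "this", "problem", ":"]
branches = [["path", "A"], ["path", "B"], ["path", "C"]]
shared_cost = run_with_prefix_sharing(prefix, branches)
naive_cost = sum(len(prefix) + len(b) for b in branches)  # prefix re-encoded per branch
print(f"shared: {shared_cost} token encodes vs naive: {naive_cost}")
```

The gap grows with the prefix length and branch count, which is why native parallel thinkers batch well.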
@tqchenml
Tianqi Chen
20 days
RT @InfiniAILab: šŸ”„ We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. šŸš€ Multivers….
0
78
0
@tqchenml
Tianqi Chen
20 days
RT @lmsysorg: NVIDIAšŸ¤—SGLangšŸš€.
0
2
0
@tqchenml
Tianqi Chen
20 days
RT @NVIDIAAIDev: .@lmsysorg (SGLang) now achieves 7,583 tokens per second per GPU running @deepseek_ai R1 on the GB200 NVL72, a 2.7x leap o….
0
36
0
@tqchenml
Tianqi Chen
20 days
RT @lmsysorg: The SGLang team just ran DeepSeek 671B on NVIDIA’s GB200 NVL72, unlocking 7,583 toks/sec/GPU for decoding w/ PD disaggregatio….
0
23
0
@tqchenml
Tianqi Chen
20 days
RT @zhyncs42: SGLang is an early user of FlashInfer and witnessed its rise as the de facto LLM inference kernel library. It won best paper….
0
14
0
@tqchenml
Tianqi Chen
20 days
Check out the technical deep dive on FlashInfer.
@NVIDIAAIDev
NVIDIA AI Developer
20 days
šŸ” Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live āž”ļø Accelerate LLM inference with FlashInfer—NVIDIA’s high-performance, JIT-compiled library built for ultra-efficient transformer inference on GPUs. Go under the hood with
0
4
28
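For a taste of what the deep dive covers: FlashInfer's documentation shows a single-request decode entry point for attention over a KV cache. A minimal sketch below, assuming the `flashinfer.single_decode_with_kv_cache` API and shape convention from its docs, plus a CUDA GPU with PyTorch installed.

```python
# Minimal FlashInfer decode-attention sketch; API name and tensor shapes
# are taken from FlashInfer's documentation and assumed here.
import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim, kv_len = 32, 32, 128, 2048

# One query token (decode step) attending over a 2048-token KV cache.
q = torch.randn(num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

# JIT-compiled decode kernel; output has shape [num_qo_heads, head_dim].
o = flashinfer.single_decode_with_kv_cache(q, k, v)
print(o.shape)
```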
@tqchenml
Tianqi Chen
20 days
RT @NVIDIAAIDev: šŸ” Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live āž”ļø Accelerate LLM….
0
27
0
@tqchenml
Tianqi Chen
22 days
RT @rsalakhu: Holy cow! It has been over 10 years - no way! Feels like I was giving this tutorial just a few years ago.
0
6
0
@tqchenml
Tianqi Chen
24 days
RT @matei_zaharia: Excited to launch Agent Bricks, a new way to build auto-optimized agents on your tasks. Agent Bricks uniquely takes a *d….
0
45
0
@tqchenml
Tianqi Chen
24 days
RT @yi_xin_dong: @databricks 's Agent Bricks is powered by XGrammar for structured generation, and achieves high quality and efficiency. It….
0
4
0
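As background on the structured generation mentioned above: engines in this space constrain decoding so the output always matches a grammar or JSON schema, by masking disallowed tokens at every step. The toy sketch below illustrates that idea only; it is hypothetical code, not the XGrammar API, which compiles grammars to automata and applies GPU token bitmasks rather than a Python loop.

```python
# Toy grammar-constrained decoding: mask logits of tokens the grammar
# forbids, then sample from what remains. Purely illustrative code.
import math
import random

# Toy vocabulary and "grammar" for the JSON object {"name": "Alice"|"Bob"}.
VOCAB = ["{", "}", '"name"', ":", '"Alice"', '"Bob"', "hello"]

def allowed_tokens(generated: list) -> set:
    """Hypothetical grammar oracle: which tokens may come next?"""
    steps = [{"{"}, {'"name"'}, {":"}, {'"Alice"', '"Bob"'}, {"}"}]
    return steps[len(generated)] if len(generated) < len(steps) else set()

def constrained_sample(logits: dict, generated: list):
    """Drop disallowed tokens (mask to -inf), softmax-sample the rest."""
    allowed = allowed_tokens(generated)
    if not allowed:
        return None  # grammar complete: stop decoding
    weights = {t: math.exp(l) for t, l in logits.items() if t in allowed}
    r, acc = random.random() * sum(weights.values()), 0.0
    for tok, w in weights.items():
        acc += w
        if r <= acc:
            return tok
    return next(iter(weights))

generated = []
while True:
    fake_logits = {t: random.gauss(0.0, 1.0) for t in VOCAB}  # stand-in model
    tok = constrained_sample(fake_logits, generated)
    if tok is None:
        break
    generated.append(tok)

print("".join(generated))  # always a syntactically valid JSON object
```

The output is guaranteed well-formed regardless of what the (here, random) model scores prefer, which is the efficiency-plus-quality property the tweet alludes to.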