Tianqi Chen
@tqchenml
Followers 18K · Following 3K · Media 60 · Statuses 2K
AssistProf @CarnegieMellon. Distinguished Eng @NVIDIA. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNp4p, @TheASF. Views are my own
CMU
Joined May 2015
Excited to share what we have been working on over the past year: MLCEngine, a universal LLM deployment engine that brings the power of server optimizations and local deployment into a single framework. Check out the supported platforms 👇 and the blog post https://t.co/jYy1c4gTHq More in a 🧵
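The "one engine, many platforms" idea can be sketched as a toy facade. All class and method names below are hypothetical illustrations of the dispatch pattern, not MLCEngine's actual API (see the linked blog post for that):

```python
# Toy illustration of a "universal" engine facade: one API, multiple backends.
# Every name here is hypothetical; this is not MLCEngine's real interface.

class ServerBackend:
    def generate(self, prompt: str) -> str:
        return f"[server] completion for: {prompt}"

class LocalBackend:
    def generate(self, prompt: str) -> str:
        return f"[local] completion for: {prompt}"

class UniversalEngine:
    """Single deployment interface that dispatches to a platform-specific backend."""
    def __init__(self, target: str):
        self.backend = ServerBackend() if target == "server" else LocalBackend()

    def generate(self, prompt: str) -> str:
        return self.backend.generate(prompt)

engine = UniversalEngine(target="local")
print(engine.generate("hello"))
```

The point of the pattern is that application code only ever talks to the facade, so the same program can target a datacenter server or a local device.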
#MLSys2026 is looking for external review committee members. This is a great chance to engage with the community and contribute to the MLSys conference program.
#MLSys2026 is inviting self-nominations for the External Review Committee (ERC)! If you want to contribute to the review process for the MLSys conference, nominate yourself and help shape this year's program. We especially welcome PhD students and early-career researchers!
If you'd like to win your own Dell Pro Max with GB300: we're launching a new kernel competition with @NVIDIAAI @sestercegroup @Dell to optimize NVFP4 kernels on B200. 2025 has seen a tremendous rise of Pythonic kernel DSLs, and we have on-prem hardware for reliable ncu benchmarking.
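Reliable benchmarking is mostly discipline: warm up first, repeat, and report a robust statistic. A minimal host-side sketch of that discipline (real kernel profiling on B200 would use NVIDIA Nsight Compute, i.e. ncu; the harness below is only illustrative):

```python
# Minimal micro-benchmark harness: warmup runs, repeated timing, median report.
# Illustrates the benchmarking discipline only; it is not ncu and measures
# host-side wall time, not GPU kernel time.
import time
from statistics import median

def bench(fn, *args, warmup: int = 3, reps: int = 20) -> float:
    for _ in range(warmup):            # warm caches / lazy init before measuring
        fn(*args)
    times = []
    for _ in range(reps):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return median(times)               # median is robust to scheduling noise

elapsed = bench(sum, range(10_000))
print(f"median: {elapsed * 1e6:.1f} us")
```

The median (rather than the mean or minimum) damps outliers caused by OS scheduling jitter, which matters when runs are short.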
🧵We also personally strongly believe in this future, and think that the way we design compiler and runtime interfaces should reflect and amplify the trend. Our recent effort on an open ABI and FFI for ML systems is one step in this direction.
github.com
Open ABI and FFI for Machine Learning Systems. Contribute to apache/tvm-ffi development by creating an account on GitHub.
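The FFI idea in its most generic form: call a function behind a stable C ABI from a dynamic host language. A minimal sketch using Python's ctypes against the system math library (this is generic ctypes usage, not the tvm-ffi API, and it assumes a Unix-like system where libm can be located):

```python
# Generic FFI example: call C's cos() from Python across a stable C ABI.
# Not the tvm-ffi API; assumes a Unix-like system with a loadable libm.
import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.argtypes = [ctypes.c_double]   # declare the C signature explicitly
libm.cos.restype = ctypes.c_double      # ctypes defaults to int otherwise

print(libm.cos(0.0))  # prints 1.0
```

Declaring `argtypes`/`restype` is the ABI contract in miniature: both sides must agree on the calling convention and data layout, which is exactly what an open ML-systems ABI tries to standardize for tensors and functions.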
🧵This is perhaps the most important takeaway that the ML/AI mindset brings to compilers: compilers no longer just produce standalone, self-contained executables. They embed in Python, produce functions that work with torch.Tensors, and ship to automotive via TensorRT.
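A toy sketch of "compiler as an embedded library": compilation returns a plain host-language function instead of emitting an executable. The example is deliberately tiny and operates on Python lists; in real systems the returned function would accept torch.Tensors:

```python
# Toy "compiler" embedded in the host program: compile_elementwise() turns an
# expression string into a callable, rather than a standalone binary.
def compile_elementwise(expr: str):
    """Compile an elementwise expression over a variable `x` into a function on lists."""
    code = compile(expr, "<kernel>", "eval")
    def kernel(xs):
        return [eval(code, {"x": x}) for x in xs]
    return kernel

scale_shift = compile_elementwise("x * 2 + 1")
print(scale_shift([1, 2, 3]))  # prints [3, 5, 7]
```

The structural point is that the compiler's output composes with everything else in the process: it can be passed around, wrapped, and called like any other function.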
🧵The idea is not new; it comes from the same spirit as ML infra development: bring up toolkits, make them composable with the rest of the stack, and enable Pythonic development of specialized compilers for high-profile needs.
🧵As we start to see strong AI infra needs, generic solutions can hardly keep up with the hardware. Instead of building a silver bullet for everything, an alternative is to enable quick, agile development of specialized compiler solutions for important applications (e.g. LLM engines).
🧵ML/AI has always benefited from agility of development. The ability to develop models in Python unlocked rapid innovation in ML modeling.
⏰3 days left to submit to #MLSys2026 (deadline October 30)! Submit your best ML systems work to the Research and Industrial Tracks, and join the MLSys community in Seattle next May. 👉 https://t.co/z0va9DDnJE
Send your great work to #MLSys2026!
Wrote a 1-year retrospective with @a1zhang on KernelBench and the journey toward automated GPU/CUDA kernel generation! Since my labmates (@anneouyang, @simran_s_arora, @_williamhu) and I first started working towards this vision around last year’s @GPU_mode hackathon, we have
Together with the FlashInfer community, we built FlashInfer-Bench — a benchmark of real-world, AI system–driven GPU workloads — and, more importantly, an infrastructure and workflow to 0‑day ship AI‑generated kernels into production.
🚀Excited to launch FlashInfer Bench. We believe AI has the potential to help build LLM systems. To accelerate the path, we need an open schema for critical workloads and an AI-driven virtuous circle. First-class integration with FlashInfer, SGLang, and vLLM support 👉
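An "open schema for critical workloads" can be pictured as a small, serializable record describing each kernel workload. The field names below are purely illustrative guesses, not FlashInfer-Bench's real schema:

```python
# Hypothetical sketch of a workload record in the spirit of an open benchmark
# schema. Field names are illustrative, not the actual FlashInfer-Bench format.
import json
from dataclasses import dataclass, asdict

@dataclass
class KernelWorkload:
    op: str              # e.g. "paged_attention" (hypothetical op name)
    batch_size: int
    seq_len: int
    num_heads: int
    dtype: str           # e.g. "fp16"

    def to_json(self) -> str:
        return json.dumps(asdict(self))

w = KernelWorkload(op="paged_attention", batch_size=8,
                   seq_len=4096, num_heads=32, dtype="fp16")
print(w.to_json())
```

A shared, machine-readable format like this is what lets humans, CI, and kernel-writing agents all target the same workloads.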
Several of my team members and I are impacted by this layoff today. Feel free to connect :)
Exciting updates on DGX Spark: Now you can run gpt-oss-20b at 70 tokens/s with SGLang! This is 1.4x faster than what we got in our blog last week. We worked with the @NVIDIAAIDev team to fix a bunch of Triton and quantization issues. Cannot wait to see how much performance we
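A quick sanity check on the numbers in the announcement: 70 tokens/s at a 1.4x speedup implies the earlier blog's baseline was about 50 tokens/s:

```python
# Sanity check of the speedup claim: new_rate / speedup recovers the baseline.
new_rate = 70.0   # tokens/s reported with the fixes
speedup = 1.4     # claimed improvement over last week's blog
baseline = new_rate / speedup
print(round(baseline, 6))  # prints 50.0
```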
MoE exposes interesting opportunities to fully utilize heterogeneous hardware resources (CPU + GPU). The KTransformers team is upstreaming their cool optimizations into the SGLang stack to combine the best of both.
We're excited to announce the collaboration between KTransformers and SGLang! KTransformers has been a killer for local AI inference with its system-algorithm co-design, often showing 5x - 10x speedup. This integration equips SGLang with KTransformers’ inference strategy and
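One way to picture heterogeneous MoE serving is expert placement: keep the most frequently activated experts on the GPU and fall back to the CPU for cold ones. The greedy policy below is a hypothetical illustration of that idea, not KTransformers' actual algorithm:

```python
# Hypothetical greedy MoE expert placement: hottest experts on GPU, rest on CPU.
# Policy and names are illustrative only, not KTransformers' real strategy.
def place_experts(activation_counts: dict[str, int], gpu_slots: int) -> dict[str, str]:
    """Top-`gpu_slots` most-activated experts go to 'gpu', the rest to 'cpu'."""
    ranked = sorted(activation_counts, key=activation_counts.get, reverse=True)
    return {e: ("gpu" if i < gpu_slots else "cpu") for i, e in enumerate(ranked)}

counts = {"e0": 120, "e1": 5, "e2": 300, "e3": 40}
print(place_experts(counts, gpu_slots=2))  # e2 and e0 land on the GPU
```

Because only a few experts fire for any given token, keeping the hot set on fast memory and streaming the cold tail from the CPU can capture most of the speedup at a fraction of the GPU memory.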
FlashInfer Bench’s evaluation of kernels with real-world setups will accelerate development of kernels by both humans and agents - so cool! Can’t wait to see the advances that will come out of it.
Massive shout-out to the incredible team behind FlashInfer-Bench 🙌 @yi_xin_dong for the initial idea and architectural insights along the way; @JiangAlexander1 & @Yiyan_Zhai for amazing work on the dataset, benchmarks, and agents. Huge thanks to contributors @yongwwwml @ye_combinator