Tianqi Chen Profile
Tianqi Chen

@tqchenml

Followers
18K
Following
3K
Media
59
Statuses
1K

AssistProf @CarnegieMellon. Chief Technologist @OctoML. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNWTX, @TheASF. Views are my own

CMU
Joined May 2015
@tqchenml
Tianqi Chen
1 year
Excited to share what we have been working on over the past year: MLCEngine, a universal LLM deployment engine that brings the power of server optimizations and local deployment into a single framework. Check out the platform support šŸ‘‡ and the blog post; more in a 🧵
7
57
249
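For context on the announcement above: MLC LLM exposes MLCEngine through an OpenAI-style Python API. Below is a minimal sketch based on the project's public quickstart; the model ID is an assumption, and any model from the MLC catalog would work in its place.

```python
# Minimal MLCEngine sketch, following MLC LLM's public quickstart.
# The model ID below is an assumption; substitute any MLC-packaged model.
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# OpenAI-style chat completion, streamed token by token.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is MLCEngine?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)

engine.terminate()
```

The same engine code path is what backs the server and local deployments the tweet describes; only the compilation target changes per platform.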
@tqchenml
Tianqi Chen
9 days
RT @BanghuaZ: Excited to share that I’m joining NVIDIA as a Principal Research Scientist!. We’ll be joining forces on efforts in model post….
0
106
0
@tqchenml
Tianqi Chen
11 days
RT @JeffDean: Mark your calendars for #MLSys2026 in May 2026 in Seattle. Submission deadline for papers is Oct 30 this year.
0
15
0
@tqchenml
Tianqi Chen
11 days
#MLSys2026 will be led by general chair @luisceze and PC chairs @JiaZhihao and @achowdhery. The conference will be held in Bellevue, on Seattle's east side. Consider submitting your latest work in AI and systems; more details at
@JiaZhihao
Zhihao Jia
11 days
šŸ“¢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to
0
12
57
@tqchenml
Tianqi Chen
11 days
RT @JiaZhihao: šŸ“¢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at We….
0
30
0
@tqchenml
Tianqi Chen
12 days
RT @JokerEph: I’ve been starting to collaborate with the folks who are building FlashInfer: nice project and pretty amazing set of people!….
0
3
0
@tqchenml
Tianqi Chen
15 days
RT @chrisdonahuey: Excited to announce šŸŽµMagenta RealTime, the first open weights music generation model capable of real-time audio generati….
0
80
0
@tqchenml
Tianqi Chen
17 days
RT @JiaZhihao: One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But….
0
120
0
@tqchenml
Tianqi Chen
20 days
RT @Xinyu2ML: šŸš€ Super excited to share Multiverse!. šŸƒ It’s been a long journey exploring the space between model design and hardware effici….
0
18
0
@tqchenml
Tianqi Chen
20 days
RT @BeidiChen: Say hello to Multiverse — the Everything Everywhere All At Once of generative modeling. šŸ’„ Lossless, adaptive, and gloriousl….
0
21
0
@tqchenml
Tianqi Chen
20 days
Check out our work on parallel reasoning 🧠; We build an AI-assisted curator that identifies parallel paths in sequential reasoning traces, then tune models into native parallel thinkers that run efficiently with prefix sharing and batching. Really excited about this general direction.
@InfiniAILab
Infini-AI-Lab
20 days
šŸ”„ We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. šŸš€ Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: 🧵 1/n
1
15
98
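To make "prefix sharing" concrete: parallel branches all condition on the same prompt, so the engine can encode that shared prefix once and batch only the branch-specific continuations. The toy sketch below uses hypothetical names (it is not the Multiverse codebase) and simply counts token encodes to show the amortization.

```python
# Toy illustration of prefix sharing across parallel reasoning branches.
# All names are hypothetical; a real engine shares the prefix's KV cache
# on the GPU and runs the branches as one batch.
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Stand-in for a transformer KV cache: tracks which tokens were encoded."""
    tokens: list = field(default_factory=list)
    encode_calls: int = 0

    def extend(self, new_tokens: list) -> None:
        self.encode_calls += len(new_tokens)  # proxy for compute cost
        self.tokens.extend(new_tokens)

def run_with_prefix_sharing(prefix: list, branches: list) -> int:
    """Encode the shared prefix once, then pay only per-branch tokens."""
    shared = KVCache()
    shared.extend(prefix)                      # paid once for all branches
    cost = shared.encode_calls
    for branch in branches:                    # batched together in practice
        fork = KVCache(tokens=list(shared.tokens))  # reuse prefix, no re-encode
        fork.extend(branch)
        cost += fork.encode_calls
    return cost

prefix = ["solve", "this", "problem", ":"]
branches = [["path", "A"], ["path", "B"], ["path", "C"]]
shared_cost = run_with_prefix_sharing(prefix, branches)
naive_cost = sum(len(prefix) + len(b) for b in branches)  # prefix re-encoded per branch
print(f"shared: {shared_cost} token encodes vs naive: {naive_cost}")
```

The gap grows with the prefix length and branch count, which is why native parallel thinkers batch well.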
@tqchenml
Tianqi Chen
20 days
RT @InfiniAILab: šŸ”„ We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. šŸš€ Multivers….
0
78
0
@tqchenml
Tianqi Chen
20 days
RT @lmsysorg: NVIDIAšŸ¤—SGLangšŸš€.
0
2
0
@tqchenml
Tianqi Chen
20 days
RT @NVIDIAAIDev: .@lmsysorg (SGLang) now achieves 7,583 tokens per second per GPU running @deepseek_ai R1 on the GB200 NVL72, a 2.7x leap o….
0
36
0
@tqchenml
Tianqi Chen
20 days
RT @lmsysorg: The SGLang team just ran DeepSeek 671B on NVIDIA’s GB200 NVL72, unlocking 7,583 toks/sec/GPU for decoding w/ PD disaggregatio….
0
23
0
@tqchenml
Tianqi Chen
20 days
RT @zhyncs42: SGLang is an early user of FlashInfer and witnessed its rise as the de facto LLM inference kernel library. It won best paper….
0
14
0
@tqchenml
Tianqi Chen
20 days
Check out the technical deep dive on FlashInfer.
@NVIDIAAIDev
NVIDIA AI Developer
20 days
šŸ” Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live āž”ļø Accelerate LLM inference with FlashInfer—NVIDIA’s high-performance, JIT-compiled library built for ultra-efficient transformer inference on GPUs. Go under the hood with
0
4
28
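For a taste of what the deep dive covers: FlashInfer's documentation shows a single-request decode entry point for attention over a KV cache. A minimal sketch below, assuming the `flashinfer.single_decode_with_kv_cache` API and shape convention from its docs, plus a CUDA GPU with PyTorch installed.

```python
# Minimal FlashInfer decode-attention sketch; API name and tensor shapes
# are taken from FlashInfer's documentation and assumed here.
import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim, kv_len = 32, 32, 128, 2048

# One query token (decode step) attending over a 2048-token KV cache.
q = torch.randn(num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

# JIT-compiled decode kernel; output has shape [num_qo_heads, head_dim].
o = flashinfer.single_decode_with_kv_cache(q, k, v)
print(o.shape)
```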
@tqchenml
Tianqi Chen
20 days
RT @NVIDIAAIDev: šŸ” Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live āž”ļø Accelerate LLM….
0
27
0
@tqchenml
Tianqi Chen
22 days
RT @rsalakhu: Holy cow! It has been over 10 years - no way! Feels like I was giving this tutorial just a few years ago.
0
6
0
@tqchenml
Tianqi Chen
24 days
RT @matei_zaharia: Excited to launch Agent Bricks, a new way to build auto-optimized agents on your tasks. Agent Bricks uniquely takes a *d….
0
45
0
@tqchenml
Tianqi Chen
24 days
RT @yi_xin_dong: @databricks 's Agent Bricks is powered by XGrammar for structured generation, and achieves high quality and efficiency. It….
0
4
0
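As background on the structured generation mentioned above: engines in this space constrain decoding so the output always matches a grammar or JSON schema, by masking disallowed tokens at every step. The toy sketch below illustrates that idea only; it is hypothetical code, not the XGrammar API, which compiles grammars to automata and applies GPU token bitmasks rather than a Python loop.

```python
# Toy grammar-constrained decoding: mask logits of tokens the grammar
# forbids, then sample from what remains. Purely illustrative code.
import math
import random

# Toy vocabulary and "grammar" for the JSON object {"name": "Alice"|"Bob"}.
VOCAB = ["{", "}", '"name"', ":", '"Alice"', '"Bob"', "hello"]

def allowed_tokens(generated: list) -> set:
    """Hypothetical grammar oracle: which tokens may come next?"""
    steps = [{"{"}, {'"name"'}, {":"}, {'"Alice"', '"Bob"'}, {"}"}]
    return steps[len(generated)] if len(generated) < len(steps) else set()

def constrained_sample(logits: dict, generated: list):
    """Drop disallowed tokens (mask to -inf), softmax-sample the rest."""
    allowed = allowed_tokens(generated)
    if not allowed:
        return None  # grammar complete: stop decoding
    weights = {t: math.exp(l) for t, l in logits.items() if t in allowed}
    r, acc = random.random() * sum(weights.values()), 0.0
    for tok, w in weights.items():
        acc += w
        if r <= acc:
            return tok
    return next(iter(weights))

generated = []
while True:
    fake_logits = {t: random.gauss(0.0, 1.0) for t in VOCAB}  # stand-in model
    tok = constrained_sample(fake_logits, generated)
    if tok is None:
        break
    generated.append(tok)

print("".join(generated))  # always a syntactically valid JSON object
```

The output is guaranteed well-formed regardless of what the (here, random) model scores prefer, which is the efficiency-plus-quality property the tweet alludes to.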