Zihao Ye Profile
Zihao Ye

@ye_combinator

Followers: 2K
Following: 2K
Media: 16
Statuses: 134

Proud to be an engineer. I'm building flashinfer (https://t.co/PabCM3l09l)

Seattle
Joined October 2017
@ye_combinator
Zihao Ye
10 days
RT @JokerEph: I’ve been starting to collaborate with the folks who are building FlashInfer: nice project and pretty amazing set of people!….
0
3
0
@ye_combinator
Zihao Ye
19 days
RT @zhyncs42: SGLang is an early user of FlashInfer and witnessed its rise as the de facto LLM inference kernel library. It won best paper….
0
14
0
@ye_combinator
Zihao Ye
19 days
RT @NVIDIAAIDev: 🔍 Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live ➡️ Accelerate LLM….
0
27
0
@ye_combinator
Zihao Ye
19 days
RT @InfiniAILab: 🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multivers….
0
78
0
@ye_combinator
Zihao Ye
28 days
RT @GPU_MODE: Been excited about this talk for a while, @SonglinYang4 on efficient architecture! Just started!
0
27
0
@ye_combinator
Zihao Ye
28 days
RT @__tensorcore__: Another 🔥 blog about CUTLASS from @colfaxintl, this time focusing on the gory details of block-scaled MXFP and NVFP dat….
0
35
0
@ye_combinator
Zihao Ye
30 days
RT @HanGuo97: We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between?. I….
0
191
0
@ye_combinator
Zihao Ye
1 month
RT @xieenze_jr: 🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥 . Key Features🌟 .- Block-Wise KV Cache . R….
0
34
0
@ye_combinator
Zihao Ye
2 months
RT @tri_dao: I love Cutlass, and this new Python DSL looks very well-designed. Will for sure accelerate kernel dev + exploring new ideas in….
0
25
0
@ye_combinator
Zihao Ye
2 months
RT @NVIDIAHPCDev: 🎉CUTLASS 4.0 is here-bringing native #Python support for device-side kernel design, for ops like GEMM, Flash Attention, a….
0
36
0
@ye_combinator
Zihao Ye
2 months
RT @__tensorcore__: 🚨🔥 CUTLASS 4.0 is released 🔥🚨. pip install nvidia-cutlass-dsl. 4.0 marks a major shift for CUTLASS: towards native GPU….
0
82
0
@ye_combinator
Zihao Ye
2 months
We’re thrilled that FlashInfer won a Best Paper Award at MLSys 2025! 🎉 This wouldn’t have been possible without the community: huge thanks to @lmsysorg’s sglang for deep co-design (which is critical for inference kernel evolution) and stress-testing over the years, and to.
@NVIDIAAIDev
NVIDIA AI Developer
2 months
🎉 Congratulations to the FlashInfer team – their technical paper, "FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving," just won best paper at #MLSys2025. 🏆. 🙌 We are excited to share that we are now backing FlashInfer – a supporter and.
15
37
230
@ye_combinator
Zihao Ye
2 months
RT @JoyChew_d: Super excited to release FlexAttention for Inference with a decoding backend, GQA, PagedAttention, trainable bias and more!….
0
7
0
@ye_combinator
Zihao Ye
2 months
RT @tqchenml: If you are around in the Bay Area, make sure to attend the #MLSys2025 keynote tomorrow by @soumithchintala at the Santa Clara….
0
8
0
@ye_combinator
Zihao Ye
2 months
RT @yi_xin_dong: We are hosting a happy hour with @lmsysorg at #mlsys2025! Join us for engaging talks on SGLang, the structured generation….
0
12
0
@ye_combinator
Zihao Ye
2 months
RT @NovaSkyAI: 1/N Introducing SkyRL-v0, our RL training pipeline enabling efficient RL training for long-horizon, real-environment tasks l….
0
70
0
@ye_combinator
Zihao Ye
2 months
RT @DeeplyIgnorant: 🚀 We released Triton-distributed! 🌟.Build compute-comm. overlapping kernels for GPUs—performance rivals optimized libra….
0
10
0
@ye_combinator
Zihao Ye
3 months
RT @hyhieu226: Their content always comes out in great quantity and quality ❤️.
0
19
0
@ye_combinator
Zihao Ye
3 months
RT @abcdabcd987: Lower latency and Higher throughput -- Get both with multi-node deployment for MoE models like DeepSeek-V3/R1.
0
8
0
@ye_combinator
Zihao Ye
3 months
RT @Tim_Dettmers: Happy to announce that I joined the CMU Catalyst with three of my incoming students. Our research will bring the best m….
0
57
0