Siddhant Ray Profile
Siddhant Ray

@siddhantrayyy

Followers
131
Following
59
Media
2
Statuses
117

Mainly Networks and Systems. Some Machine Learning. PhD CS candidate @ UChicago. MSc. EEIT @ ETH Zurich. BTech. ECE @ VIT Vellore.

Joined March 2016
@siddhantrayyy
Siddhant Ray
25 days
RT @lmcache: LMCache supports gpt-oss (20B/120B) on Day 1! TTFT 1.20s → 0.39s (-67.5%), finish time 15.70s → 7.73s (-50.7%) compared to Va…
0
9
0
@siddhantrayyy
Siddhant Ray
1 month
RT @lmcache: Everyone is focused on faster LLM inference engines. But bigger potentials might be reached with what is beyond the engine. 🚀…
0
14
0
@siddhantrayyy
Siddhant Ray
2 months
RT @YichuanM: Big yes to this!
0
1
0
@siddhantrayyy
Siddhant Ray
2 months
This is joint work carried out between @UChicago / @lmcache (@astrogu_, @this_will_echo, @shaoting_feng, @JunchenJiang), @Princeton (@ruipeterpan, Ravi Netravali) and @microsoft (Ganesh Ananthanarayanan).
0
0
0
@siddhantrayyy
Siddhant Ray
2 months
With RAG and agents becoming ubiquitous in LLM systems, tuning quality and performance JOINTLY is essential to achieve the best LLM quality-of-experience. Our paper at SOSP this year addresses this exact tradeoff! 🔥
Tweet media one
1
6
16
@siddhantrayyy
Siddhant Ray
2 months
RT @this_will_echo: 🤯 Believe it or not, even when an LLM generates just ONE SINGLE word, it can still be powerful! Say in recommendation: …
0
7
0
@siddhantrayyy
Siddhant Ray
2 months
RT @lmcache: ๐—Ÿ๐— ๐—–๐—ฎ๐—ฐ๐—ต๐—ฒ ๐—ฟ๐—ฒ๐—ฎ๐—ฐ๐—ต๐—ฒ๐˜€ ๐Ÿฎ,๐Ÿฌ๐Ÿฌ๐Ÿฌ+ ๐˜€๐˜๐—ฎ๐—ฟ๐˜€ ๐—ผ๐—ป ๐—š๐—ถ๐˜๐—›๐˜‚๐—ฏ! ๐ŸŒŸ . A huge thank you to our open-source communityโ€”your support is fueling nextโ€‘gen effโ€ฆ.
0
5
0
@siddhantrayyy
Siddhant Ray
2 months
RT @AlwaysBiMySide: Even NVIDIA Dynamo thinks that letting LLM do prefill only is useful! Just sayin'. Our PrefillOnly paper might've been…
0
3
0
@siddhantrayyy
Siddhant Ray
3 months
RT @lmcache: 🚀 LMCache X @RedHat Official Collaboration. LMCache is now a founding supporter of Red Hat's new llm-d project for scalable di…
0
13
0
@siddhantrayyy
Siddhant Ray
3 months
RT @fxgst: Are you going to the World Computer Summit next week in Zurich? 🌎 Then don't miss the demo of ICP Ninja! 🥷 Sign up here: https:/…
0
6
0
@siddhantrayyy
Siddhant Ray
3 months
RT @RedHat_AI: LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed infer…
0
89
0
@siddhantrayyy
Siddhant Ray
4 months
RT @lmcache: 🚀 LMCache turbocharges vLLM, KServe & Dynamo! Our new blog reveals how this SOTA KV cache layer slashes LLM inference costs &…
0
2
0
@siddhantrayyy
Siddhant Ray
4 months
RT @lmcache: ๐Ÿš€๐— ๐—ผ๐—ผ๐—ป๐—ฐ๐—ฎ๐—ฐ๐—ธ๐—ฒ X ๐—Ÿ๐— ๐—–๐—ฎ๐—ฐ๐—ต๐—ฒ: KV Cache-centric Language Model Serving ๐Ÿš€. We're thrilled to announce a strategic collaboration betweenโ€ฆ.
0
10
0
@siddhantrayyy
Siddhant Ray
4 months
RT @lmcache: ๐Ÿš€ ๐—ง๐—ฒ๐—ป๐—ฐ๐—ฒ๐—ป๐˜ x ๐—Ÿ๐— ๐—–๐—ฎ๐—ฐ๐—ต๐—ฒ Collaboration: Integrating ๐— ๐—ผ๐—ผ๐—ป๐—ฐ๐—ฎ๐—ธ๐—ฒ Store for Enhanced LLM Inference Caching! ๐Ÿฅฎ๐Ÿฅฎ. Excited to share insightโ€ฆ.
0
9
0
@siddhantrayyy
Siddhant Ray
5 months
RT @lmcache: ๐Ÿš€ ๐—Ÿ๐— ๐—–๐—ฎ๐—ฐ๐—ต๐—ฒ Powers Up ๐˜ƒ๐—Ÿ๐—Ÿ๐—  ๐—ฉ๐Ÿญ: P/D Disaggregation & NIXL Support!. vLLM V1 revolutionized LLM serving, but lacked a dedicated KVโ€ฆ.
0
11
0
@siddhantrayyy
Siddhant Ray
5 months
RT @lmcache: ๐Ÿ† Exciting news from #EuroSys2025: Our work on CacheBlend won Best Paper! ๐Ÿš€. CacheBlend delivers the first-ever speedup for RAโ€ฆ.
0
8
0
@siddhantrayyy
Siddhant Ray
6 months
RT @JunchenJiang: 🔥🔥 LOTS of papers on improving LLM prefill, but they RARELY become standard in the industry and open-source community. Gi…
Tweet card summary image
github.com
vLLM's reference system for K8S-native cluster-wide deployment with community-driven performance optimization - vllm-project/production-stack
0
1
0
@siddhantrayyy
Siddhant Ray
6 months
Amazing effort, please check it out!
@lmcache
LMCache Lab
6 months
🚀 We're thrilled to announce vLLM Production Stack—an open-source, Enterprise-Grade LLM inference solution that is now an official first-party ecosystem project under vLLM! Why does this matter? A handful of companies focus on LLM training, but millions of apps and businesses
Tweet media one
0
0
1
@siddhantrayyy
Siddhant Ray
6 months
RT @lmcache: 🚀 Deploy your efficient LLM inference cluster on AWS & GCP in one command with Production-Stack! Check out the Blog (http…
0
3
0
@siddhantrayyy
Siddhant Ray
7 months
RT @lmcache: 🚀 Deploying LLMs in Clusters #1. Check out this step-by-step tutorial to deploy the vLLM Production Stack on a cloud VM for s…
0
7
0