Alexey Tumanov Profile
Alexey Tumanov

@alsched

Followers
548
Following
874
Media
5
Statuses
247

Assistant Professor of Computer Science @gatech_scs @gtcomputing | postdoc @Berkeley_EECS @ucbrise | ML Systems

Atlanta, GA
Joined December 2012
@alsched
Alexey Tumanov
24 days
RT @agrawalamey12: After hitting evaluation puzzles like this in our own work, we analyzed patterns across LLM inference papers and identif….
0
3
0
@alsched
Alexey Tumanov
27 days
RT @gatech_scs: Congratulations 👏 to our faculty who were recognized on the Spring 2025 CIOS Honor Roll for their outstanding teaching and….
0
1
0
@alsched
Alexey Tumanov
4 months
RT @agrawalamey12: Super excited to share another incredible system that we have built over the past two years! Training giant foundation….
0
13
0
@alsched
Alexey Tumanov
4 months
RT @agrawalamey12: Super long-context models with context window spanning millions of tokens are becoming commonplace (@GoogleDeepMind Gemi….
0
14
0
@alsched
Alexey Tumanov
4 months
RT @agrawalamey12: Maya offers a transparent, accurate, and efficient way to model and optimize large-scale DL training without needing exp….
arxiv.org
Training large foundation models costs hundreds of millions of dollars, making deployment optimization critical. Current approaches require machine learning engineers to manually craft training...
0
1
0
@alsched
Alexey Tumanov
7 months
RT @agrawalamey12: Sequence pipeline parallelism being rapidly adopted for extreme long context inference in the industry! Checkout our pap….
0
4
0
@alsched
Alexey Tumanov
9 months
RT @ACMSoCC: At SoCC’24, Anastasia Ailamaki from EPFL will give a keynote on how disaggregated memory resources are becoming the norm and h….
0
1
0
@alsched
Alexey Tumanov
10 months
Super-charged technical program this year at @ACMSoCC. Looking forward! Hope to see you there! #socc24
@ACMSoCC
ACM SoCC
10 months
We are just under a month away from SoCC’24! This year’s conference will be from Nov 20-22 at the Microsoft Campus in Redmond, WA. Early bird registration is now open until Nov 6. Make sure to register!
0
0
4
@alsched
Alexey Tumanov
10 months
RT @agrawalamey12: ⚡ Speed Meets Accuracy: Unlike approximation-based methods, Mnemosyne achieves exact inference—ensuring that the genera….
0
2
0
@alsched
Alexey Tumanov
10 months
RT @agrawalamey12: @Google has silently but surely developed an edge over @OpenAI. Long context processing seems to be the key to Google's….
0
4
0
@alsched
Alexey Tumanov
11 months
First publicly known support for LLM context of up to 10M tokens with high throughput & interactive production-grade TBT SLOs (30ms) with Mnemosyne. What would it take to pair program with GenAI on millions of LoC? Or analyze 10/110hrs of video/audio content? All precisely!
0
0
10
@alsched
Alexey Tumanov
11 months
Thanks to everyone who responded. We've officially concluded the ACM #sosp24 artifact evaluation process. This was a great experience, and we're eternally grateful for all the volunteer effort by the AEC reviewers.
0
0
2
@alsched
Alexey Tumanov
1 year
I'm serving as the #SOSP24 AEC Chair. We're still looking for artifact evaluation reviewers. AE is indispensable to systems research and a valuable experience. Grad students and early-career researchers welcome! Expected load: 2 artifacts. Self-nominate!
sysartifacts.github.io
We are looking for members of the Artifact Evaluation Committee (AEC), who will contribute to SOSP’24 Artifact Evaluation (AE) process by reviewing submitted artifacts. AEC membership is especially...
2
13
22
@alsched
Alexey Tumanov
1 year
Let's set the standard for the interactive performance of LLMs by capturing the nuances of user experience. While the latency/throughput tension is well known to the systems community, latency jitter is less explored. Fluidity index and fluid token generation rate more aptly capture LLM performance.
@agrawalamey12
Amey Agrawal
1 year
🚀 Introducing Metron: Redefining LLM Serving Benchmarks! 📊 Tired of misleading metrics for LLM performance? Our new paper introduces a holistic framework that captures what really matters - the user experience! 🧠💬 #LLM #AI #Benchmark
0
0
6
@alsched
Alexey Tumanov
1 year
Really proud of my PhD student's work developing a new mechanism and policy that significantly improves tail-latency performance in Large Language Model (LLM) inference without sacrificing throughput. It has already received 10+ citations; the source is OSS and adopted in industry.
@agrawalamey12
Amey Agrawal
1 year
Did you ever feel that @chatgpt is done generating your response, and then suddenly a burst of tokens shows up? This happens when the serving system prioritizes someone else's request before generating your response. But why? Well, to reduce cost. 🧵
0
0
8