Baris Kasikci
@bariskasikci
Followers: 5K · Following: 11K · Media: 152 · Statuses: 4K
Professor @uwcse; previously Morris Wellman Professor of EECS @UMichCSE, @Google, @MSFTResearch
Seattle, WA
Joined March 2010
Overload control is usually built around a bad assumption. Most systems watch global signals like queue length or tail latency and react at the front door by throttling new arrivals or dropping random requests. This works when CPU or network is the bottleneck. It fails when the
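The front-door pattern the post describes can be sketched as a simple queue-length gate (a toy illustration of the approach being critiqued; the class name and threshold policy here are assumptions, not any real system's API):

```python
from collections import deque

class FrontDoorThrottle:
    """Admission control driven by one global signal (queue length):
    reject new arrivals once the queue crosses a fixed threshold.
    This is the 'react at the front door' pattern the post critiques."""

    def __init__(self, max_queue_len: int):
        self.max_queue_len = max_queue_len
        self.queue = deque()

    def admit(self, request) -> bool:
        # Global signal: current queue length.
        if len(self.queue) >= self.max_queue_len:
            return False  # shed load at the front door
        self.queue.append(request)
        return True

    def complete_one(self) -> None:
        # A worker finished a request; free one queue slot.
        if self.queue:
            self.queue.popleft()
```

As the post notes, this works when CPU or network is the bottleneck, because queue length then tracks the actual scarce resource.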
🚀 Join us at the Paul G. Allen School of Computer Science & Engineering at The University of Washington! We’re hiring tenure-track faculty at all ranks across computer science and computer engineering. 📢 You’ll engage with outstanding, motivated students and colleagues in one
I will be attending #EMNLP2025 this week to present LiteASR, a compression method for speech encoders (a collaborative work with @kotoba_tech). Catch our poster at the first poster session on Wednesday morning. Happy to chat about efficiency, speech, or both!
🚀 Presenting LiteASR: a method that halves the compute cost of speech encoders by leveraging low-rank approximation of activations. LiteASR is accepted to #EMNLP2025 (main) @emnlpmeeting
(please reshare) I'm recruiting multiple PhD students and Postdocs @uwcse @uwnlp ( https://t.co/I5wQsFnCLL). Focus areas include psychosocial AI simulation and safety, and human-AI collaboration. PhD: https://t.co/ku40wCrpYh Postdocs: https://t.co/K9HUIPJ5h6
UTCS is hiring in all areas, including PL! Please DM me if you are on the job market this year and interested in joining our wonderful department :)
Can we please build a printer that works reliably and doesn’t jam before we get to Skynet?
A stunningly broad coalition has come out against Skynet: AI researchers, faith leaders, business pioneers, policymakers, NatSec folks and actors stand together, from Bannon & Beck to Hinton, Wozniak & Prince Harry. We stand together because we want a human future.
🤔 Can AI optimize the systems it runs on? 🚀 Introducing FlashInfer-Bench, a workflow that makes AI systems self-improving with agents: - Standardized signature for LLM serving kernels - Implement kernels with your preferred language - Benchmark them against real-world serving
My team is hiring AI research interns for summer 2026 at Databricks! Join us to learn about AI use cases at thousands of companies, and contribute to making it easier for anyone to build specialized AI agents and models for difficult tasks.
LLMc is open-source ( https://t.co/OSqM6Q2mMX)! We’re excited to see the community build on it. Try it out and let us know what you think! (4/4) P.S. props to @cHHillee and other great folks at @thinkymachines for working on deterministic LLM inference, which made LLMc possible
github.com: A language-model–powered compressor for natural language text (uw-syfi/LLMc)
Benchmarks show LLMc achieves state-of-the-art compression ratios, outperforming Gzip and LZMA on natural language text. To manage the quadratic complexity of LLM inference, it processes text in chunks, improving performance and GPU utilization. (3/4)
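The chunking step can be sketched in a few lines (a toy illustration, not LLMc's actual pipeline; the chunk length is an arbitrary parameter here):

```python
def chunked(tokens, chunk_len):
    """Split a long token stream into fixed-length chunks so each model
    call attends over at most chunk_len tokens, bounding the quadratic
    attention cost; independent chunks can also be batched on the GPU."""
    return [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]
```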
The connection between LLMs and compression is strong: a model that accurately predicts the next token is an optimal compressor. LLMc uses this principle with rank-based encoding, storing a token’s rank in the LLM’s output distribution instead of the token itself for a compact representation. (2/4)
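A minimal sketch of rank-based encoding, with a stand-in `predict` function in place of an LLM (the function name and interface are assumptions for illustration, not LLMc's implementation):

```python
def rank_encode(tokens, predict):
    """For each position, store the rank of the true next token in the
    model's predicted ordering instead of the token itself. A good
    model puts the true token at rank 0 most of the time, so the rank
    stream is highly skewed and compresses well with an entropy coder.
    `predict(prefix)` returns candidate tokens ordered most- to
    least-likely (a stand-in for an LLM's output distribution)."""
    ranks = []
    for i, tok in enumerate(tokens):
        ordering = predict(tokens[:i])
        ranks.append(ordering.index(tok))
    return ranks

def rank_decode(ranks, predict):
    """Decoding reruns the same model: each stored rank selects one
    token from the predicted ordering, so the roundtrip is lossless."""
    tokens = []
    for r in ranks:
        ordering = predict(tokens)
        tokens.append(ordering[r])
    return tokens
```

Because encoder and decoder must see identical model outputs, deterministic LLM inference (mentioned in the last tweet of this thread) is what makes the roundtrip reliable.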
How can LLMs beat traditional compression? ⚙️ Introducing LLMc — a lossless compressor built with LLMs. LLMc leverages the predictive power of LLMs to beat traditional compressors like Gzip and LZMA on natural language text. (1/4) 🔗 Blog Post: https://t.co/5ppAqBSTTh 💻 Code:
VoxServe is open source and already supports models including CSM, Orpheus, Zonos, GLM-Voice, and Step-Audio-2, with more coming. Try it via `pip install vox-serve`, and we’d love to hear your feedback! (4/4) https://t.co/3uBRHE9K64
github.com: Serving System for SpeechLMs (vox-serve/vox-serve)
VoxServe also introduces a scheduling algorithm for various scenarios, optimizing for the performance metrics that matter: in online settings, it minimizes Time-To-First-Audio latency while satisfying streaming needs; in offline settings, it optimizes end-to-end throughput.
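The two objectives can be illustrated with a toy scheduling policy (the field names and request structure are assumptions for illustration, not VoxServe's API):

```python
def pick_next(requests, mode):
    """Toy scheduler illustrating the two objectives: in online mode,
    prefer the oldest request that has not yet produced its first audio
    chunk (driving down time-to-first-audio); in offline mode, pick the
    request with the most pending work to keep the GPU saturated for
    end-to-end throughput. Each request is a dict with 'arrival',
    'emitted_first_audio', and 'pending' keys."""
    if mode == "online":
        waiting = [r for r in requests if not r["emitted_first_audio"]]
        if waiting:
            return min(waiting, key=lambda r: r["arrival"])
        return min(requests, key=lambda r: r["arrival"])
    # Offline: no latency deadline, so maximize work per model call.
    return max(requests, key=lambda r: r["pending"])
```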
SpeechLMs pose unique deployment challenges: you need to run a language model + audio detokenizer in concert with careful scheduling, stream audio in real time, and support very different model architectures. VoxServe unifies all these under a consistent abstraction while
🎙️ Introducing VoxServe — a high-throughput, low-latency serving system built for Speech Language Models (TTS, STS, etc.), natively handling audio detokenization + streaming with performance as the core goal. (1/4) 🔗 blog post: https://t.co/vPEwN8Q5XQ 💻 code:
We are opening a new blog series at @ACMSIGOPS Blog to discuss Systems Research in the era of disruptive AI. If you'd like to share thoughts, viewpoints, and stories, please consider contributing an article! My hope is that, through the exposure and discussion, we can help
Read the full post and join the conversation! 👉 https://t.co/ckLum4Pr8O Together with Mike Liang, @FrancisYan_, @tianyin_xu, Lidong Zhou
Hayroll wraps C2Rust so that C macros are translated into Rust `macro_rules` or functions. See the image for an example. Hayroll is designed to wrap any C-to-Rust translation tool, but we have not yet tested that capability. You can find Hayroll at https://t.co/pYH6QKiWgC. Please
C to Rust translators like C2Rust do not handle C macros. @hrhrpeng built a tool called Hayroll that tackles this problem.
Excited to share the CFP for the inaugural MLSys industry track. Timeline and format are the same as the main track, but industry-track papers focus on the design and/or evaluation of real-world systems. Novelty is not a requirement. The deadline is October 30, 2025: