Alex Chen @itisalex3 X Profile

Alex Chen

@itisalex3

Followers: 7

Following: 7

Media: 8

Statuses: 11

CS Undergrad Researcher (AI/ML) @ UCLA

Joined October 2025


Alex Chen

@itisalex3

2 months

What happens when we compress the KV cache of prompts with multiple instructions? 🤔 Existing compression methods can lead to some instructions being ignored. 🙀 We propose simple changes to KV cache eviction that fix this problem alongside other pitfalls to be aware of. 💯

2

16

Shufan (Jack) Li

@li78658171

3 days

(1/8) We introduce Sparse-LaViDa, a new framework that accelerates the inference speed of unified multi-modal diffusion language models via a novel sparse parameterization. It achieves up to 2.8x speed up on tasks including image generation, editing, and visual math reasoning.

11

90

534

Daniel Israel

@danielmisrael

2 months

"An hour of planning can save you 10 hours of doing." ✨📝 Planned Diffusion 📝 ✨ makes a plan before parallel dLLM generation. Planned Diffusion runs 1.2-1.8× faster than autoregressive and an order of magnitude faster than diffusion, while staying within 0.9–5% of AR quality.

7

47

316

Alex Chen

@itisalex3

2 months

Read the paper for more details! Done in collaboration with an awesome team: @danielmisrael, @renatogeh, and advisors @guyvdb and @adityagrover_ Project website: https://t.co/Ax42eQdOnC Paper: https://t.co/fwMpCEy088 GitHub: github.com/Itisalex2/pitfalls-of-kv-cache-compression (repository for the paper: https://arxiv.org/abs/2510.00231)

0

1

Alex Chen

@itisalex3

2 months

We propose Fair Eviction Policies: forcing each instruction to lose KV entries at equal rates. Like whitelisting, fair eviction lessens the degradation of the defense at only a small cost to directive following.

1

0
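A minimal sketch of what "losing KV entries at equal rates" could look like, not the implementation from the paper. It assumes each cached token already has an importance score from some eviction method and a label saying which instruction it belongs to; the names `fair_evict`, `scores`, `instruction_ids`, and `keep_ratio` are hypothetical.

```python
import math

def fair_evict(scores, instruction_ids, keep_ratio):
    """Keep the same fraction of KV entries for every instruction.

    scores: per-token importance (higher means more worth keeping).
    instruction_ids: label of the instruction each token belongs to.
    keep_ratio: fraction of entries to keep within each instruction.
    Returns the sorted token positions that stay in the cache.
    """
    # Group token positions by the instruction they belong to.
    groups = {}
    for idx, label in enumerate(instruction_ids):
        groups.setdefault(label, []).append(idx)

    kept = []
    for indices in groups.values():
        # Every instruction gets a budget proportional to its own length.
        budget = max(1, math.ceil(keep_ratio * len(indices)))
        ranked = sorted(indices, key=lambda i: scores[i], reverse=True)
        kept.extend(ranked[:budget])
    return sorted(kept)

# Toy example: four "defense" tokens followed by four "directive" tokens.
scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.05]
instruction_ids = ["defense"] * 4 + ["directive"] * 4
print(fair_evict(scores, instruction_ids, keep_ratio=0.5))  # [0, 1, 5, 6]
```

In this toy case a single global top-k over the same scores would keep only positions 0 to 3 and drop the second instruction entirely; the per-instruction budget is what prevents that.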

Alex Chen

@itisalex3

2 months

Evicting the wrong tokens can play a critical role in degradation. Whitelisting key defensive phrases dramatically reduces leakage with almost no cost to directive following.

1

0
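As a rough illustration of whitelisting, the sketch below protects a set of token positions from eviction and fills the rest of the budget by score. It is an assumption-laden toy, not the paper's code: `evict_with_whitelist`, `scores`, `whitelist`, and `keep_ratio` are hypothetical names, and how the whitelisted positions are chosen (for example, the tokens of a defensive phrase) is left outside the sketch.

```python
import math

def evict_with_whitelist(scores, whitelist, keep_ratio):
    """Keep whitelisted positions unconditionally, then the top-scoring rest.

    scores: per-token importance (higher means more worth keeping).
    whitelist: token positions that must never be evicted.
    keep_ratio: fraction of all entries to keep overall.
    Returns the sorted token positions that stay in the cache.
    """
    n = len(scores)
    protected = set(whitelist)
    # The overall budget never drops below the number of protected tokens.
    budget = max(len(protected), math.ceil(keep_ratio * n))

    # Rank only the unprotected positions by score.
    rest = sorted(
        (i for i in range(n) if i not in protected),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = protected | set(rest[: budget - len(protected)])
    return sorted(kept)

# Toy example: positions 4 and 7 stand in for a low-scoring defensive phrase.
scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.05]
print(evict_with_whitelist(scores, whitelist=[4, 7], keep_ratio=0.5))  # [0, 1, 4, 7]
```

The total budget is unchanged, so every protected token displaces one unprotected token that would otherwise have been kept.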

Alex Chen

@itisalex3

2 months

Interestingly, swapping the order of the defense and the directive in the system prompt (defense first and directive second, or vice versa) radically changes how directive following and leakage degrade under eviction.

1

0

Alex Chen

@itisalex3

2 months

We use system prompt leakage as a multi-instruction case study. System prompts consist of a) defense and b) system directive. Defense prevents prompt leakage. Directive contains instructions to answer user queries. We see that leakage occurs before directive following degrades.

1

0

Alex Chen

@itisalex3

2 months

Degradation rates also depend heavily on the KV cache compression method and model. Different methods (StreamingLLM, H2O, SnapKV, K-norm, TOVA) produce completely different failure modes even at the same ratio.

1

0

Alex Chen

@itisalex3

2 months

On the IFEval dataset, different instructions can degrade at different rates. We argue that this is driven by 1) hardness and 2) eviction bias, where eviction policies can disproportionately evict entries of certain instructions when compressing multi-instruction prompts.

1

0
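One way to make eviction bias concrete is to measure, per instruction, the fraction of its KV entries that an eviction policy drops. The sketch below does this for a plain global top-k policy on made-up scores. It is only an illustration under stated assumptions: the function names, the scores, and the instruction labels "A" and "B" are hypothetical and not taken from the paper.

```python
import math

def global_topk_keep(scores, keep_ratio):
    """Keep the top-scoring entries across the whole prompt."""
    budget = max(1, math.ceil(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:budget])

def eviction_rate_per_instruction(kept, instruction_ids):
    """Fraction of each instruction's entries that were evicted."""
    totals, evicted = {}, {}
    for idx, label in enumerate(instruction_ids):
        totals[label] = totals.get(label, 0) + 1
        if idx not in kept:
            evicted[label] = evicted.get(label, 0) + 1
    return {label: evicted.get(label, 0) / totals[label] for label in totals}

# Toy prompt with two instructions of four tokens each.
scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.05]
instruction_ids = ["A"] * 4 + ["B"] * 4

kept = global_topk_keep(scores, keep_ratio=0.5)
print(eviction_rate_per_instruction(kept, instruction_ids))
# {'A': 0.0, 'B': 1.0}
```

Here the overall compression ratio is 50%, yet one instruction loses nothing and the other loses everything, which is the kind of imbalance the tweet calls eviction bias.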

Alex Chen

@itisalex3

2 months

KV cache compression promises memory savings, lower latency, and higher throughput at a negligible performance cost. We argue that this performance cost is poorly understood. KV cache diagram from: https://t.co/HEM98oIsO9

1

0