Alex Chen Profile
Alex Chen

@itisalex3

Followers: 7
Following: 7
Media: 8
Statuses: 11

CS Undergrad Researcher (AI/ML) @ UCLA

Joined October 2025
@itisalex3
Alex Chen
2 months
What happens when we compress the KV cache of prompts with multiple instructions? 🤔 Existing compression methods can lead to some instructions being ignored. 🙀 We propose simple changes to KV cache eviction that fix this problem, and we highlight other pitfalls to be aware of. 💯
2
2
16
@li78658171
Shufan (Jack) Li
3 days
(1/8) We introduce Sparse-LaViDa, a new framework that accelerates inference for unified multi-modal diffusion language models via a novel sparse parameterization. It achieves up to a 2.8x speedup on tasks including image generation, editing, and visual math reasoning.
11
90
534
@danielmisrael
Daniel Israel
2 months
"An hour of planning can save you 10 hours of doing." ✨📝 Planned Diffusion 📝 ✨ makes a plan before parallel dLLM generation. Planned Diffusion runs 1.2–1.8× faster than autoregressive and an order of magnitude faster than diffusion, while staying within 0.9–5% of AR quality.
7
47
316
@itisalex3
Alex Chen
2 months
Read the paper for more details! Done in collaboration with an awesome team: @danielmisrael, @renatogeh, and advisors @guyvdb and @adityagrover_. Project website: https://t.co/Ax42eQdOnC Paper: https://t.co/fwMpCEy088 GitHub:
github.com
Repository for the paper: https://arxiv.org/abs/2510.00231 - Itisalex2/pitfalls-of-kv-cache-compression
0
0
1
@itisalex3
Alex Chen
2 months
We propose Fair Eviction Policies: forcing each instruction to lose KV entries at equal rates. Like whitelisting, fair eviction lessens the degradation of the defense at only a small cost to directive following.
1
0
0
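The fair eviction idea in the post above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes each cached token already has an importance score from some compression method and an instruction (span) label, and the names `fair_evict`, `scores`, `span_ids`, and `keep_ratio` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def fair_evict(
    scores: Sequence[float], span_ids: Sequence[int], keep_ratio: float
) -> List[int]:
    """Return sorted indices of KV entries to keep.

    Hypothetical sketch: every instruction span keeps the same fraction
    of its entries, so no span is evicted faster than another.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if len(scores) != len(span_ids):
        raise ValueError("scores and span_ids must have the same length")

    # Group token indices by the instruction span they belong to.
    by_span: Dict[int, List[int]] = {}
    for idx, span in enumerate(span_ids):
        by_span.setdefault(span, []).append(idx)

    kept: List[int] = []
    for indices in by_span.values():
        # Same keep ratio per span; always keep at least one entry.
        budget = max(1, math.ceil(keep_ratio * len(indices)))
        ranked = sorted(indices, key=lambda i: scores[i], reverse=True)
        kept.extend(ranked[:budget])
    return sorted(kept)


# Two instructions of four tokens each; the second has uniformly low scores.
scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.05, 0.15]
span_ids = [0, 0, 0, 0, 1, 1, 1, 1]
kept = fair_evict(scores, span_ids, keep_ratio=0.5)
```

In this toy input, a single global top-4 selection would keep only the first instruction's tokens, whereas the per-span budget keeps two entries from each instruction.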
@itisalex3
Alex Chen
2 months
Evicting the wrong tokens can play a critical role in degradation. Whitelisting key defensive phrases dramatically reduces leakage with almost no cost to directive following.
1
0
0
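Whitelisting as described in the post above can be sketched as score-based eviction with a protected set. This is an assumption-laden illustration rather than the authors' code: `evict_with_whitelist`, `scores`, `whitelist`, and `budget` are hypothetical names, and the scores stand in for whatever importance measure the compression method uses.

```python
from typing import Iterable, List, Sequence


def evict_with_whitelist(
    scores: Sequence[float], whitelist: Iterable[int], budget: int
) -> List[int]:
    """Return sorted indices of KV entries to keep.

    Hypothetical sketch: whitelisted positions (for example, the tokens of
    a defensive phrase) are never evicted; the remaining budget is filled
    with the highest-scoring non-whitelisted positions.
    """
    protected = sorted(set(whitelist))
    if any(i < 0 or i >= len(scores) for i in protected):
        raise ValueError("whitelist index out of range")
    if len(protected) > budget:
        raise ValueError("budget is smaller than the whitelist")

    protected_set = set(protected)
    candidates = [i for i in range(len(scores)) if i not in protected_set]
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    remaining = budget - len(protected)
    return sorted(protected + ranked[:remaining])


# Positions 1 and 2 hold a low-scoring defensive phrase that must survive.
scores = [0.9, 0.1, 0.05, 0.8, 0.7, 0.6]
kept = evict_with_whitelist(scores, whitelist=[1, 2], budget=4)
```

Without the whitelist, a top-4 selection on these scores would drop positions 1 and 2 entirely.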
@itisalex3
Alex Chen
2 months
Interestingly, changing the order of the defense and directive in the system prompt (defense first and directive second, or the reverse) radically changes the eviction and degradation patterns of both directive following and leakage.
1
0
0
@itisalex3
Alex Chen
2 months
We use system prompt leakage as a multi-instruction case study. System prompts consist of a) a defense and b) a system directive. The defense prevents prompt leakage. The directive contains instructions to answer user queries. We see that leakage occurs before directive following degrades.
1
0
0
@itisalex3
Alex Chen
2 months
Degradation rates also depend heavily on the KV cache compression method and model. Different methods (StreamingLLM, H2O, SnapKV, K-norm, TOVA) produce completely different failure modes even at the same ratio.
1
0
0
@itisalex3
Alex Chen
2 months
On the IFEval dataset, different instructions can degrade at different rates. We argue that this is driven by 1) hardness and 2) eviction bias, where eviction policies can disproportionately evict entries of certain instructions when compressing multi-instruction prompts.
1
0
0
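One way to make the eviction bias in the post above concrete is to measure, per instruction, the fraction of its KV entries that a policy evicted. The helper below is a hypothetical sketch (the names `eviction_rate_per_span`, `span_ids`, and `kept` are assumptions, not from the paper).

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def eviction_rate_per_span(
    span_ids: Sequence[int], kept: Iterable[int]
) -> Dict[int, float]:
    """Fraction of each instruction span's entries that were evicted."""
    kept_set = set(kept)
    total = Counter(span_ids)
    evicted = Counter(
        span for idx, span in enumerate(span_ids) if idx not in kept_set
    )
    # Counter returns 0 for spans that lost no entries.
    return {span: evicted[span] / total[span] for span in total}


# Two instructions of four tokens each; the policy kept indices 0, 1, 2, 5.
span_ids = [0, 0, 0, 0, 1, 1, 1, 1]
rates = eviction_rate_per_span(span_ids, kept=[0, 1, 2, 5])
```

Here the overall compression ratio is 50%, but the first instruction loses a quarter of its entries while the second loses three quarters, which is the kind of imbalance the post calls eviction bias.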
@itisalex3
Alex Chen
2 months
KV cache compression promises memory savings, lower latency, and higher throughput at a negligible performance cost. We argue that this performance cost is poorly understood. KV cache diagram from: https://t.co/HEM98oIsO9
1
0
0