Piotr Nawrot Profile
Piotr Nawrot

@p_nawrot

Followers: 8K · Following: 951 · Media: 32 · Statuses: 391

LLM Efficiency PhD @ Edinburgh | 🥇🥈 @ Flunkyball Polish Championships | 🥇 @ Jerry Hunter Pub's Bowling Tournament | 50000 🏆 & Legendary II @ Brawl Stars

Warsaw
Joined July 2014
@p_nawrot
Piotr Nawrot
4 months
Sparse attention is one of the most promising strategies to unlock long-context processing and long generation reasoning in LLMs. We performed the most comprehensive study on training-free sparse attention to date. Here is what we found:
[image]
11 replies · 112 retweets · 649 likes
@p_nawrot
Piotr Nawrot
10 hours
RT @fchollet: I'll take the other side of this bet.
0 replies · 76 retweets · 0 likes
@p_nawrot
Piotr Nawrot
15 days
RT @NiJinjie: Token crisis: solved ✅. We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch — up to…
0 replies · 238 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @2prime_PKU: Anyone knows adam?
[image]
0 replies · 463 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @s_scardapane: *The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs* by @p_nawrot @PontiEdoardo @cheeesio @seb_ruder. T…
0 replies · 28 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
+1.
@PMinervini
Pasquale Minervini
1 month
not sure this is a good idea -- dog puppies need 𝐚 𝐥𝐨𝐭 of sleep and being handled by many strangers can be stressful (this is also why puppy yoga is banned in some countries on animal welfare grounds)
[image]
0 replies · 0 retweets · 7 likes
@p_nawrot
Piotr Nawrot
1 month
RT @PontiEdoardo: Thanks for acknowledging Dynamic Token Pooling as a predecessor to H-Net, @_albertgu! We had some decent ideas in that p…
0 replies · 10 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @PontiEdoardo: If you are at @icmlconf make sure to attend @AdrianLancucki’s invited talk on our inference-time *hyper*-scaling paper (a…
0 replies · 2 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @tokshop2025: The TokShop schedule is now live! Join us at #ICML2025 for invited talks, poster sessions, and a panel on the future of to…
0 replies · 3 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @sukjun_hwang: Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical netw…
0 replies · 741 retweets · 0 likes
@p_nawrot
Piotr Nawrot
1 month
RT @_albertgu: Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn…
0 replies · 190 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @lucalp__: The Bitter Lesson is coming for Tokenization. The Byte Latent Transformer (BLT) showed the possibility of finding additional…
0 replies · 6 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @ori_press: Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize…
0 replies · 60 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @BeidiChen: This is cool!!!
0 replies · 3 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @iofu728: A very good abstraction of sparse attention in vLLM!🥳
0 replies · 1 retweet · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @vllm_project: glad to see how researchers explore the flexibility of vLLM while still enjoying the performance benefit😁
0 replies · 4 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
Links:
🔬 sparse-frontier:
📚 nano-sparse-attention:
👨‍🏫 NeurIPS 2024 Dynamic sparsity tutorial:
📄 Paper:
0 replies · 4 retweets · 22 likes
@p_nawrot
Piotr Nawrot
2 months
We built sparse-frontier — a clean abstraction that lets you focus on your custom sparse attention implementation while automatically inheriting vLLM’s optimizations and model support. As a PhD student, I've learned that sometimes the bottleneck in research isn't ideas — it's…
[image]
9 replies · 52 retweets · 320 likes
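For context on what a "custom sparse attention implementation" can look like, here is a minimal illustrative sketch of one common training-free pattern: a causal local window plus a few global "sink" tokens. This is not the sparse-frontier or vLLM API; the function names and the specific pattern are assumptions made for the example.

```python
# Minimal sketch (assumed pattern, not the sparse-frontier API) of training-free
# sparse attention: causal local sliding window + a few always-visible "sink" tokens.
import torch
import torch.nn.functional as F


def local_plus_sink_mask(seq_len: int, window: int = 128, num_sinks: int = 4) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where a query may attend to a key."""
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]          # key j must not be in the future of query i
    local = (idx[:, None] - idx[None, :]) < window  # key j within `window` positions of query i
    sink = idx[None, :] < num_sinks                 # first `num_sinks` keys are always visible
    return causal & (local | sink)


def sparse_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """Attention over (seq, dim) tensors with disallowed positions masked out."""
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v


if __name__ == "__main__":
    seq_len, dim = 1024, 64
    q, k, v = (torch.randn(seq_len, dim) for _ in range(3))
    mask = local_plus_sink_mask(seq_len)
    out = sparse_attention(q, k, v, mask)
    print(out.shape, f"mask density = {mask.float().mean().item():.3f}")
```

Running the sketch prints the output shape and the fraction of attended positions (the mask density), which is what a training-free method trades against accuracy.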
@p_nawrot
Piotr Nawrot
2 months
RT @Marktechpost: NVIDIA Researchers Introduce Dynamic Memory Sparsification (DMS) for 8× KV Cache Compression in Transformer LLMs. As the…
0 replies · 13 retweets · 0 likes
@p_nawrot
Piotr Nawrot
2 months
RT @PontiEdoardo: Last week marked the end of my stay as a visiting professor at @nvidia. During my time there, I became passionate about…
0 replies · 4 retweets · 0 likes
@p_nawrot
Piotr Nawrot
3 months
RT @_akhaliq: Nvidia presents Inference-Time Hyper-Scaling with KV Cache Compression
[image]
0 replies · 58 retweets · 0 likes