Simon Jegou Profile
Simon Jegou

@simon_jegou

Followers
559
Following
372
Media
23
Statuses
181

Senior LLM Technologist @NVIDIA Views and opinions are my own

Joined June 2016
Don't wanna be here? Send us removal request.
@simon_jegou
Simon Jegou
18 hours
RT @devoto_alessio: 🏆 Our @nvidia KV Cache Compression Leaderboard is now live! . Compare state-of-the-art compression methods side-by-sid….
0
36
0
@simon_jegou
Simon Jegou
3 days
RT @igtmn: We've released a series of OpenReasoning-Nemotron models (1.5B, 7B, 14B and 32B) that set new SOTA on a wide range of reasoning….
0
47
0
@simon_jegou
Simon Jegou
2 months
More than one point is needed to draw a Pareto frontier 😅
Tweet media one
1
5
41
@simon_jegou
Simon Jegou
3 months
RT @gonedarragh: AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset.abs: https….
0
15
0
@simon_jegou
Simon Jegou
4 months
* 1 min on M3 Pro without any parallelization on the 2^19 states.
0
0
0
@simon_jegou
Simon Jegou
4 months
Common 🇫🇷 variant where four of a kind is 40points:
Tweet card summary image
huggingface.co
1
0
0
@simon_jegou
Simon Jegou
4 months
Implementation is based on the excellent paper "An Optimal Strategy for Yahtzee" by @DrGlennNFA (2006): (2/2).
1
0
0
@simon_jegou
Simon Jegou
4 months
🎲 Did you know Yahtzee can be solved optimally in less than 100 lines of Python and under 5 minutes with 2 vCPU? I built a @Gradio app so you can try it yourself: (1/2).
huggingface.co
3
6
24
@simon_jegou
Simon Jegou
5 months
RT @a_erdem4: logits-processor-zoo: New version (0.1.3) is out 🎉. You can now tune the thinking duration of R1 models and force them to sta….
0
7
0
@simon_jegou
Simon Jegou
6 months
RT @NVIDIAAIDev: The growing context windows of LLMs unlock new possibilities but pose significant memory challenges with linearly scaling….
0
45
0
@simon_jegou
Simon Jegou
6 months
RT @giffmana: This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be D….
0
198
0
@simon_jegou
Simon Jegou
6 months
RT @pcuenq: Great post by nvidia (published in Hugging Face ❤️) about KVPress, a library to compress transformers KV Cache and thus reduce….
0
32
0
@simon_jegou
Simon Jegou
6 months
-> Repo: (⭐️ me) -> Blog post: (🔺 me).-> Space: (❤️ me). (2/2).
huggingface.co
0
1
2
@simon_jegou
Simon Jegou
6 months
Thanks a lot @ariG23498 for your contributions in this release !.
0
0
1
@simon_jegou
Simon Jegou
6 months
Fresh news from kvpress, our open source library for KV cache compression 🔥. 1. We published a blog post with @huggingface.2. We published a Space for you to try it.3. Following feedback from the research community, we added a bunch of presses and benchmarks. Links👇(1/2)
Tweet media one
2
8
45
@simon_jegou
Simon Jegou
6 months
RT @ariG23498: ⚠️1 million tokens on Llama 3-70B running in bfloat16 precision would take ~ 470 GB of memory. The KV cache alone accounts….
0
4
0
@simon_jegou
Simon Jegou
7 months
RT @ariG23498: Controlling text generation has always been a problem. What if I wanted my LLM to always end with a particular phrase, or wa….
0
32
0
@simon_jegou
Simon Jegou
7 months
New version of the notebook using the sinkhorn operator for a fully differentiable relaxation:
kaggle.com
Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources
0
1
5
@simon_jegou
Simon Jegou
8 months
How do you find the permutation of words that minimize their perplexity as measured by an LLM ? In this year @kaggle Santa, I shared an approach to move to a continuous space where you can use gradient-descent using REINFORCE:
kaggle.com
Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources
1
3
15
@simon_jegou
Simon Jegou
8 months
Thanks @MichaelHassid for your contribution to kvpress, we hope other researchers will follow your footsteps 🔥.
@MichaelHassid
Michael Hassid
8 months
Great KV compression repo from @nvidia and @simon_jegou. Our TOVA policy from "Transformers are Multi-State RNNs" ( was just added there.
0
1
3