max

@maxencelsb

Followers: 31 · Following: 420 · Media: 14 · Statuses: 78

💻 GitHub: https://t.co/hsxfYxPd8j 🤗 HF: https://t.co/HTLvRfgnoR

Paris, France
Joined September 2024
@maximelabonne
Maxime Labonne
1 month
📚 Efficient Language Specialization for Small Language Models @maxencelsb and @SinoueG have released a preprint about their excellent work on fine-tuning small models in French. It shows a solid post-training pipeline to improve French performance while preserving English
5
17
121
@LiquidAI_
Liquid AI
3 months
Meet Luth-LFM2, a French fine-tuned LFM2 instance designed by Maxence Lasbordes and Sinoué GAD to enhance the multilingual capabilities of LFM2! In this model class, Luth-LFM2 sets a new record in French instruction following, GPQA, MMLU, and math.
19
31
129
@maximelabonne
Maxime Labonne
3 months
Really impressed by the French finetune of LFM2 made by two students. They created a solid post-training pipeline (FFT + merging) and open-sourced all the code and data. Amazing work by Sinoué Gad and Maxence Lasbordes!
8
25
214
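The post-training pipeline above is described only as "FFT + merging"; the exact merging recipe isn't shown in this thread. As a rough illustration, here is a minimal sketch of one common merging technique, linear weight averaging of two checkpoints' state dicts in PyTorch. The toy models and the 50/50 ratio are assumptions, not details from the release.

```python
# Minimal sketch of linear weight merging (one common "merging" step after full
# fine-tuning). The toy models and the 50/50 ratio are illustrative only.
import torch

def linear_merge(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    """Return a state dict whose tensors are alpha * A + (1 - alpha) * B."""
    merged = {}
    for name, tensor_a in state_a.items():
        tensor_b = state_b[name]
        if tensor_a.dtype.is_floating_point:
            merged[name] = alpha * tensor_a + (1.0 - alpha) * tensor_b
        else:
            # Integer buffers (e.g. token/position ids) are copied from the first model.
            merged[name] = tensor_a.clone()
    return merged

if __name__ == "__main__":
    # Toy example with two tiny "models" sharing the same architecture.
    a = torch.nn.Linear(4, 4).state_dict()
    b = torch.nn.Linear(4, 4).state_dict()
    merged = linear_merge(a, b, alpha=0.5)
    print({k: v.shape for k, v in merged.items()})
```

Other merging schemes (SLERP, task arithmetic, TIES) follow the same pattern of combining parameters tensor by tensor, just with different combination rules.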
@Alibaba_Qwen
Qwen
3 months
🚀 Introducing Qwen3-4B-Instruct-2507 & Qwen3-4B-Thinking-2507 — smarter, sharper, and 256K-ready! 🔹 Instruct: Boosted general skills, multilingual coverage, and long-context instruction following. 🔹 Thinking: Advanced reasoning in logic, math, science & code — built for
143
401
3K
@maxencelsb
max
5 months
GitHub: https://t.co/lcHrCTB0iz Models: https://t.co/tljsIlS9M3 Dataset: https://t.co/Q692DDrEy9 Feel free to leave a star ⭐!
huggingface.co
0
0
0
@maxencelsb
max
5 months
✨ Sharing my most recent side project: LeCarnet, a synthetic dataset of 2M+ French children's stories generated with Mistral Large, inspired by TinyStories. Implemented the data generation, training, and eval pipelines. Also trained 3 SLMs on the dataset: LeCarnet-3M/8M/21M.
1
0
3
@raphaelsrty
Raphaël Sourty
5 months
I'm thrilled to announce the release of FastPlaid! 🚀🚀 FastPlaid is a high-performance engine for multi-vector search, built from the ground up in Rust (with the help of Torch C++) ⚡️ You can view FastPlaid as the counterpart of Faiss for multi-vectors.
10
40
250
@antoine_chaffin
Antoine Chaffin
7 months
Among all those LLM releases, here is an important retrieval release: to overcome limitations of awesome ModernBERT-based dense models, today @LightOnIO is releasing GTE-ModernColBERT, the very first state-of-the-art late-interaction (multi-vector) model trained using PyLate 🚀
9
59
255
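For context on the late-interaction (multi-vector) approach mentioned above: instead of one embedding per text, the model keeps one embedding per token and scores a query-document pair with the MaxSim rule. Below is a minimal sketch of that scoring rule in plain PyTorch; the tensor shapes and toy data are illustrative, and PyLate's actual API is not shown here.

```python
# Minimal sketch of ColBERT-style late-interaction (MaxSim) scoring.
# Shapes and toy tensors are illustrative; PyLate wraps this kind of logic.
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """
    query_emb: (num_query_tokens, dim), L2-normalized token embeddings
    doc_emb:   (num_doc_tokens, dim),   L2-normalized token embeddings
    Score = sum over query tokens of the max similarity to any document token.
    """
    sim = query_emb @ doc_emb.T          # (num_query_tokens, num_doc_tokens)
    return sim.max(dim=1).values.sum()   # MaxSim per query token, then sum

if __name__ == "__main__":
    torch.manual_seed(0)
    q = torch.nn.functional.normalize(torch.randn(8, 128), dim=-1)
    d = torch.nn.functional.normalize(torch.randn(100, 128), dim=-1)
    print(maxsim_score(q, d))
```

Each query token picks its best-matching document token, which is what lets these models capture fine-grained term-level matches that single-vector models compress away.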
@sama
Sam Altman
8 months
TL;DR: we are excited to release a powerful new open-weight language model with reasoning in the coming months, and we want to talk to devs about how to make it maximally useful: https://t.co/XKB4XxjREV we are excited to make this a very, very good model! __ we are planning to
openai.com
We’re planning to release our first open language model since GPT‑2 in the coming months. We’re excited to collaborate with developers, researchers, and the broader community to gather inputs and...
1K
1K
13K
@maxencelsb
max
8 months
Read the PPO paper. The idea of the clipping mechanism is so smart
0
0
0
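The clipping mechanism praised above is the clipped surrogate objective from the PPO paper: the probability ratio between the new and old policy is clipped to [1 − ε, 1 + ε], which caps how much a single update can move the policy. Here is a minimal sketch in PyTorch; the batch size and ε = 0.2 are assumptions.

```python
# Minimal sketch of PPO's clipped surrogate objective (to be maximized):
# L = E[min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t)], with r_t = pi_new / pi_old.
import torch

def ppo_clip_loss(logp_new: torch.Tensor,
                  logp_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    ratio = torch.exp(logp_new - logp_old)                                  # r_t
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Negate so that minimizing this loss maximizes the clipped objective.
    return -torch.min(unclipped, clipped).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    logp_old = torch.randn(64)
    logp_new = logp_old + 0.1 * torch.randn(64)
    adv = torch.randn(64)
    print(ppo_clip_loss(logp_new, logp_old, adv))
```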
@maxencelsb
max
8 months
Grok one-shots everything, it's crazy
0
0
0
@maxencelsb
max
8 months
Read the TRGPPO paper today, which addresses the exploration issue of PPO under poor policy initialization https://t.co/k8rWLrMY5u
0
0
0
@maxencelsb
max
8 months
Read the CutMix paper for uni. Really smart data augmentation technique https://t.co/C1MI5UjkNF
0
0
0
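For reference, CutMix cuts a random box out of one image, pastes in the corresponding region from another image in the batch, and mixes the two labels in proportion to the pasted area. A minimal sketch in PyTorch/NumPy follows; the Beta(α, α) sampling and one-hot labels are assumptions about a typical setup, not code from the paper.

```python
# Minimal sketch of CutMix: paste a random box from a shuffled batch into the
# original batch and mix labels by the box's area fraction. Shapes are illustrative.
import numpy as np
import torch

def cutmix(images: torch.Tensor, labels_onehot: torch.Tensor, alpha: float = 1.0):
    """images: (B, C, H, W), labels_onehot: (B, num_classes)."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    _, _, h, w = images.shape

    # Box size chosen so its area fraction is roughly (1 - lam).
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)

    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Adjust lambda to the exact pasted area (boxes can be clipped at the border).
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    mixed_labels = lam_adj * labels_onehot + (1.0 - lam_adj) * labels_onehot[perm]
    return mixed, mixed_labels
```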
@maxencelsb
max
8 months
Read the Grad-CAM paper. Uses gradients to generate heatmaps, highlighting important regions in images for CNN decisions. https://t.co/iIVN9ULK2p
0
0
0
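A minimal sketch of the Grad-CAM recipe summarized above: capture the last convolutional block's feature maps, backprop the target-class score, average the gradients per channel to get weights, and combine the weighted feature maps with a ReLU into a heatmap. The choice of ResNet-18's layer4 and the random input are assumptions for illustration.

```python
# Minimal sketch of Grad-CAM: weight each feature map of the last conv block by the
# pooled gradient of the target class, sum the maps, and apply ReLU to get a heatmap.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()   # random weights: illustration only
feats = {}

def fwd_hook(module, inputs, output):
    feats["maps"] = output               # keep the graph; do not detach

# layer4 is the last convolutional block of ResNet-18 (a choice assumed here).
model.layer4.register_forward_hook(fwd_hook)

x = torch.randn(1, 3, 224, 224)
logits = model(x)
score = logits[0, logits.argmax(dim=1).item()]

# Gradients of the target-class score w.r.t. the captured feature maps.
grads = torch.autograd.grad(score, feats["maps"])[0]          # (1, C, 7, 7)

# Channel weights = gradients globally average-pooled over the spatial dims.
weights = grads.mean(dim=(2, 3), keepdim=True)                # (1, C, 1, 1)
cam = F.relu((weights * feats["maps"]).sum(dim=1))            # (1, 7, 7)
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:],
                    mode="bilinear", align_corners=False).squeeze()
print(cam.shape)  # heatmap upsampled to the input resolution
```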
@maxencelsb
max
8 months
My latest project: FlashAttention-2 in Triton for Sliding Window Attention. - Forward & Backward pass - Sliding Window/Causal/Global Attention - 2-10x TFLOPs/s increase compared to standard PyTorch attention - Tiled matmul optimization - Online softmax https://t.co/XupyAfHzJX
github.com
FlashAttention for sliding window attention in Triton (fwd + bwd pass) - MaxLSB/flash-attn2
0
0
1
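One ingredient listed above, online softmax, can be illustrated outside of Triton: the scores are scanned block by block while a running max and a rescaled running sum are maintained, so the full row of attention scores never has to be materialized at once. A minimal NumPy sketch follows (pedagogical only, not the kernel from the repo; real kernels fuse the normalization with the value accumulation instead of storing blocks).

```python
# Minimal sketch of the online softmax used in FlashAttention-style kernels:
# scan the scores block by block, keeping a running max and a rescaled running sum.
import numpy as np

def online_softmax(scores: np.ndarray, block_size: int = 128) -> np.ndarray:
    running_max = -np.inf
    running_sum = 0.0
    blocks = []
    for start in range(0, scores.shape[0], block_size):
        block = scores[start:start + block_size]
        new_max = max(running_max, block.max())
        # Rescale the old sum to the new max before adding this block's contribution.
        running_sum = running_sum * np.exp(running_max - new_max) \
            + np.exp(block - new_max).sum()
        running_max = new_max
        blocks.append(block)
    # Second pass: normalize with the final max and sum (stored blocks are only
    # needed in this toy version).
    return np.concatenate([np.exp(b - running_max) for b in blocks]) / running_sum

if __name__ == "__main__":
    x = np.random.randn(1000)
    ref = np.exp(x - x.max()) / np.exp(x - x.max()).sum()
    print(np.allclose(online_softmax(x), ref))  # True
```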
@maxencelsb
max
9 months
Went to the AI Action Summit today #AIActionSummit
0
0
1
@maxencelsb
max
9 months
The goat dropped a new banger: https://t.co/WO0gXDhJXA
0
0
0
@maxencelsb
max
9 months
Currently wondering whether I should learn Triton or CUDA first
0
0
1
@maxencelsb
max
10 months
RMS Norm implementation.
0
0
0
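For reference, RMSNorm normalizes each feature vector by its root mean square and applies a learned gain, with no mean subtraction and no bias (unlike LayerNorm). Here is a minimal PyTorch sketch; the hidden size and eps value are assumptions, not taken from the post.

```python
# Minimal sketch of RMSNorm: x / sqrt(mean(x^2) + eps), scaled by a learned gain.
# Unlike LayerNorm, there is no mean subtraction and no bias term.
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

if __name__ == "__main__":
    norm = RMSNorm(dim=512)
    x = torch.randn(2, 16, 512)
    print(norm(x).shape)  # torch.Size([2, 16, 512])
```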