Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

Followers: 6K
Following: 11K
Media: 208
Statuses: 4K

Evals/dogs @HuggingFace ✨ "The future is already here, it’s just not very evenly distributed" (Gibson)

Joined October 2019
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
20 days
Hey twitter! I'm releasing the LLM Evaluation Guidebook v2! Updated, nicer to read, interactive graphics, etc! https://t.co/xG4VQOj2wN After this, I'm off: I'm taking a sabbatical to go hike with my dogs :D (back @huggingface in Dec *2026*) See you all next year!
22
165
992
@lintool
Jimmy Lin
18 days
👀Introducing a brand new @yupp_ai SVG leaderboard ranking frontier models on the generation of coherent and visually appealing SVGs! Gemini 3 Pro by @GoogleDeepMind takes the crown as the most powerful model! 👏 We’re also releasing a public SVG dataset. Details in🧵
32
68
459
@fchollet
François Chollet
18 days
Either you crack general intelligence -- the ability to efficiently acquire arbitrary skills on your own -- or you don't have AGI. A big pile of task-specific skills memorized from handcrafted/generated environments isn't AGI, no matter how big.
@dwarkesh_sp
Dwarkesh Patel
20 days
New post: Thoughts on AI progress (Dec 2025) 1. What are we scaling?
106
116
1K
@latkins
Lucas Atkins
19 days
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
20 days
Hey twitter! I'm releasing the LLM Evaluation Guidebook v2! Updated, nicer to read, interactive graphics, etc! https://t.co/xG4VQOj2wN After this, I'm off: I'm taking a sabbatical to go hike with my dogs :D (back @huggingface in Dec *2026*) See you all next year!
1
2
22
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
19 days
@huggingface cc @maximelabonne since you wanted an update :P
1
0
11
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
20 days
If you see improvements, I'd love to hear them (within the next 2 days) :) Many thanks to @thibaudfrere for his help on the banner and @gui_penedo for his proofreading! If you've got eval needs, your new PoC is @nathanhabib1011 (with a focus on lighteval)!
1
0
21
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
20 days
The guide is very beginner friendly, as we go from the basics of tokenization/inference to the nits and tricks of running eval properly, so it's compatible with all levels. Should contain most of what we wrote about evals at HF in a single unified place, with updates ofc :)
1
1
29
@eliebakouch
elie
20 days
as a researcher, it makes no sense to compare reasoning vs non reasoning models on benches like the ones in Artificial Analysis without normalizing somehow by cost or output tokens. non reasoning models (base/instruct) are important for the open ecosystem since research teams and
8
8
93
@mervenoyann
merve
20 days
Mistral has delivered super capable small models but no one is talking about it so here I go
32
44
586
@LysandreJik
Lysandre
21 days
Transformers v5's first release candidate is out 🔥 The biggest release of my life. It's been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million. The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing.
20
89
573
@xeophon
Xeophon
24 days
stop looking at HLE (with tools), most of these mean "has web access" the answers to HLE are easily accessible in ungated mirrors (and prob a dozen other places). the only question is why those agents don't score 100%
@ivanfioravanti
Ivan Fioravanti ᯅ
24 days
This 8B beast from NVIDIA is a fine-tuning of Qwen3-8B! 37.1 on Humanity's Last Exam!
8
12
149
@SShmidman
Shaltiel
27 days
Okay, but, wait, what reasoning traces should I train on? Excited to share our latest research paper together with @nvidia: Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces https://t.co/ddktyLoJt7 🧵
arxiv.org
Test-time scaling, which leverages additional computation during inference to improve model accuracy, has enabled a new class of Large Language Models (LLMs) that are able to reason through...
3
8
31
@AdinaYakup
Adina Yakup
27 days
China just passed the U.S. in open model downloads for the first time 👀 New data from Economies of Open Intelligence led by @huggingface policy team & community collaborators, presents some notable observations: ✨ Developer adoption In 2025, Chinese model developers saw
1
26
89
@dlouapre
David Louapre
27 days
Introducing "The Eiffel Tower Llama"!🗼 Remember Golden Gate Claude? Unfortunately Anthropic's viral demo was shut down after 24h, and key technical details remained hidden. So we recreated it, uncovering key insights on steering LLMs using SAEs⚒️ Full blog post + live demo 👇
9
41
175
@RisingSayak
Sayak Paul
27 days
Non-natural image gen and editing are difficult tasks. We tested the state of the art at the time — including Nano Banana 1.0 & GPT-image — all performed quite poorly on StructBench. Nano Banana 2 (NB2) just dropped, and its improvements strongly validate a direction we studied
5
10
78
@zuhaitz_dev
Zuhaitz
27 days
"The most underappreciated legend of the tech industry?" I see posts like this one every day 😭 And, obviously he is a respected professional, but he is far from underappreciated. Check Sophie Wilson. Most people haven't heard about her, but she is the primary architect of the
@vibingmonk
Pallav Joshi
28 days
> created Linux kernel at 21 > built Git because nothing else was good enough > becomes backbone of servers, Android, cloud, supercomputers > never chased fame, money, titles, hype > stays private, consistent, brutally honest for decades > still reviews patches, still
61
512
7K
@abeirami
Ahmad Beirami
28 days
"Professors definitely deserve to have their names on the papers." I think this take is completely wrong. Financial support does not warrant co-authorship. Bob Gallager (a legendary information theorist who retired from MIT) did not co-author any papers with many of his
@PKUWZP
Zhipeng(Jason Z) Wang 🇺🇦
29 days
@shaananc @thegautamkamath I agree with most of your statement. However, there’s no “simply” leading a group or advising PhD students. Those activities require tremendous efforts both intellectually and financially. Not to say that in the US, all of PhD students’ funding comes from professors’ grant money.
16
19
360
@sea_snell
Charlie Snell
28 days
What happened to adding error bars to evals?
@Yuchenj_UW
Yuchen Jin
28 days
Claude Opus 4.5's score on SWE-bench is wild. I like how Anthropic has focused on coding from the beginning. They haven’t released any image or video models. All in the most economically valuable area. Good strategy.
17
31
910
@clefourrier
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
2 months
Know anyone who needs some help to get started with ML, open source and 🤗? We partnered with @TechToTheRescue, a tech for good incubator, & answered all AI questions their non profits had to create an FAQ! https://t.co/psgu8N1cSm Come add your Q/As, it's collaborative! 🔥
2
3
12
@cgeorgiaw
Georgia Channing
28 days
incredibly detailed technical blog just dropped on the anatomy of BoltzGen 🧬 made for ML people, but covering everything from molecular representations to diffusion-based generation of protein binders + crazy good interactive visuals 👏👏 @ludocomito
6
37
233