Clémentine Fourrier 🍊 is off till Dec 2026 hiking @clefourrier X Profile

Followers

6K

Following

11K

Media

208

Statuses

4K

Evals/dogs @HuggingFace ✨ "The future is already here, it’s just not very evenly distributed" (Gibson)

https://t.co/rB0EzLmjri

Joined October 2019

Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

20 days

Hey twitter! I'm releasing the LLM Evaluation Guidebook v2! Updated, nicer to read, interactive graphics, etc! https://t.co/xG4VQOj2wN After this, I'm off: I'm taking a sabbatical to go hike with my dogs :D (back @huggingface in Dec *2026*) See you all next year!

22

165

992

Jimmy Lin

@lintool

18 days

👀Introducing a brand new @yupp_ai SVG leaderboard ranking frontier models on the generation of coherent and visually appealing SVGs! Gemini 3 Pro by @GoogleDeepMind takes the crown as the most powerful model! 👏 We’re also releasing a public SVG dataset. Details in🧵

32

68

459

François Chollet

@fchollet

18 days

Either you crack general intelligence -- the ability to efficiently acquire arbitrary skills on your own -- or you don't have AGI. A big pile of task-specific skills memorized from handcrafted/generated environments isn't AGI, no matter how big.

Dwarkesh Patel

@dwarkesh_sp

20 days

New post: Thoughts on AI progress (Dec 2025) 1. What are we scaling?

106

116

1K

Lucas Atkins

@latkins

19 days

https://t.co/oY5IC2XrUC

Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

19 days

@huggingface cc @maximelabonne since you wanted an update :P

1

0

11

Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

20 days

If you see improvements, I'd love to hear them (within the next 2 days) :) Many thanks to @thibaudfrere for his help on the banner and @gui_penedo for his proofreading! If you've got eval needs, your new PoC is @nathanhabib1011 (with a focus on lighteval)!

1

0

21

Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

20 days

The guide is very beginner friendly, as we go from the basics of tokenization/inference to the nits and tricks of running eval properly, so it's compatible with all levels. Should contain most of what we wrote about evals at HF in a single unified place, with updates ofc :)

1

29

elie

@eliebakouch

20 days

as a researcher, it makes no sense to compare reasoning vs non reasoning models on benches like the ones in Artificial Analysis without normalizing somehow by cost or output tokens. non reasoning models (base/instruct) are important for the open ecosystem since research teams and

8

93

merve

@mervenoyann

20 days

Mistral has delivered super capable small models but no one is talking about it so here I go

32

44

586

Lysandre

@LysandreJik

21 days

Transformers v5's first release candidate is out 🔥 The biggest release of my life. It's been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million. The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing.

20

89

573

Xeophon

@xeophon

24 days

stop looking at HLE (with tools), most of these mean "has web access" the answers to HLE are easily accessible in ungated mirrors (and prob a dozen other places). the only question is why those agents don't score 100%

Ivan Fioravanti ᯅ

@ivanfioravanti

24 days

This 8B beast from NVIDIA is a fine-tuning of Qwen3-8B! 37.1 on Humanity's Last Exam!

8

12

149

Shaltiel

@SShmidman

27 days

Okay, but, wait, what reasoning traces should I train on? Excited to share our latest research paper together with @nvidia: Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces https://t.co/ddktyLoJt7 🧵

arxiv.org

Test-time scaling, which leverages additional computation during inference to improve model accuracy, has enabled a new class of Large Language Models (LLMs) that are able to reason through...

3

8

31

Adina Yakup

@AdinaYakup

27 days

China just passed the U.S. in open model downloads for the first time 👀 New data from Economies of Open Intelligence led by @huggingface policy team & community collaborators, presents some notable observations: ✨ Developer adoption In 2025, Chinese model developers saw

1

26

89

David Louapre

@dlouapre

27 days

Introducing "The Eiffel Tower Llama"!🗼 Remember Golden Gate Claude? Unfortunately Anthropic's viral demo was shut down after 24h, and key technical details remained hidden. So we recreated it, uncovering key insights on steering LLMs using SAEs⚒️ Full blog post + live demo 👇

9

41

175

Sayak Paul

@RisingSayak

27 days

Non-natural image gen and editing are difficult tasks. We tested the state of the art at the time — including Nano Banana 1.0 & GPT-image — all performed quite poorly on StructBench. Nano Banana 2 (NB2) just dropped, and its improvements strongly validate a direction we studied

5

10

78

Zuhaitz

@zuhaitz_dev

27 days

"The most underappreciated legend of the tech industry?" I see posts like this one every day 😭 And, obviously he is a respected professional, but he is far from underappreciated. Check Sophie Wilson. Most people haven't heard about her, but she is the primary architect of the

Pallav Joshi

@vibingmonk

28 days

> created Linux kernel at 21 > built Git because nothing else was good enough > becomes backbone of servers, Android, cloud, supercomputers > never chased fame, money, titles, hype > stays private, consistent, brutally honest for decades > still reviews patches, still

61

512

7K

Ahmad Beirami

@abeirami

28 days

"Professors definitely deserve to have their names on the papers." I think this take is completely wrong. Financial support does not warrant co-authorship. Bob Gallager (a legendary information theorist who retired from MIT) did not co-author any papers with many of his

Zhipeng(Jason Z) Wang 🇺🇦

@PKUWZP

29 days

@shaananc @thegautamkamath I agree with most of your statement. However, there’s no “simply” leading a group or advising PhD students. Those activities require tremendous effort, both intellectually and financially. Not to mention that in the US, all of PhD students’ funding comes from professors’ grant money.

16

19

360

Charlie Snell

@sea_snell

28 days

What happened to adding error bars to evals?

Yuchen Jin

@Yuchenj_UW

28 days

Claude Opus 4.5's score on SWE-bench is wild. I like how Anthropic has focused on coding from the beginning. They haven’t released any image or video models. All in the most economically valuable area. Good strategy.

17

31

910

Clémentine Fourrier 🍊 is off till Dec 2026 hiking

@clefourrier

2 months

Know anyone who needs some help to get started with ML, open source and 🤗? We partnered with @TechToTheRescue, a tech-for-good incubator, & answered all AI questions their nonprofits had to create an FAQ! https://t.co/psgu8N1cSm Come add your Q/As, it's collaborative! 🔥

2

3

12

Georgia Channing

@cgeorgiaw

28 days

incredibly detailed technical blog just dropped on the anatomy of BoltzGen 🧬 made for ML people, but covering everything from molecular representations to diffusion-based generation of protein binders + crazy good interactive visuals 👏👏 @ludocomito

6

37

233