Hugo Larcher
@hugoch
477 Followers · 456 Following · 113 Media · 1K Statuses
ML infra/software engineer @huggingface 🤗. Making GPUs go "brrr".
Bordeaux, France
Joined August 2007
OpenAI just released GPT-OSS: an open-source language model on Hugging Face.
Open source meaning:
💸 Free
🔒 Private
🔧 Customizable
Replies: 15 · Reposts: 37 · Likes: 215
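A minimal sketch of pulling the released weights from the Hub with transformers, assuming a recent transformers version with gpt-oss support and suitable hardware; "openai/gpt-oss-20b" is the smaller of the two released variants:

```python
# Load the open GPT-OSS weights from the Hugging Face Hub and chat with them.
# Sketch only: the 20B model still needs a sizeable GPU, and the exact
# pipeline output format depends on your transformers version.
from transformers import pipeline

generator = pipeline("text-generation", model="openai/gpt-oss-20b")
messages = [{"role": "user", "content": "Why run models locally?"}]
print(generator(messages, max_new_tokens=64)[0]["generated_text"])
```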
OMG, the U.S. just downloaded more than 5PB of DeepSeek-R1 from @huggingface in the last few days! Feeling late FOMO in Silicon Valley? 🤔
Replies: 2 · Reposts: 4 · Likes: 22
🧵(2/2) With inference-benchmarker you can:
🧪 Simulate real workloads (chat, code-gen...)
📊 Measure throughput, time-to-first-token, inter-token latency
⚖️ Compare performance across backends & infra
👉 Check it out:
github.com · huggingface/inference-benchmarker: Inference server benchmarking tool.
Replies: 0 · Reposts: 1 · Likes: 8
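A minimal sketch of the metrics the thread mentions, assuming an OpenAI-compatible streaming endpoint at an illustrative localhost URL; inference-benchmarker itself is a standalone tool with far richer workload simulation:

```python
# Measure time-to-first-token (TTFT), mean inter-token latency (ITL), and
# rough token throughput against a streaming completions endpoint.
import time

import requests

BASE_URL = "http://localhost:8080/v1/completions"  # assumed endpoint

def measure(prompt: str, max_tokens: int = 64):
    t0, stamps = time.perf_counter(), []
    payload = {"prompt": prompt, "max_tokens": max_tokens, "stream": True}
    with requests.post(BASE_URL, json=payload, stream=True, timeout=120) as r:
        for line in r.iter_lines():
            # each server-sent event "data: {...}" carries one token chunk
            if line.startswith(b"data: ") and line != b"data: [DONE]":
                stamps.append(time.perf_counter())
    assert stamps, "no tokens received"
    ttft = stamps[0] - t0
    itl = [b - a for a, b in zip(stamps, stamps[1:])]
    return ttft, sum(itl) / max(len(itl), 1), len(stamps) / (stamps[-1] - t0)

ttft, mean_itl, tps = measure("Write a haiku about GPUs.")
print(f"TTFT {ttft * 1000:.0f} ms · ITL {mean_itl * 1000:.1f} ms · {tps:.1f} tok/s")
```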
🧠 LLM inference isn't just about latency: it's about consistency under load. Different workloads, configs, and hardware = very different real-world performance. At Hugging Face 🤗 we built inference-benchmarker, a simple tool to stress-test LLM inference servers. 🧵 (1/2)
Replies: 2 · Reposts: 13 · Likes: 39
@huggingface GPU-fryer helps us detect silent throttling failures: one GPU slows down and every other unit ends up waiting, creating a bottleneck 🚦. Check it out:
github.com · huggingface/gpu-fryer: Where GPUs get cooked 👩‍🍳🔥.
Replies: 0 · Reposts: 4 · Likes: 45
At @huggingface we rely on GPU-fryer 🍳 to load-test our 768 H100 GPU cluster. It runs matrix multiplications and monitors TFLOPs outliers to catch any software or hardware throttling, often a sign of cooling issues that need a hardware fix ❄️🔧. 🧵 1/2
Replies: 5 · Reposts: 29 · Likes: 254
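A toy PyTorch sketch of the same idea (GPU-fryer itself is a separate tool, and the matrix size, iteration count, and 10% outlier threshold here are arbitrary): hammer every GPU with half-precision matmuls, compute sustained TFLOPs, and flag stragglers:

```python
# Stress each GPU with fp16 matmuls and flag TFLOPs outliers, which often
# indicate thermal throttling or a failing unit.
import time

import torch

def measure_tflops(device: str, n: int = 8192, iters: int = 50) -> float:
    a = torch.randn(n, n, device=device, dtype=torch.float16)
    b = torch.randn(n, n, device=device, dtype=torch.float16)
    a @ b  # warmup: trigger kernel selection before timing
    torch.cuda.synchronize(device)
    t0 = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize(device)
    # each n x n by n x n matmul is 2 * n^3 floating-point operations
    return iters * 2 * n ** 3 / (time.perf_counter() - t0) / 1e12

scores = {i: measure_tflops(f"cuda:{i}") for i in range(torch.cuda.device_count())}
mean = sum(scores.values()) / len(scores)
for i, tf in scores.items():
    flag = "  <-- possible throttling" if tf < 0.9 * mean else ""
    print(f"cuda:{i}: {tf:.1f} TFLOPs{flag}")
```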
This first step will very soon be followed by the integration of new backends (TRT-LLM, llama.cpp, vLLM, Neuron, and TPU). We are polishing the TensorRT-LLM backend, which achieves impressive performance on NVIDIA GPUs. Stay tuned 🤗! https://t.co/eGpEvqVM8L
huggingface.co
Replies: 0 · Reposts: 1 · Likes: 8
We are introducing multi-backend support in @huggingface Text Generation Inference! With the new TGI architecture, we can now plug in new modeling backends to get the best performance for the selected model and available hardware.
Replies: 2 · Reposts: 9 · Likes: 59
Just 10 days after o1's public debut, we're thrilled to unveil the open-source version of the groundbreaking technique behind its success: scaling test-time compute 🧠💡 By giving models more "time to think," LLaMA 1B outperforms LLaMA 8B in math, beating a model 8x its size.
Replies: 115 · Reposts: 621 · Likes: 5K
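The simplest instance of "more time to think" is best-of-N sampling with a verifier; a sketch under that assumption (the actual recipe uses a process reward model and smarter search strategies, and the stand-in functions below are toys):

```python
# Best-of-N: spend extra inference compute by sampling N candidate solutions
# and keeping the one a verifier scores highest.
import random
from typing import Callable

def best_of_n(prompt: str,
              generate_fn: Callable[[str], str],
              score_fn: Callable[[str, str], float],
              n: int = 16) -> str:
    candidates = [generate_fn(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score_fn(prompt, c))

# Toy stand-ins so the sketch runs; real versions call an LLM and a reward model.
random.seed(0)
answer = best_of_n("1+1=?",
                   generate_fn=lambda p: str(random.randint(0, 3)),
                   score_fn=lambda p, c: -abs(int(c) - 2))
print(answer)  # almost surely "2" once any of the 16 samples hits it
```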
We're turning @huggingface Hub's files into content-defined chunks to speed up your workflows! ⚡️ This means:
- 🧠 We store your file as deduplicated chunks
- ⏩ You only upload changed chunks when iterating!
- 🔄 Pulling changes? Only download changed chunks!
Replies: 3 · Reposts: 17 · Likes: 53
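A toy sketch of content-defined chunking, the trick behind those bullets: chunk boundaries come from a rolling hash of the bytes themselves, so a local edit only moves nearby boundaries and every untouched chunk keeps its hash for dedup. The Hub's production system is more sophisticated; all constants here are illustrative:

```python
# Gear-style content-defined chunking: ~8 KiB average chunks, boundary wherever
# the rolling hash's low bits are all zero.
import hashlib
import random

random.seed(0)
GEAR = [random.getrandbits(64) for _ in range(256)]  # one random value per byte
MASK = (1 << 13) - 1      # boundary test -> ~8 KiB average chunk size
MAX_CHUNK = 64 * 1024     # hard cap so pathological data still chunks

def chunks(data: bytes):
    h, start = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & (2 ** 64 - 1)  # rolling hash update
        if (h & MASK) == 0 or (i - start + 1) >= MAX_CHUNK:
            yield data[start:i + 1]                  # content-defined boundary
            h, start = 0, i + 1
    if start < len(data):
        yield data[start:]                           # trailing partial chunk

data = bytes(random.getrandbits(8) for _ in range(200_000))
edited = data[:1000] + b"new bytes" + data[1000:]    # small local edit
before = {hashlib.sha256(c).hexdigest() for c in chunks(data)}
after = {hashlib.sha256(c).hexdigest() for c in chunks(edited)}
print(f"{len(after - before)} new chunks out of {len(after)}")  # only a few
```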
An easy way to understand Pipeline Parallelism with a self-contained implementation. Check it out!
Interested in 4D parallelism but feeling overwhelmed by the Megatron-LM codebase? We are currently cooking something with @Haojun_Zhao14 and @xariusrke 👀 In the meantime, here is a self-contained script that implements Pipeline Parallelism (AFAB + 1F1B) in 200 LOC 🧵👇
Replies: 1 · Reposts: 1 · Likes: 11
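A rough sketch of the per-stage operation order those two schedules produce, assuming the usual PipeDream-flush formulation where stage s warms up with (stages - s - 1) forwards; F/B denote the forward/backward of one microbatch:

```python
# Emit the op order for AFAB vs 1F1B pipeline-parallel schedules.
def afab(stage: int, stages: int, mb: int) -> list[str]:
    # All-Forward-All-Backward: all forwards, then all backwards, so every
    # microbatch's activations are live at the peak.
    return [f"F{i}" for i in range(mb)] + [f"B{i}" for i in range(mb)]

def one_f_one_b(stage: int, stages: int, mb: int) -> list[str]:
    # 1F1B: warm up, alternate one forward / one backward in steady state,
    # then drain. Peak live activations drop to O(stages) instead of O(mb).
    warmup = min(stages - stage - 1, mb)
    ops = [f"F{i}" for i in range(warmup)]
    f, b = warmup, 0
    while f < mb:
        ops += [f"F{f}", f"B{b}"]
        f, b = f + 1, b + 1
    ops += [f"B{i}" for i in range(b, mb)]
    return ops

for s in range(4):
    print(f"stage {s}: {' '.join(one_f_one_b(s, stages=4, mb=8))}")
```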
New feature on the Hub! Carbon emissions from training now show up on the model card! (requires model authors to fill in that info first) Hopefully it will prompt more people to share the carbon emissions of their model training! 🌍 Thanks a lot to the team who pushed…
Replies: 1 · Reposts: 7 · Likes: 28
We passed 5 million users. 🥳 That's 5 million of you who have signed up on the Hub 🙌 Thank you for contributing to the ecosystem and making open Machine Learning happen! We're just getting started 🤗
Replies: 254 · Reposts: 242 · Likes: 2K
Starting today, open source is leading the way. Introducing Llama 3.1: our most capable models yet. Today we're releasing a collection of new Llama 3.1 models, including our long-awaited 405B. These models deliver improved reasoning capabilities, a larger 128K-token context…
Replies: 263 · Reposts: 1K · Likes: 6K
I am mind-blown by this new technology! AI is now embodied. And we are open-sourcing it all. Listen to @HaixuanT casually chatting with this cute robot at the @linuxfoundation:
🎙 What's your name?
> I am Reachy, a robot from @pollenrobotics, I have two arms.
🎙 What do you…
Replies: 3 · Reposts: 25 · Likes: 108
Llama 3 released! 🚨 @AIatMeta just released their best open LLM! Llama 3 is the next iteration of Llama with a ~10% relative improvement over its predecessor! 🤯 Llama 3 comes in two sizes, 8B and 70B, with a new extended tokenizer and a commercially permissive license!
Replies: 6 · Reposts: 60 · Likes: 254
Introducing: Zephyr 141B-A35B 🔥
🔥 Mixtral-8x22B fine-tune
🤯 Using DORPO: new alignment algorithm (no SFT, open)
📊 With 7k instances of (open) data
Very strong IFEval, BBH, AGIEval... Enjoy! 🤗 https://t.co/MVxTJorIGc
huggingface.co
Replies: 15 · Reposts: 131 · Likes: 711
This 30-min-read blog post on how to craft and generate a 25B+ token synthetic text dataset distills more information and alpha than a typical NeurIPS best paper.
Replies: 4 · Reposts: 104 · Likes: 738
Huge satellite imagery dataset released by @ESA_EO and @huggingface 🛰️ so much to build on!
.@esa's Φ-lab has released, in partnership with @huggingface, the 1st dataset of Major TOM (Terrestrial Observation Metaset), the largest, community-oriented, ML-ready collection of @CopernicusEU #Sentinel2 images ever published, covering over 50% of Earth: https://t.co/IZS4K6YZaC
Replies: 0 · Reposts: 0 · Likes: 11
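A minimal sketch of streaming it from the Hub with the datasets library; "Major-TOM/Core-S2L2A" is assumed here as one of the published Major TOM Sentinel-2 collections, and streaming avoids downloading the multi-terabyte set up front:

```python
# Stream one Major TOM Sentinel-2 sample from the Hub without a full download.
from datasets import load_dataset

ds = load_dataset("Major-TOM/Core-S2L2A", split="train", streaming=True)
print(next(iter(ds)))  # one sample with imagery bands and metadata
```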