Hugo Larcher

@hugoch

Followers: 477
Following: 456
Media: 113
Statuses: 1K

ML infra/software engineer @huggingface 🤗. Making GPUs go "brrr".

Bordeaux, France
Joined August 2007
@dylan_ebert_
dylan
5 months
OpenAI just released GPT-OSS: An Open Source Language Model on Hugging Face. Open source meaning: 💸 Free 🔒 Private 🔧 Customizable
15
37
215
@hugoch
Hugo Larcher
7 months
OMG, the U.S. just downloaded more than 5PB of DeepSeek-R1 on @huggingface in the last few days! Feeling late FOMO in Silicon Valley? 🤔🚀
2
4
22
@hugoch
Hugo Larcher
9 months
🧡(2/2) With inference-benchmarker you can: πŸ§ͺ Simulate real workloads (chat, code-gen...) πŸ“Š Measure throughput, time-to-first-token, inter-token latency βš™οΈ Compare performance across backends & infra πŸ‘‰ Check it out:
Tweet card summary image
github.com
Inference server benchmarking tool. Contribute to huggingface/inference-benchmarker development by creating an account on GitHub.
0
1
8
@hugoch
Hugo Larcher
9 months
🧠 LLM inference isn’t just about latency β€” it’s about consistency under load. Different workloads, configs, and hardware = very different real-world performances. At Hugging Face πŸ€— we built inference-benchmarker β€” a simple tool to stress-test LLM inference servers. 🧡 (1/2)
2
13
39
@hugoch
Hugo Larcher
10 months
@huggingface GPU-fryer helps us detect silent throttling failures: one GPU slows down and every other unit ends up waiting, creating a bottleneck 🚦. Check it out:
github.com
Where GPUs get cooked 👩‍🍳🔥. Contribute to huggingface/gpu-fryer development by creating an account on GitHub.
0
4
45
@hugoch
Hugo Larcher
10 months
At @huggingface we rely on GPU-fryer 🍳 to load-test our 768 H100 GPU cluster. It runs matrix multiplications and monitors TFLOPs outliers to catch any software or hardware throttling, often a sign of cooling issues that need a hardware fix ❄️🔧. 🧵 1/2
5
29
254
@hugoch
Hugo Larcher
1 year
This first step will very soon be followed by the integration of new backends (TRT-LLM, llama.cpp, vLLM, Neuron and TPU). We are polishing the TensorRT-LLM backend, which achieves impressive performance on NVIDIA GPUs. Stay tuned 🤗! https://t.co/eGpEvqVM8L
huggingface.co
0
1
8
@hugoch
Hugo Larcher
1 year
We are introducing multi-backend support in @huggingface Text Generation Inference! With the new TGI architecture, we can now plug in new modeling backends to get the best performance for the selected model and available hardware.
2
9
59
@ClementDelangue
clem 🤗
1 year
Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the groundbreaking technique behind its success: scaling test-time compute 🧠💡 By giving models more "time to think," LLaMA 1B outperforms LLaMA 8B in math, beating a model 8x its size.
115
621
5K
@AnnInTweetD
Ann Huang
1 year
We're turning @huggingface Hub's files into content-defined chunks to speed up your workflows! ⚡️ This means: - 🧠 We store your file as deduplicated chunks - ⏩ You only upload changed chunks when iterating! - 🚀 Pulling changes? Only download changed chunks!
3
17
53
@hugoch
Hugo Larcher
1 year
An easy way to understand Pipeline Parallelism with a self-contained implementation. Check it out!
@FerdinandMom
Ferdinand Mom
1 year
Interested in 4D parallelism but feeling overwhelmed by Megatron-LM codebase? We are currently cooking something with @Haojun_Zhao14 and @xariusrke 😉 In the meantime, here is a self-contained script that implements Pipeline Parallelism (AFAB + 1F1B) in 200 LOC 🧵👇
1
1
11
@AymericRoucher
m_ric
1 year
New feature on the Hub! ☁️ Carbon emissions emitted during training now show up on the model card! (requires model authors to fill that info first) Hope it will prompt more people to show the carbon emissions of their model training! 🌍 Thanks a lot to the team who pushed
1
7
28
@huggingface
Hugging Face
1 year
We passed 5 million users. 🥳 That's 5 million of you who have signed up on the Hub 🚀 thank you for contributing to the ecosystem and making open Machine Learning happen! We're just getting started 🤗
254
242
2K
@AIatMeta
AI at Meta
1 year
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context
263
1K
6K
@RemiCadene
Remi Cadene
2 years
I am mind blown by this new technology! AI is now embodied. And we are open-sourcing it all. Listen to @HaixuanT casually discussing with its cute robot at the @linuxfoundation: 🙂 What's your name? > I am Reachy, a robot from @pollenrobotics, I have two arms. 😀 What do you
3
25
108
@granawkins
Grant♟️
2 years
sota RAG in 2024
62
225
1K
@_philschmid
Philipp Schmid
2 years
Llama 3 released! 🚨🔔 @AIatMeta just released their best open LLM! 👑🚀 Llama 3 is the next iteration of Llama with a ~10% relative improvement to its predecessor! 🤯 Llama 3 comes in 2 different sizes 8B and 70B with a new extended tokenizer and commercially permissive license!
6
60
254
@osanseviero
Omar Sanseviero
2 years
Introducing: Zephyr 141B-A35B 🥁 🔥 Mixtral-8x22B fine-tune 🤯 Using DORPO: new alignment algorithm (no SFT, open ) 🚀 With 7k instances of (open) data Very strong IFEval, BBH, AGIEval... Enjoy! 🤗 https://t.co/MVxTJorIGc
huggingface.co
15
131
711
@Thom_Wolf
Thomas Wolf
2 years
this 30-min-read blog post on how to craft and generate a 25B+ tokens synthetic text dataset distills more information and alphas than a typical NeurIPS best paper
4
104
738
@hugoch
Hugo Larcher
2 years
Huge spatial images dataset released by @ESA_EO and @huggingface 🛰️ so much to build on it!
@ESA_EO
ESA Earth Observation
2 years
.@esa's Φ-lab has released, in partnership with @huggingface, the 1st dataset of Major TOM (Terrestrial Observation Metaset), the largest, community-oriented, ML-ready collection of @CopernicusEU #Sentinel2 images ever published and covering over 50% of : https://t.co/IZS4K6YZaC
0
0
11