Thomas Wolf
@Thom_Wolf
Followers: 98K
Following: 29K
Media: 511
Statuses: 5K
Co-founder at @HuggingFace - moonshots - angel
Joined February 2011
Thrilled to finally share what we've been working on for months at @huggingface 🤝@pollenrobotics Our first robot: Reachy Mini A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community. Tiny price, small size, huge
241
529
3K
Open-source/science is the most effective way to reduce civilization boot time
4
0
16
Deploy API networks ⚡️ Designs for real-time visibility in @unkeydev With @chronark_ @hezarfendd @james_r_perkins
18
4
209
we need more teams like @pleiasfr pushing open work on synthetic data at scale for pre/mid/post-training
Breaking: we release a fully synthetic generalist dataset for pretraining, SYNTH and two new SOTA reasoning models exclusively trained on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range.
1
8
67
omg the https://t.co/OFreU6jWm5 team just open-sourced 18 TB (!) of egocentric data like treasure trove for robotics models and physical AI
build.ai
today, we’re open sourcing the largest egocentric dataset in history. - 10,000 hours - 2,153 factory workers - 1,080,000,000 frames the era of data scaling in robotics is here. (thread)
4
16
153
2024-25 => first time since DeepMind that the UK startups landscape is hot, burning and exciting again imo but the new proposed « exit tax » might just throw a big bucket of cold water on this phoenix follow/sign the @StartupCltn Open Letter if you care about startup ecosystems
I have invested $200M into the UK in the last five years. If @RachelReevesMP implements “Exit Tax” all that funding will go overnight. That is countless jobs, companies and people who will lose out. Rachel, you have managed to steal our hopes, our dreams even our growth,
2
10
95
I grew up wanting to be a Marine. Determined to be the best. At 16 I was in the Delayed Entry Program for almost 2 years. I drove over an hour every week to do PT on Wednesdays for almost 2 years. At 17 I signed the contract with my father for an 8152/0311 contract. I
0
0
16
Having fun building the @huggingface @ReachyMiniSol robot 🤖 . It’s very accessible! ♥️@ClementDelangue @julien_c @Thom_Wolf
8
13
140
this has been requested so many times by the robotics/LeRobot community, super happy to see EnvHub finally out and in production
0
6
20
Is this another DeepSeek moment? Open-source passing closed-source again Should we expect this every couple months now?
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here. 🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%) 🔹 Executes up to 200 – 300 sequential tool calls without human interference 🔹 Excels in reasoning, agentic search, and coding 🔹 256K context window Built
75
77
1K
Hugging Face science team blog posts on training LLMs
3
5
43
we're doing something special and intimate at @SlushHQ in 2 weeks with a couple of teams you might know :) some info at: https://t.co/MRpziT3U2o
1
3
23
Despite all the big funding rounds and flashy demos in US robotics, K-Scale’s inability to raise more money should worry us We're at risk of replaying the LLM story all over again in robotics: - Chinese companies are going open-source and collaborating across the value chain
K-scale cancels orders and refunds deposits for kbot. I thought all the VCs were excited about US-based robotics, what happened?
31
40
334
Or ... you could just host them on https://t.co/aeVYPxcibJ
Even when new AI models bring clear improvements in capabilities, deprecating the older generations comes with downsides. An update on how we’re thinking about these costs, and some of the early steps we’re taking to mitigate them:
4
18
205
I like this direction a lot. As we get code writing increasingly automated we’ll want to move to higher level representations of code. a lot of UX to invent here
Introducing Codemaps in @windsurf! powered by SWE-1.5 and Sonnet 4.5 “Your code is your understanding of the problem you’re exploring. So it’s only when you have your code in your head that you really understand the problem.” — @paulg
16
20
396
Monday morning read: a fascinating deep dive into recent Chinese chip developments as well as the coming co-evolution with LLM builders. Fresh analysis from the HF team => https://t.co/2FEs0IXrnN
11
17
106
saying it way better than I do
The incredible work the @huggingface does in the open truly charts a path towards a post-scarcity future that includes everyone in the fruits of AI.
1
0
43
more reachy-mini skins by the community which one's your favorite?
6
8
88
The work Hugging Face does continues to be incredible. Putting in serious effort to make these topics accessible and detailed. https://t.co/D0ljHFEzpK
6
54
646
We’ve cooked another one of these 200+ page practical books on model training that we love to write. This time it’s on all pretraining and post-training recipes and how to do a training project hyperparameter exploration. Closing the trilogy of: 1. Building a pretraining
Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably https://t.co/iN2JtWhn23
22
126
1K
After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure. We'll help you
35
158
1K