Daniel Vila Suero @dvilasuero X Profile

Daniel Vila Suero

@dvilasuero

Followers

4K

Following

8K

Media

386

Statuses

4K

ML & data @huggingface

Joined January 2011

Daniel Vila Suero

@dvilasuero

6 months

Introducing @huggingface Sheets: Excel meets AI and unstructured data
📁 Run prompts and models over your data
🌐 Web search for accuracy and real-time information
🎯 Manual edits are used to improve generation
💯 Hundreds of open models and leading inference providers
1/2

2

16

82

Daniel Vila Suero

@dvilasuero

5 days

Let me know if you try this out, looking for early feedback

Daniel Vila Suero

@dvilasuero

6 days

🔥Direct dataset editing on the Hugging Face Hub🔥 No more downloading & reuploading a 500MB CSV just to fix 3 mislabeled rows. Edit cells, commit changes. Collaborative, fully versioned. Dataset curation finally works like code. https://t.co/HXfES1OCyy

1

0

2

Prince Canuma

@Prince_Canuma

6 days

Love this 🔥🙌🏽

🍉 Abubakar Abid

@abidlabs

19 days

We're launching the world's biggest AI virtual hackathon, with more than 6,000 registrations (and >$1 million in prizes and credits) Join us at the kickoff event in 30 minutes 🔥

Gradio

@Gradio

19 days

Join us LIVE at MCP's first Birthday kickoff at 10 am PT today!🎂 Don't miss out on details about the celebration from the co-hosts, @Gradio and @AnthropicAI. 🔥 We've also got an exciting lineup of speakers from @Huggingface, @OpenAI, @GoogleDeepMind, @modal, @blaxelAI,

1

13

Chakra Labs

@chakra_ai

19 days

Update: Dojo now supports the OpenEnv spec.

11

29

111

Jeff Ma

@18jeffreyma

21 days

We’re launching SWE-fficiency to eval whether LMs can speed up real GitHub repos on real workloads! ⏱️ 498 optimization tasks across 9 data-science, ML, and HPC repos — each with a real workload to speed up. Existing agents struggle to match expert level optimizations!

12

23

199

Nathan

@nathanhabib1011

20 days

🔥 IT'S OUT 🔥 Struggling to find benchmarks? Explore our repository for THOUSANDS of easy-to-run, well-documented tasks 🤩 📏 Creators, add yours for more visibility 🔥 Users, find and run models effortlessly with lighteval

4

11

39

Daniel Vila Suero

@dvilasuero

21 days

Open models are reaching the frontier at an impressive pace. More choices mean more decisions: which model? Local or managed? Which provider? Cheaper or faster? As the offer grows, so does the need for better testing. That's why I built a new integration with Inspect AI for

2

9

21

Daniel Vila Suero

@dvilasuero

21 days

5/ You can also compare :fastest vs :cheapest providers, write custom scorers, and test models in agentic scenarios. All without setting up infrastructure. Drop a comment if you'd like me to run an eval you have in mind on the latest open models.

0

Daniel Vila Suero

@dvilasuero

21 days

4/ Evaluate vision-language models with custom datasets. I even built a "chihuahua or muffin" style dataset to test VLMs. Results:
Qwen3-VL-8B: 0.7 accuracy
Qwen3-VL-30B-Thinking: 0.9 accuracy
Still unsolved 😂 ~30 lines of Python for custom evals.

1

0

1

Daniel Vila Suero

@dvilasuero

21 days

3/ Compare the same model across providers. I ran gpt-oss-120b across 10 providers on the same task. Scores ranged from 0.80 to 0.84. Hardware and inference implementations vary. Testing helps you find what actually works best for your use case.

1

0
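The provider comparison described in the post can be summarized with a few lines of Python. This is only a sketch: the provider names and the middle score are placeholders, and only the 0.80 and 0.84 endpoints come from the post. `summarize_scores` is a hypothetical helper, not part of Inspect AI.

```python
def summarize_scores(scores):
    """Return (best provider, worst provider, score spread) for one task."""
    best = max(scores, key=scores.get)
    worst = min(scores, key=scores.get)
    spread = scores[best] - scores[worst]
    return best, worst, spread


# Placeholder data: provider names are invented for illustration.
# Only the 0.80 and 0.84 endpoints are taken from the post.
scores = {
    "provider_a": 0.80,
    "provider_b": 0.84,
    "provider_c": 0.82,
}

best, worst, spread = summarize_scores(scores)
print(f"best={best} worst={worst} spread={spread:.2f}")
```

A spread this small may still matter when it is traded off against price or latency, which is presumably why the comparison is worth running per use case.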

Daniel Vila Suero

@dvilasuero

21 days

2/ Benchmark multiple models on your task. Pick models from the Hub, run them in parallel. No GPUs, no downloads, just:

inspect eval-set https://t.co/e7703eAxCz --model \
"hf-inference-providers/MiniMaxAI/MiniMax-M2,\
hf-inference-providers/openai/gpt-oss-120b"

1

0
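When the model list gets longer, the comma-separated `--model` value from the command in the post can be assembled programmatically. The sketch below assumes only what the post shows: the `hf-inference-providers/` prefix and the `inspect eval-set <task> --model <list>` shape. The helper names are hypothetical, and the task URL is the shortened link from the post, kept as-is.

```python
import shlex

# Prefix copied from the command in the post.
PROVIDER_PREFIX = "hf-inference-providers/"


def build_model_arg(model_ids):
    """Join Hub model IDs into the comma-separated value passed to --model."""
    return ",".join(PROVIDER_PREFIX + model_id for model_id in model_ids)


def build_eval_set_command(task, model_ids):
    """Return the argument list for an `inspect eval-set` run (hypothetical helper)."""
    return ["inspect", "eval-set", task, "--model", build_model_arg(model_ids)]


models = ["MiniMaxAI/MiniMax-M2", "openai/gpt-oss-120b"]
command = build_eval_set_command("https://t.co/e7703eAxCz", models)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(command))
```

The list form can also be handed to `subprocess.run(command)` directly, which avoids shell quoting issues.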

clem 🤗

@ClementDelangue

26 days

Unsurprisingly, Kimi K2 Thinking is already number one trending on HF. The AI frontier is open-source!

49

156

2K

Pedro Cuenca

@pcuenq

28 days

Fastest and easiest way to download a repo from the HF Hub:
$ uvx hf download mlx-community/granite-4.0-h-1b-8bit
That's it. Same for uploads, no excuse not to upload your datasets and models. Under the hood: Xet, parallel downloads, huggingface_hub, and of course uv.

0

3

27

clem 🤗

@ClementDelangue

27 days

The AI frontier is open-source!

Kimi.ai

@Kimi_Moonshot

27 days

🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here. 🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%) 🔹 Executes up to 200 – 300 sequential tool calls without human interference 🔹 Excels in reasoning, agentic search, and coding 🔹 256K context window Built

19

59

841

Ben Burtenshaw

@ben_burtenshaw

28 days

you can now push, run, and pull agentic environments from the hugging face hub as spaces. the workflow looks like this:
- do `openenv init` to start a new environment from a template
- build/port your rl environment
- use `openenv push` to push the environment to hub.
- from

4

14

96

Daniel Vila Suero

@dvilasuero

28 days

Has anyone played with Petri (by @AnthropicAI) to test open model behaviours? Would love to chat

0

1

Lewis Tunstall

@_lewtun

29 days

Or ... you could just host them on https://t.co/aeVYPxcibJ

Anthropic

@AnthropicAI

29 days

Even when new AI models bring clear improvements in capabilities, deprecating the older generations comes with downsides. An update on how we’re thinking about these costs, and some of the early steps we’re taking to mitigate them:

4

17

204

Daniel Vila Suero

@dvilasuero

30 days

If you're interested in agentic RL training, don't miss this awesome guide 👇

Ben Burtenshaw

@ben_burtenshaw

30 days

New guide on RL for agentic environments. This guide integrates OpenEnv, textarena, and TRL for training language models on reasoning games like wordle. Instead of relying only on static reward functions, you can now hook up your model to interactive environments (browsers,

0

1