Argilla

@argilla_io

Followers 3,415 · Following 28 · Media 287 · Statuses 1,293

Making LLM data go brrrr

World
Joined August 2021
Pinned Tweet
@argilla_io
Argilla
1 month
💥After months of work, we're thrilled to introduce ⚗️distilabel 1.0.0! 🚀More flexible, robust, and powerful. 🙌 Let's empower the community to build the most impactful datasets for Open Source AI! Blogpost: Github:
4
19
100
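(Editor's sketch: the announcement above shows no code, so here is a minimal distilabel 1.x-style pipeline based on the project's public docs. The class names (Pipeline, LoadDataFromDicts, TextGeneration, OpenAILLM) follow the documented 1.x API; treat exact signatures as assumptions if your version differs.)

```python
# Minimal distilabel 1.x pipeline sketch: load a few instructions and
# generate responses with an OpenAI model (needs OPENAI_API_KEY set).
from distilabel.llms import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="demo-pipeline") as pipeline:
    load_data = LoadDataFromDicts(
        name="load_data",
        data=[{"instruction": "Explain DPO in one sentence."}],
    )
    generate = TextGeneration(
        name="generate",
        llm=OpenAILLM(model="gpt-3.5-turbo"),
    )
    load_data >> generate

if __name__ == "__main__":
    distiset = pipeline.run()  # returns a Distiset you can push to the Hub
```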
@argilla_io
Argilla
5 months
🚀 Open-source AI strikes again! Announcing Notux 8x7B, a fine-tune of Mixtral Instruct with high-quality chat data and DPO. Notux is now the top-ranked MoE on the Open LLM Leaderboard.
Tweet media one
8
84
436
@argilla_io
Argilla
3 months
🚀🧙🏼‍♂️Introducing OpenHermesPreferences: the largest open dataset for RLHF & DPO Built together with the @huggingface H4 team, it's a 1M preferences dataset on top of the amazing @Teknium1 's dataset. Let's dive in! 🧵
Tweet media one
6
58
288
@argilla_io
Argilla
3 years
Build a news classifier from scratch with weak supervision 1. Programmatically label 38,000 examples with rules and Snorkel. 2. Train a downstream classifier with scikit-learn to achieve a 0.81 macro-avg F1 score. Tutorial link below 👇 #nlproc #ml #datascience #opensource
Tweet media one
2
54
277
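(Editor's sketch of the two-step recipe in this tweet: label programmatically with Snorkel rules, then train a scikit-learn classifier on the weak labels. The keyword rules and the toy DataFrame are hypothetical stand-ins for the tutorial's news data.)

```python
import pandas as pd
from snorkel.labeling import LabelingFunction, PandasLFApplier
from snorkel.labeling.model import LabelModel
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

ABSTAIN, SPORTS, BUSINESS = -1, 0, 1

def keyword_lf(x, keyword, label):
    # Vote for `label` when the keyword appears, otherwise abstain
    return label if keyword in x.text.lower() else ABSTAIN

lfs = [
    LabelingFunction("lf_money", f=keyword_lf, resources=dict(keyword="money", label=BUSINESS)),
    LabelingFunction("lf_football", f=keyword_lf, resources=dict(keyword="football", label=SPORTS)),
]

train_df = pd.DataFrame({"text": [
    "football season kicks off",
    "money markets rally",
    "local football derby ends level",
    "investors move money abroad",
]})

# 1. Programmatically label with rules + Snorkel's label model
L_train = PandasLFApplier(lfs=lfs).apply(train_df)
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train)
preds = label_model.predict(L_train)

# 2. Train a downstream scikit-learn classifier on the non-abstain weak labels
mask = preds != ABSTAIN
X = TfidfVectorizer().fit_transform(train_df.text[mask])
clf = LogisticRegression().fit(X, preds[mask])
```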
@argilla_io
Argilla
6 months
🔥Open-source, open-science, and data curation for the win! Meet Notus 7B, a new LLM tuned with DPO on a new curated UltraFeedback dataset, surpassing Zephyr and Claude 2 on AlpacaEval. Built on the shoulders of giants: 🙌 @huggingface Alignment Handbook
1
64
252
@argilla_io
Argilla
5 months
🔥 More is less for DPO, high quality matters! 📢 Dropping our first open dataset and LLM of the year: 💾Meet distilabel Orca Pairs DPO, an improved version of the now famous dataset from @intel 🏛️And a new OpenHermes model outperforming baselines with 54% less DPO pairs 🧵
Tweet media one
5
46
231
@argilla_io
Argilla
4 months
🔥 Introducing a new open dataset for the Open Source AI community: OpenHermes2.5-dpo-binarized-alpha built atop the amazing dataset by @Teknium1 This time we use OSS models for everything, even for the preference step! 🧵
Tweet media one
39
51
181
@argilla_io
Argilla
2 years
Training a text classifier without labelled data using @PyTorch End-to-end weak supervision with Weasel & @huggingface transformers Guide and more details below 👇 #nlproc #datascience #python #opensource
Tweet media one
1
44
201
@argilla_io
Argilla
1 year
🚀Data labeling from the @huggingface Hub is here. No more excuses not to build great NLP datasets!
0
33
201
@argilla_io
Argilla
3 months
🤖Yesterday, we shared a large AI feedback dataset 👩‍💻Today, @argilla_io & @huggingface are thrilled to release a high-quality human feedback dataset, built with and for the community! 10K Prompts Ranked: +14K human ratings from +300 contributors!
Tweet media one
2
20
142
@argilla_io
Argilla
4 months
🚀 The OSS AI community needs more open datasets for improving LLMs: 🎁 Excited to ship a new open DPO dataset for boosting chat models: ⚗️ distilabel capybara-dpo, a multi-turn preference dataset built atop the awesome dataset by @ldjconfirmed 🧵
Tweet media one
4
24
120
@argilla_io
Argilla
3 months
We're building a high-quality prompt dataset together with the community. The result will be published under an open, commercial-friendly license that anyone can use to build eval, SFT, and DPO datasets. We need your help! This is how simple it is to contribute:
4
26
113
@argilla_io
Argilla
5 months
🔦 In this paper released by Apple, they introduce an efficient LLM inference method for devices with limited memory, showing inference speed-ups of 4-5x on CPU and 20-25x on GPU. Argilla's GitHub:  distilabel:  #nlproc   #llms
Tweet media one
5
17
104
@argilla_io
Argilla
2 years
A simple active learning loop with the amazing modAL library: active learning for YT spam classification inside a Jupyter notebook, a tutorial by @vid_algo. Rubrix: modAL: #nlproc #python #opensource
Tweet media one
0
25
108
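(Editor's sketch of the loop the tutorial describes: seed a learner, query the most uncertain pool example, teach it back. The toy YT comments stand in for the tutorial's dataset; modAL's ActiveLearner API is as documented.)

```python
import numpy as np
from modAL.models import ActiveLearner
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

comments = ["free money click here", "nice video!", "subscribe 4 free gifts", "great explanation"]
labels = np.array([1, 0, 1, 0])  # 1 = spam, 0 = ham

X = CountVectorizer().fit_transform(comments).toarray()

# Seed the learner with a couple of labelled examples
learner = ActiveLearner(estimator=MultinomialNB(), X_training=X[:2], y_training=labels[:2])

# Query the most uncertain example from the pool and "annotate" it
query_idx, _ = learner.query(X[2:])
learner.teach(X[2:][query_idx], labels[2:][query_idx])
```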
@argilla_io
Argilla
1 year
🚀 Want to train LLMs in your own language? For commercial use? Introducing the @databricks Dolly Multilingual Datasets. Currently includes translations into Spanish, French, and German. Want to see more languages? Join us!
1
15
105
@argilla_io
Argilla
4 months
🤖 Turing-Complete Transformers: Two Transformers are more Powerful than One. In this paper, currently under review as a conference paper at ICLR 2023, the researchers present Find+Replace, a family of transformer architectures that are Turing-complete.
Tweet media one
7
18
104
@argilla_io
Argilla
3 months
🌊Introducing DPO Mix 7K, a small DPO dataset that does wonders! Yesterday, @_philschmid & @_lewtun showcased its strength with Zephyr Gemma. If you're looking for a small, diverse, high-quality DPO dataset, check it out!
Tweet media one
2
25
103
@argilla_io
Argilla
3 months
🔥 Data is better together🔥 At @argilla_io & @huggingface, we believe in the collective intelligence of the OSS AI community, so we have partnered to let everyone contribute to AI datasets! You can start contributing now to the first initiative: 🧵
Tweet media one
2
18
102
@argilla_io
Argilla
3 months
Yesterday, we shared 10k_prompts_ranked: +14K human ratings by +300 amazing contributors. To understand the data, we collaborated with our friends at @graphext, the data analysis tool for visual thinkers. Now it's available to everyone👇 Let's deep dive! 🧵
4
20
102
@argilla_io
Argilla
6 months
⚗ How does distilabel work? 🚀 Yesterday, we announced our new open-source project, fully integrated with Argilla. First of all, if you don't yet know the project, go to 👉 Today we want to share more details about how it works 🧵
Tweet media one
15
30
99
@argilla_io
Argilla
2 years
Fine-tune a Hugging Face transformer for your own domain. Iteratively build a training set and fine-tune a sentiment classifier for the banking domain. Tutorial by @dvilasuero #NLProc #python #opensource
Tweet media one
1
30
94
@argilla_io
Argilla
1 year
🔥Thrilled to share our new tutorial: collecting human preference data and training a reward model with the awesome trl by @huggingface. The very first end-to-end example of the new RewardTrainer in trl. Colab with @younesbelkada & @lvwerra
Tweet media one
1
25
93
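(Editor's sketch of the RewardTrainer flow the tutorial covers, not the tutorial's own notebook. Column names like input_ids_chosen/input_ids_rejected follow the trl docs of that era; argument names have changed in later trl versions, so treat details as assumptions.)

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments)
from trl import RewardTrainer

model_name = "distilroberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)

# Toy preference pairs; a real run would use a collected-feedback dataset
pairs = Dataset.from_dict({
    "chosen": ["A clear, helpful answer."],
    "rejected": ["An evasive non-answer."],
})

def tokenize(batch):
    chosen = tokenizer(batch["chosen"], truncation=True)
    rejected = tokenizer(batch["rejected"], truncation=True)
    return {
        "input_ids_chosen": chosen["input_ids"],
        "attention_mask_chosen": chosen["attention_mask"],
        "input_ids_rejected": rejected["input_ids"],
        "attention_mask_rejected": rejected["attention_mask"],
    }

trainer = RewardTrainer(
    model=model,
    args=TrainingArguments(output_dir="reward-model", remove_unused_columns=False),
    tokenizer=tokenizer,
    train_dataset=pairs.map(tokenize, batched=True),
)
trainer.train()
```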
@argilla_io
Argilla
2 years
Training a Hugging Face text classifier directly from search queries? Use Weasel & Rubrix 👉 Weasel: end-to-end weak supervision by @CachaySalva & @BenBoecking: Rubrix: #nlproc #python #opensource
Tweet media one
0
34
89
@argilla_io
Argilla
1 year
Zero-shot sentiment classification with human-readable explanations with @OpenAI GPT-3. Is it any good? 🤖 You can now run this yourself using Colab and 🤗 Spaces: 👀 Or see for yourself on this Argilla Space (argilla/1234):
Tweet media one
1
16
92
@argilla_io
Argilla
4 months
🔥 Open source, open datasets & open collaboration go a long way 🍿The story behind NeuralBeagle14, a top performing 7B model released by @maximelabonne
Tweet media one
4
14
89
@argilla_io
Argilla
2 years
FlairNLP comes with a zero-shot NER model for English. How good is it with a challenging dataset like WNUT17? Even more interesting: how good is it for your data? Can it help you with auto-labelling? A tutorial 👇🏽 #nlproc #python
Tweet media one
1
18
77
@argilla_io
Argilla
1 year
Data quality, fueled by human feedback, is the next frontier of LLMs We are stoked to introduce the first open-source, enterprise-grade solution for the scalable collection of human feedback to power the next wave of custom LLMs 1/7
5
21
78
@argilla_io
Argilla
3 months
🔥 We'll be doubling down on open DPO datasets. Why? A few months ago: UltraFeedback (62K examples) took Zephyr Beta SFT from 7.0 to 7.34 on MT-Bench. Last week: the @argilla_io mix (7K examples) took Zephyr Gemma SFT from 7.17 to 7.82 on MT-Bench. 2x the improvement with 9x less data
Tweet media one
1
21
74
@argilla_io
Argilla
2 months
🌏🌎🌍 Interested in evaluating LLMs for your language? We've selected 500 high-quality prompts. We'll be supporting the community in validating the translations and creating an open multilingual benchmark. Get involved & nominate yourself as language lead
1
21
73
@argilla_io
Argilla
27 days
🌎 Better AI needs better data, and for better data we need expertise! As part of the 'Data is Better Together' project in collaboration with Hugging Face, we're introducing Domain-Specific Datasets. You can read more in this post: 🤗
1
26
73
@argilla_io
Argilla
3 months
🦒 Improving Text Embeddings with LLMs, a new distilabel tutorial: we will be replicating the process described in "Improving Text Embeddings with Large Language Models" by Liang Wang et al. ().
Tweet media one
1
12
72
@argilla_io
Argilla
5 months
💬 Did you know you can fine-tune an LLM, Mistral 7B in particular, on a chat-style instruction dataset? This Argilla tutorial has a step-by-step guide! Argilla:  distilabel:  #nlproc   #llms   #python   #opensource
Tweet media one
2
5
70
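(Editor's sketch, not the tutorial's code: chat-style SFT with trl's SFTTrainer. The one-example dataset is illustrative, and a 7B model needs a GPU plus, in practice, PEFT/QLoRA; SFTTrainer argument names vary across trl versions.)

```python
from datasets import Dataset
from trl import SFTTrainer

# A single Mistral-format chat sample, purely illustrative
chat_data = Dataset.from_dict({
    "text": ["<s>[INST] What is Argilla? [/INST] An open-source data curation platform.</s>"]
})

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # SFTTrainer can load from a model id
    train_dataset=chat_data,
    dataset_text_field="text",
    max_seq_length=512,
)
trainer.train()
```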
@argilla_io
Argilla
3 months
🤖👩‍💻Do humans prefer synthetic or human-generated prompts? If you wanna know, participate in the prompt collective event and share with your friends & teammates! With +230 participants and ~4,300 ratings already, here are some interesting results 1/7🧵
Tweet media one
4
18
70
@argilla_io
Argilla
6 months
What a week! > We launched Notus, one of the most powerful commercially usable LLMs out there. > @weights_biases, the iconic AI company, showed how they use Argilla for LLM eval. > A DPO tune of OpenHermes by our admired @Teknium1 shows great gains using our curated preferences dataset.
Tweet media one
3
10
71
@argilla_io
Argilla
4 months
Data is one of the best ways to contribute to open source AI. Two weeks ago we shared the distilabel orca pairs dataset: 13k downloads, tens of models fine-tuned with it, still trending on @huggingface. But that's not the most important part 🧵
Tweet media one
1
15
67
@argilla_io
Argilla
1 year
🔥🦜🔗 Stoked to announce our integration with @LangChainAI. Monitor and collect human feedback from LangChain apps with a few lines of code. 🚀Great job by @alvarobartt. Thanks @hwchase17 and team!
@LangChainAI
LangChain
1 year
✍️LLM feedback with @argilla_io @argilla_io is a great tool for LLM data labeling, curation, and monitoring. Use the new Argilla callbacks to easily log and provide feedback for your LangChain LLMs! Integration docs: Argilla docs:
Tweet media one
Tweet media two
Tweet media three
2
4
40
1
14
69
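(Editor's sketch of the callback wiring described above, using LangChain's ArgillaCallbackHandler. The dataset name, Space URL, and API key are placeholders; check the integration docs linked in the tweet for exact parameters.)

```python
from langchain.callbacks import ArgillaCallbackHandler
from langchain.llms import OpenAI

# Log every completion to an Argilla dataset for labeling & monitoring
argilla_callback = ArgillaCallbackHandler(
    dataset_name="langchain-feedback",      # placeholder dataset
    api_url="https://my-argilla.hf.space",  # placeholder Argilla instance
    api_key="argilla.apikey",               # placeholder credentials
)

llm = OpenAI(callbacks=[argilla_callback])
llm("Summarize what Argilla does in one sentence.")
```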
@argilla_io
Argilla
4 months
🌸 Synthetic Haiku DPO 🌸 🙌A DPO dataset by @vanstriendaniel generated with OSS models ⚗️ Built with distilabel using the awesome OpenHermes by @Teknium1 Let's dive in! 🧵
Tweet media one
1
11
65
@argilla_io
Argilla
2 years
Finding and 𝗰𝗼𝗿𝗿𝗲𝗰𝘁𝗶𝗻𝗴 label errors 1⃣ Train a text classifier, predict over the test set 2⃣ Find label errors with the built-in cleanlab integration 3⃣ Correct errors with the UI Practical tutorial by @vid_algo #nlproc #opensource #ml
Tweet media one
0
24
66
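(Editor's sketch of steps 1⃣ and 2⃣ with cleanlab 2.x: get out-of-sample predicted probabilities via cross-validation, then flag likely label errors. The toy reviews and the injected error are illustrative; step 3⃣, correcting errors, happens in the Rubrix UI.)

```python
import numpy as np
from cleanlab.filter import find_label_issues
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

texts = ["great movie", "terrible film", "loved it", "awful plot"] * 5
labels = np.array([1, 0, 1, 0] * 5)
labels[0] = 0  # inject a label error on purpose

# 1. Train a classifier and get out-of-sample probabilities for every example
X = TfidfVectorizer().fit_transform(texts)
pred_probs = cross_val_predict(LogisticRegression(), X, labels, cv=3, method="predict_proba")

# 2. Find likely label errors, worst first
issues = find_label_issues(labels=labels, pred_probs=pred_probs,
                           return_indices_ranked_by="self_confidence")
print(issues)
```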
@argilla_io
Argilla
2 years
Building a text classifier with Flyingsquid has never been easier. Flyingsquid is a label model for fast and accurate weak supervision. Weak supervision guide: Flyingsquid: Rubrix: #nlproc #opensource #ml
Tweet media one
1
16
62
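(Editor's sketch following the Flyingsquid README: fit its triplet-based LabelModel on a matrix of labeling-function votes in {-1, 0, +1}, where 0 means abstain. The vote matrix is a toy placeholder; treat the API details as assumptions.)

```python
import numpy as np
from flyingsquid.label_model import LabelModel

# Toy vote matrix: 6 examples x 3 labeling functions, votes in {-1, 0, +1}
L_train = np.array([
    [ 1,  1,  0],
    [-1, -1, -1],
    [ 1,  0,  1],
    [-1,  0, -1],
    [ 1,  1,  1],
    [ 0, -1, -1],
])

m = L_train.shape[1]        # number of labeling functions
label_model = LabelModel(m)
label_model.fit(L_train)

preds = label_model.predict(L_train)  # denoised labels in {-1, +1}
print(preds)
```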
@argilla_io
Argilla
30 days
🙋❓ Curious about the differences between DPO, KTO, ORPO, and other preference alignment algorithms? Check out a comprehensive overview in the latest blog post of our series with MantisNLP.
2
13
61
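(For reference, the DPO objective the blog-post series above starts from, in the standard form of the Rafailov et al. paper; here y_w and y_l are the chosen and rejected responses and β controls how far the policy may drift from the reference model. KTO, ORPO, and IPO are variations on this preference objective.)

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```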
@argilla_io
Argilla
4 months
🧮 Meta's LIMA, a model trained by fine-tuning LLaMa with only 1000 curated prompts and responses, demonstrates strong performance when compared to similar models with a much bigger set of prompts used for alignment. #nlproc   #llms   #python   #opensource
Tweet media one
3
12
57
@argilla_io
Argilla
2 years
NLP datasets are dynamic and heterogeneous. Rubrix Metrics let you compute live metrics & dataset attributes (e.g., entity density, entity consistency). Metrics can be analysed for different dataset slices or subpopulations. Guide: #nlproc #python #ml
Tweet media one
1
12
53
@argilla_io
Argilla
5 months
In 2024 we're going to build some awesome open datasets for the AI community. Our focus will be on instructions and preference datasets for DPO/RLHF ❓What should we build?
8
9
52
@argilla_io
Argilla
2 months
🔥 Community and Data Quality Matter More For Alignment. A recipe to replicate SPIN with 30x less data: 🗣️ 50K samples vs 1.8K prompts curated by the 350+ amazing DIBT contributors. ⚗️ Distillation of @MistralAI Large instead of OpenAI 🙌 Open data & code with ⚗️distilabel 👇
Tweet media one
1
19
53
@argilla_io
Argilla
4 months
🦦We present CapybaraHermes-2.5-Mistral-7B, a model trained with the capybara-dpo dataset, built with ⚗️ distilabel. It's a preference-tuned OpenHermes-2.5-Mistral-7B. #nlproc   #llms   #python   #opensource
Tweet media one
2
5
52
@argilla_io
Argilla
6 months
⚗️ Preference data is the key ingredient for DPO/RLHF pipelines. distilabel implements generation and labeling, focusing on scalability and SoTA methods like UltraFeedback or JudgeLM. Get started and help us shape the project: #rlhf #rlai #opensource
Tweet media one
3
14
52
@argilla_io
Argilla
4 months
🎉⚗️ distilabel 0.5.0 is out! Packed with new features and enhancements to build and improve LLM fine-tuning datasets with the latest techniques. Let's see what's inside this release! 🔥 Deita components for evolution and automatic data selection of instruction tuning data 🧵
Tweet media one
2
13
52
@argilla_io
Argilla
2 years
What is a text2text model? In simple words: a model which, given a text, returns another text. Text summarization is one such task. This is how to use SciTLDR, the summarization dataset by @allen_ai 👇 More task examples: #nlproc #python #opensource
Tweet media one
0
18
50
@argilla_io
Argilla
2 years
🚀 Introducing Rubrix Weak Labeling 🚀 ⚡️ Supercharge your NLP data annotation with interactive weak supervision 👩‍🔬 Leverage the latest weak supervision methods 👩‍🏭 Production-ready and open-source #nlproc #datacentricAI #opensource #python
0
20
49
@argilla_io
Argilla
5 months
📰 Last minute update: With 54% less data & decontaminated GSM8K training prompts, distilabeled Hermes significantly improves GSM8K results. This is the second time we've seen this with DPO: last time it was @winglian on TruthfulQA using the @argilla_io cleaned UltraFeedback.
Tweet media one
@argilla_io
Argilla
5 months
🔥 More is less for DPO, high quality matters! 📢 Dropping our first open dataset and LLM of the year: 💾Meet distilabel Orca Pairs DPO, an improved version of the now famous dataset from @intel 🏛️And a new OpenHermes model outperforming baselines with 54% less DPO pairs 🧵
Tweet media one
5
46
231
1
6
48
@argilla_io
Argilla
2 years
Do you use @FastAPI for serving NLP models? Monitor data & predictions with Rubrix. Unlock data, prediction & model monitoring for text classification and NER. Rubrix, an open-source framework for data-centric NLP 👉
Tweet media one
0
6
49
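(Editor's sketch of the monitoring pattern above, using the classic rubrix client API: rb.log with a TextClassificationRecord. The endpoint shape and dataset name are placeholders, and newer Argilla releases rename these APIs.)

```python
import rubrix as rb
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()
classifier = pipeline("sentiment-analysis")

@app.post("/predict")
def predict(text: str):
    output = classifier(text)[0]
    # Log the input and prediction to Rubrix for live monitoring & labeling
    rb.log(
        rb.TextClassificationRecord(
            inputs={"text": text},
            prediction=[(output["label"], output["score"])],
        ),
        name="sentiment-monitoring",
    )
    return output
```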
@argilla_io
Argilla
3 years
Rubrix: a Python framework for data-centric NLProc 🎉 New release 0.5.0: support for text2text tasks (text summarization, OCR post-processing & many more). Kudos @abdrahman_issam & other new contributors #python #opensource #NLProc
Tweet media one
1
18
47
@argilla_io
Argilla
4 months
🔥 Top 3 trending datasets for DPO built with ⚗️ distilabel, curated with @argilla_io Let's publish some more next week! What should we build next?
Tweet media one
1
7
46
@argilla_io
Argilla
1 year
🔥 Zero-shot then few-shot with SetFit 🔥 SetFit + Argilla + vector search is a game changer in terms of speed and the ability to go from no labels to a decent dataset and model in just a few iterations.
Tweet media one
0
9
45
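(Editor's sketch of the few-shot half of this workflow using the classic SetFitTrainer API; the zero-shot warm-up and the Argilla labeling rounds are omitted, and newer setfit releases replace SetFitTrainer with Trainer.)

```python
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# A handful of labelled examples, as you'd have after a labeling iteration
train_ds = Dataset.from_dict({
    "text": ["I love this!", "This is awful", "Fantastic work", "Never again"],
    "label": [1, 0, 1, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-MiniLM-L3-v2")
trainer = SetFitTrainer(model=model, train_dataset=train_ds)
trainer.train()

print(model.predict(["Really enjoyed it"]))
```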
@argilla_io
Argilla
1 month
Welcome Zephyr 141B to Hugging Chat🔥 You gotta love its system prompt 🤩🤩🤩🤩
Tweet media one
Tweet media two
@osanseviero
Omar Sanseviero
1 month
Welcome Zephyr 141B to Hugging Chat🔥 🎉A Mixtral-8x22B fine-tune ⚡️Super fast generation with TGI 🤗Fully open source (from the data to the UI)
Tweet media one
12
85
386
1
7
45
@argilla_io
Argilla
2 months
Can the community build impactful datasets in just 10 days? Yesterday we shared the code matching SPIN (Self-Play Fine-Tuning) with 30x less data! A big win of quality vs quantity. That was with 10K rated prompts; rate some prompts to see where 20K gets us 👇
Tweet media one
2
7
44
@argilla_io
Argilla
5 months
Yesterday, we released Notux 8x7b v1, a Mixtral 8x7B Instruct v0.1 fine-tune using a second iteration of DPO and our cleaned and binarized version of the UltraFeedback dataset. Today, we release a new space to chat with it 🚀
1
9
43
@argilla_io
Argilla
1 month
🥁 Launching a new dataset: Capybara-Preferences, built with distilabel 1.0 ⚗️! Hard at work fine-tuning Llama 3? Here's the dataset you've been waiting for. Initial results with ORPO & this dataset are 🔥 🧵What makes this dataset so special?
Tweet media one
2
13
43
@argilla_io
Argilla
4 months
🤔 HELM: a holistic framework for evaluating foundation models. Partnering with some of the most influential AI companies, several researchers from Stanford University have presented the Holistic Evaluation of Language Models (HELM) framework to improve the transparency of language models.
Tweet media one
1
12
39
@argilla_io
Argilla
1 year
🚀Training high-quality models from Argilla just got a lot easier. Use Argilla + @huggingface AutoTrain to train NLP models without a single line of code. Don't wait for your LLaMA weights, start building powerful #opensource NLP models!
1
9
42
@argilla_io
Argilla
3 months
🌀 We're taking 10k_prompts_ranked for a SPIN & the results are 🤯 What's 10k_prompts_ranked? What's SPIN? Stay tuned for another small open data win! In the meantime, discover the dataset and contribute!
1
8
42
@argilla_io
Argilla
3 months
🔂 We ran Self-Play fIne-tuNing (SPIN) on the DIBT prompt collective data, so we figured writing a blog with @mantisnlp would be an awesome way to get the community ready for yet another awesome model release that highlights the need for high-quality data.
0
10
41
@argilla_io
Argilla
5 months
The answer is yes and the result is distilabeled Hermes 2.5, a model fine-tuned on top of the amazing OpenHermes by @Teknium1. Unlike other DPO fine-tunes, it is trained with only 6K examples. It outperforms @maximelabonne's NeuralHermes with the same recipe but 54% fewer samples.
Tweet media one
1
3
40
@argilla_io
Argilla
2 months
🧙 Create an evol-instruct dataset with distilabel. In this tutorial, we develop an evol-instruct dataset by employing the approaches outlined in  and  using distilabel. #nlproc   #llms   #python   #opensource
Tweet media one
1
5
40
@argilla_io
Argilla
1 year
Always wanted to gather user feedback from your @Gradio apps for data labeling? Check this example Space: 1. Users can flag specific responses from the sentiment classifier. 2. Flagged predictions are logged into Argilla for labeling & validation 🧵 👇
1
12
38
@argilla_io
Argilla
2 years
Do you use @FastAPI for serving NLP models? Monitor data & predictions with Rubrix 👉 📓 Log input data and predictions to detect live issues. 📈 Collect labelled data to evaluate models in production. #mlops #nlp #python #opensource
Tweet media one
1
12
36
@argilla_io
Argilla
4 months
Phi-2 is now open source! Microsoft has recently changed the license of their phi-2 model from research-only to MIT, so it can now be used for commercial purposes. #nlproc   #llms   #python   #opensource
Tweet media one
2
1
38
@argilla_io
Argilla
1 year
fast few-shot learning 🫱🏾‍🫲🏼 active learning. Active learning with classy-classification and @argilla_io. You can now run this tutorial on Google Colab and Hugging Face Argilla Spaces. #nlp #opensource #python #argilla #nlproc
Tweet media one
0
11
37
@argilla_io
Argilla
1 year
🔥 Zero-shot then few-shot with SetFit 🔥 SetFit + Argilla + vector search is a game changer in terms of speed and the ability to go from no labels to a decent dataset and model in just a few iterations.
Tweet media one
1
9
37
@argilla_io
Argilla
20 days
🆕 Open replication of @cohere 's "Replacing Judges with Juries" using distilabel! 🧑‍💻 Post by @alvarobartt : 📄 Paper:
Tweet media one
1
10
36
@argilla_io
Argilla
2 months
We will say it once again: contributing open datasets is one of the most impactful ways to accelerate OSS AI. Thrilled to see the community building at this pace!
@zaynismm
Zain ul abideen
2 months
🌟 ORPO, a technique that replaces SFT+DPO/PPO, was released recently. I saw @_philschmid's post regarding it yesterday. Gave ORPO a shot with phi-2 and @argilla_io dpo-mix-7k. Model: Try out LazyORPO (Automated):
0
6
64
2
8
37
@argilla_io
Argilla
2 months
🌟 Exciting News! We've been featured in @CBinsights' prestigious top 100 AI companies list for 2024! It's an honor to be recognized among the companies driving innovation and shaping the future of AI.✨
0
10
34
@argilla_io
Argilla
3 months
Yesterday, we launched a new feature to enable sign-in with @huggingface into @argilla_io Spaces. We also launched a collective initiative to collect high-quality prompts. In just a few hours: ~80 contributors and more than 1,500 data points! Join us!
Tweet media one
1
11
34
@argilla_io
Argilla
2 years
Starting a project with little labelled data? Tutorials for few- and zero-shot classification 👇 #python #argilla #nlproc #opensource
Tweet media one
1
10
33
@argilla_io
Argilla
1 month
Did you know that Argilla and distilabel datasets have over 6 million downloads on the Hub? 🤯 Now, distilabel datasets will be even easier to identify thanks to the new icon added to the @huggingface Hub, a nice addition to yesterday's release!
Tweet media one
1
10
34
@argilla_io
Argilla
2 months
📚 Part 6 of our collab with @mantisnlp: Identity Preference Optimization (IPO). Can IPO be a valid alternative to DPO? Don't worry, we will soon cover an overview of KTO and ORPO as well 🤓
1
6
32
@argilla_io
Argilla
2 months
🚀So excited to see the progress of open-weights models these past days! At @argilla_io we build open datasets so people can build better models. Our datasets now power hundreds of open models. Here's a recap of the datasets we've shared in just a few months
Tweet media one
1
8
33
@argilla_io
Argilla
5 months
We used distilabel, our open-source AI Feedback framework, to build a preference dataset with ratings for each pair and natural language critiques. It took around 3 hours to build the full dataset and just a few lines of code. 🧵
Tweet media one
2
3
33
@argilla_io
Argilla
6 months
💨 Argilla - Notus 7B v1 is already quantized thanks to the awesome TheBloke! You can find them in the 🤗 Hub at: - GGUF - AWQ - GPTQ #nlproc #llms #data #quality #opensource
1
9
32
@argilla_io
Argilla
1 year
From few-shot to production datasets: label, train, and predict using SetFit! Iterate on your NLP tasks with the new Argilla training module. Supports SetFit, Transformers, and spaCy.
Tweet media one
2
10
32
@argilla_io
Argilla
4 months
🚀 Building open datasets is the best way to contribute to open source AI. Let's keep pushing! The best-performing 7B model on the Open LLM Leaderboard: > a merge of a model DPO'd with @argilla_io distilabel orca pairs > with an additional DPO fine-tune using the same dataset
@maximelabonne
Maxime Labonne
4 months
🐶 NeuralBeagle14-7B It's the best-performing 7B parameter model on the Open LLM Leaderboard. Remarkably, it also ranks as the 10th best-performing model overall on the Open LLM Leaderboard. In just 7B parameters! Merge + DPO = profit
Tweet media one
16
47
360
4
5
30
@argilla_io
Argilla
2 years
🎉 New release 0.13.0 🎉 🗂 Multi-label weak supervision 🤗 Read ANY dataset from the Hugging Face Hub 👥 Redesigned team workspace for team collaboration #python #opensource #nlproc
Tweet media one
1
11
29
@argilla_io
Argilla
3 months
⚗️ distilabel v0.6.0 released 🎉 > JSONOpenAILLM to get JSON responses from generation models > HF InferenceEndpoints to run Free/Pro API endpoints. If you have an HF Pro account, you can now generate and label with Mixtral! Release notes:
Tweet media one
1
7
29
@argilla_io
Argilla
1 year
Fast few-shot classification with active learning to speed up data labelling ❤️ Classy-classification few-shot learning within an active learning loop. Tutorial: #machinelearning #opensource #python #argilla #nlproc
Tweet media one
1
9
28
@argilla_io
Argilla
3 months
🚀We're thrilled by the progress we've made so far. We can reach the goal of annotating 10,000 prompts by the end of this week! Once reached, we'll publish v1 of the dataset. Do you have an account on @huggingface? It takes 5 seconds to contribute 👇
Tweet media one
2
10
28
@argilla_io
Argilla
3 months
Do you want to improve OSS LLMs for your domain or language? We can support you in running a community effort! Join our first cohort. It's fun and you get to work with amazing people from @huggingface like @vanstriendaniel and from @argilla_io like @alvarobartt 👇👇
@vanstriendaniel
Daniel van Strien
3 months
The Data is Better Together Discord channel is already engaging in discussions about some incredibly important and interesting use cases. If you're interested in contributing to the creation of better datasets, we welcome you to join us! Learn more here:
Tweet media one
0
4
15
0
8
29
@argilla_io
Argilla
4 months
🔥 distilabel orca pairs is trending on @huggingface: >1k downloads in less than a week, 17 models using it! If you're doing DPO, try it out!
Tweet media one
1
4
29
@argilla_io
Argilla
3 years
👋 Twitter I'm Rubrix, a free and open-source Python framework to explore, label, and monitor data for natural language processing. Follow this account for updates and learning materials about practical NLP #python #opensource #nlproc
0
15
28
@argilla_io
Argilla
11 months
We set up a distributed annotation effort for RLHF with our open LLM CrowdEval and the newly announced persistent storage for Hugging Face Spaces. Claim your login: Contribute here: About HF persistent storage:
Tweet media one
3
10
29
@argilla_io
Argilla
1 year
🤯 Labeling data without leaving Google Colab? 1⃣ Pick your favorite Argilla tutorial 2⃣ Use the Open in Colab button 3⃣ Deploy Argilla using @huggingface Spaces 4⃣ Use %%html to embed Argilla using the iframe shown behind the Embed this space button. 🚀 Start labeling
0
7
28
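(Editor's note on step 4⃣: in a Colab cell, the embed looks roughly like this; the Space URL is a placeholder for your own Argilla Space.)

```
%%html
<!-- Embed a deployed Argilla Space inside the notebook (placeholder URL) -->
<iframe src="https://my-org-my-argilla.hf.space" width="100%" height="600"></iframe>
```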
@argilla_io
Argilla
2 years
🗂️ Multi-label text classification weak labeling Get started with this brand new feature with this tutorial by @vid_algo #python #opensource #datacentricai
0
17
27
@argilla_io
Argilla
5 months
👩🏼‍🏫 Have you heard about Self-Instruct? This framework allows bootstrapping the ability of LLMs to generate their own instructions. You can try it out in distilabel! Argilla's GitHub:  distilabel:  #nlproc   #llms   #python   #opensource
Tweet media one
0
11
26
@argilla_io
Argilla
3 months
Do you have a @huggingface account? It takes 5 seconds to contribute to OSS AI 👉 We are running a collaborative effort to annotate prompts and improve data for future OSS LLMs. 📱You can even do it from your phone! 🗣️ Every voice counts!
Tweet media one
1
18
26
@argilla_io
Argilla
1 year
🎉 Big news! We raised $1.6 Million to Transform Data Labeling for NLP, co-led by @ZettaVentures and @CaixaCR
Tweet media one
0
4
26
@argilla_io
Argilla
1 year
🐑 Learn how to curate instruction datasets & scale up your annotation team with our new tutorial, guided by our work on the Dolly Dataset by Databricks #AI #LLMs #Opensource
Tweet media one
0
9
26
@argilla_io
Argilla
3 years
🎉 New release 0.6.0: tons of #UX improvements, early support for weak supervision with any method (Snorkel, Flyingsquid), and metrics for robust NLP. Kudos @_Enamya, @abdrahman_issam & our new contributors #python #opensource #NLProc #datacentricAI
Tweet media one
0
9
24