Sara Han

@sdiazlor

Followers
96
Following
380
Media
12
Statuses
131

DevRel & ML | 🐕🎮💪💻

Galicia
Joined February 2024
@sdiazlor
Sara Han
23 days
RT @PrunaAI: Say hello to Sara Han, the newest member of our Developer Advocacy team! With a laptop and his puppy on her lap, she'll help…
0
2
0
@sdiazlor
Sara Han
6 months
RT @davidberenstei: 🔥 Come and get those AI agents certificates! Join the cohort of 66K students:
0
3
0
@sdiazlor
Sara Han
6 months
RT @davidberenstei: 🔥 @sdiazlor Just published: How to fine-tune DeepSeek-R1-Distill-Qwen with synthetic reasoning data! DeepSeek-R1 has be…
0
3
0
@sdiazlor
Sara Han
6 months
Extra: Stuck on where to start? Just follow our latest tutorial: Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset. Check it here:
huggingface.co
0
1
6
@sdiazlor
Sara Han
6 months
✨ The Synthetic Data Generator comes with new features:
🌱 Add seed data to build your chat dataset.
🔄 Configure different models for instruction and completion.
📁 Save your dataset locally.
Test the Space or duplicate it! Check the extra ⤵️
huggingface.co
1
6
29
@sdiazlor
Sara Han
6 months
RT @davidberenstei: The RAG's in the bag! You can now use the Synthetic Data Generator with your own domain-specific seed data to generat…
0
3
0
@sdiazlor
Sara Han
6 months
Start synthesizing 🚀: ✍ Blog post:
huggingface.co
1
0
4
@sdiazlor
Sara Han
6 months
💫 Generate RAG data with the Synthetic Data Generator to improve your RAG system!
1️⃣ Generate from your documents, dataset, or dataset description.
2️⃣ Configure it.
3️⃣ Generate the synthetic dataset.
4️⃣ Fine-tune the retrieval and reranking models.
5️⃣ Build a RAG pipeline.
1
7
32
@sdiazlor
Sara Han
7 months
🙅‍♀️ No-code end-to-end example to train your model.
1️⃣ Use the Synthetic Data Generator to create your custom dataset.
2️⃣ Use AutoTrain to train your model on the generated dataset.
Check it here:
0
6
25
@sdiazlor
Sara Han
7 months
- No code required: everything can be handled through the interface.
- 100% free to use.
- Designed to create text classification and chat datasets.
- Review in Argilla and push to the Hub.
0
0
1
@sdiazlor
Sara Han
7 months
Where do I get the data from? We often need to fine-tune models for very specific scenarios. And that’s where the Synthetic Data Generator comes in! Want to see how it works? Watch this quick video and get started here:
1
0
4
@sdiazlor
Sara Han
8 months
Little by little, we're making progress! 🚀 I encourage you to contribute: just open the link, read the instructions, and start annotating ✍.
data-is-better-together-fineweb-c.hf.space
Join and contribute to the dataset glg - galego - Galician
0
1
3
@sdiazlor
Sara Han
8 months
It only takes 2 steps:
- Coordinate with your Language Lead: Or become one if it is missing:
- Read the guidelines and start annotating according to the educational value:
huggingface.co
0
0
1
@sdiazlor
Sara Han
8 months
Spanish, Filipino, Amharic, French, German, Basque, Catalan, Galician, Guarani, Telugu, Italian, Pashto, Romanian, Tamil, Urdu, Danish, and many more! All included in the FineWeb2 Community Annotation Sprint! 🔥
💫 Join to build an impactful dataset for your language!
1
2
3
@sdiazlor
Sara Han
8 months
Binarized dataset:
Blog post:
huggingface.co
0
0
0
@sdiazlor
Sara Han
8 months
Open Image Preferences released! 🚀
- Open-source dataset for text2image.
- 10K samples manually evaluated by the HF community.
- Binarized format for SFT, DPO, or ORPO.
It comes with a nice blog post explaining the steps to preprocess and generate the data and the results.
1
0
2
@sdiazlor
Sara Han
8 months
RT @HamelHusain: Argilla is pretty good now. I didn’t like it before but they’ve made massive improvements and have crossed into genuinely…
argilla.io
Argilla is a collaboration tool for AI engineers and domain experts that strive for data quality, ownership, and efficiency.
0
13
0
@sdiazlor
Sara Han
8 months
RT @cyrilzakka: Super excited to introduce Halo: A beginner's guide to DIY health tracking with wearables! 🤗✨ Using an $11 smart ring, I'l…
0
76
0
@sdiazlor
Sara Han
8 months
RT @dvilasuero: Let's go @marqo_ai team! Here's how to start using their latest open dataset for exploration, labelling, and/or curation w…
0
5
0
@sdiazlor
Sara Han
8 months
🔍 No bias here: What three key aspects would make an annotation tool ideal for you? Feel free to share your thoughts! I'll be reading your replies 🤗.
0
2
4