Andrea Soria Jimenez Profile
Andrea Soria Jimenez

@andrejanysa

Followers
164
Following
770
Media
34
Statuses
220

Software Engineer @huggingface 🤗

Bolivia
Joined September 2013
@andrejanysa
Andrea Soria Jimenez
1 month
RT @mervenoyann: Dataset Viewer for PDFs just landed on @huggingface 🤗. Check all the document datasets on the Hub 🤝
0
13
0
@andrejanysa
Andrea Soria Jimenez
1 month
📄 New on Hugging Face Hub: native PDF dataset support! You can now render PDFs directly in the Dataset Viewer, with thumbnails, in-browser previews, and full integration with datasets + pdfplumber. Perfect for document-based ML workflows.
1
4
6
@andrejanysa
Andrea Soria Jimenez
6 months
🚀 Synthetic data is revolutionizing AI & ML! DataDreamer, an open-source Python library, makes generating synthetic data seamless & integrates effortlessly with @huggingface. Easily push datasets to the Hub and share them with the community. 🔍 Learn how:
1
10
28
@andrejanysa
Andrea Soria Jimenez
6 months
RT @vanstriendaniel: You only need a few extra lines to write generated datasets directly to the @huggingface Hub.
0
5
0
@andrejanysa
Andrea Soria Jimenez
7 months
RT @lhoestq: Hugging Face is now officially in the pandas Ecosystem page 🎉 Let me know what you'd like to see next for HF + pandas https://…
0
29
0
@andrejanysa
Andrea Soria Jimenez
7 months
@huggingface 📚 Quick Tutorial:
huggingface.co
0
1
10
@andrejanysa
Andrea Soria Jimenez
7 months
Synthetic data generation has never been easier! 🎉
Generate structured output effortlessly with #fastdata and @huggingface 🚀
Steps:
1️⃣ Define your schema 📝
2️⃣ Add a generation prompt 💡
3️⃣ Input your data 🔄
4️⃣ Share it freely on Hugging Face 🌍
3
15
123
@andrejanysa
Andrea Soria Jimenez
7 months
RT @lhoestq: Damn this is cool. Semantic operations for pandas dataframes using open models from @huggingface. Brought to you by @lianapate…
0
9
0
@andrejanysa
Andrea Soria Jimenez
8 months
RT @lhoestq: 🤗 Datasets 3.2 is out! With faster Parquet streaming (up to +100% speed) and faster filtering via predicate pushdown ⚡ Exam…
0
13
0
@andrejanysa
Andrea Soria Jimenez
8 months
RT @lhoestq: Things are getting interesting 🤗✨👀
0
1
0
@andrejanysa
Andrea Soria Jimenez
8 months
RT @calebfahlgren: The amazing, new Qwen2.5-Coder 32B model can now write SQL for any @huggingface dataset ✨
0
40
0
@andrejanysa
Andrea Soria Jimenez
8 months
💡 Pro Tip: With Incremental Uploads, fastdata can automatically push updates to the Hub every N minutes, making it perfect for large-scale synthetic data projects.
0
0
2
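The "push every N minutes" idea in the tip above can be sketched in plain Python. This is not fastdata's implementation: `IncrementalUploader`, its `push` callback, and the injectable `clock` are hypothetical names chosen for illustration, and the callback stands in for whatever actually uploads to the Hub.

```python
import time


class IncrementalUploader:
    """Buffers records and hands them to `push` once the interval has elapsed.

    Hypothetical helper for illustration only; `push` stands in for a real
    upload call and is an assumption, not a fastdata or Hub API.
    """

    def __init__(self, push, interval_minutes, clock=time.monotonic):
        self._push = push
        self._interval = interval_minutes * 60  # seconds between pushes
        self._clock = clock
        self._buffer = []
        self._last_push = clock()

    def add(self, record):
        """Buffer one record; push the batch if the interval has passed."""
        self._buffer.append(record)
        if self._clock() - self._last_push >= self._interval:
            self.flush()

    def flush(self):
        """Push whatever is buffered (if anything) and reset the timer."""
        if self._buffer:
            self._push(list(self._buffer))
            self._buffer.clear()
        self._last_push = self._clock()


# Usage with a fake clock so the example does not need to wait.
pushed = []
now = [0.0]
uploader = IncrementalUploader(pushed.append, interval_minutes=5, clock=lambda: now[0])
uploader.add({"id": 1})  # buffered, interval not reached yet
now[0] = 301.0           # a little over five minutes later
uploader.add({"id": 2})  # interval reached, both records are pushed together
uploader.flush()         # nothing left to push
```

Checking the interval only when a record arrives keeps the sketch free of threads; a real uploader would more likely run on a background timer so that a quiet period still gets flushed.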
@andrejanysa
Andrea Soria Jimenez
8 months
✨ How it works:
1️⃣ Define your output schema 📜
2️⃣ Craft your data generation prompt 🛠️
3️⃣ Prepare your inputs 🎯
4️⃣ Generate and push to Hugging Face Hub directly 🚀
2
0
2
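The four steps listed above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `Translation` schema, the prompt, the inputs, and `fake_llm` (a stand-in for a real model call) are hypothetical, and none of it is fastdata's actual API.

```python
import json
from dataclasses import dataclass, asdict


# Step 1: define the output schema (hypothetical example schema).
@dataclass
class Translation:
    english: str
    spanish: str


# Step 2: craft the generation prompt; {english} is filled from each input.
PROMPT_TEMPLATE = "Translate into Spanish: {english}"

# Step 3: prepare the inputs, one dict per row to generate.
inputs = [{"english": "Hello"}, {"english": "Thank you"}]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns JSON matching the schema."""
    lookup = {"Hello": "Hola", "Thank you": "Gracias"}
    english = prompt.split(": ", 1)[1]
    return json.dumps({"english": english, "spanish": lookup[english]})


def generate(template, rows, schema, llm):
    """Step 4a: fill the prompt per input and parse each reply into the schema."""
    return [schema(**json.loads(llm(template.format(**row)))) for row in rows]


records = [asdict(r) for r in generate(PROMPT_TEMPLATE, inputs, Translation, fake_llm)]
# Step 4b (not run here): upload `records` to the Hugging Face Hub.
```

Parsing each reply into the dataclass means a response with missing or extra fields fails loudly instead of slipping into the dataset.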
@andrejanysa
Andrea Soria Jimenez
8 months
🚀 Fastdata (by @answerdotai) + @huggingface: Synthetic Data Made Simple! 🤖📊 Generate data for deep learning 📜🛠️🎯 and push it directly to Hugging Face Hub 🌐 With Incremental Uploads, fastdata handles large-scale projects effortlessly!
1
5
17
@andrejanysa
Andrea Soria Jimenez
10 months
RT @qlhoest: My new app is out!!
✨ The Common Crawl Pipeline Creator ✨
Create your pipeline easily:
✔ Run Text Extraction ✂️
✔ Define Langua…
0
25
0
@andrejanysa
Andrea Soria Jimenez
10 months
RT @SomosNLP_: 🔥 Introducing #LaLeadeboard, the first open-source leaderboard for automatically evaluating #LLM across the varieties of Spa…
0
77
0
@andrejanysa
Andrea Soria Jimenez
10 months
RT @clefourrier: There is now an LLM Leaderboard for one of the most spoken languages worldwide: Spanish! 🚀 (+ Catalan, Basque and Galician)…
huggingface.co
0
21
0
@andrejanysa
Andrea Soria Jimenez
10 months
RT @qlhoest: No one noticed this amazing new feature in @duckdb. The new read_json() + getvariable() combination lets you call #APIs in #S…
0
22
0