Recognai
@recogn_ai
Followers: 368
Following: 447
Media: 23
Statuses: 298
This account is no longer maintained. We are now Argilla; follow us at @argilla_io.
Madrid, Spain
Joined April 2016
🎉 Excited to release Selectra (Spanish Electra), a new set of models on the @huggingface Hub that are 3-5x smaller than current SOTA Spanish models while achieving competitive results. 🧵 Overview below (1/4). Thanks to @GoogleAI TPU RC for their support. #python #opensource #nlproc
🥳 We're extremely excited to announce that we're now Argilla. Please don't forget to follow us @argilla_io. There are many more exciting things coming up! Read more at: https://t.co/JiCU7iSOJb
#python #opensource #nlproc
Find text classification label errors with @CleanlabAI and correct them with @rubrixml
https://t.co/2JVbbe8xrn
#python #opensource #NLProc
rubrix: ✨ Rubrix, open-source framework for data-centric NLP. Data annotation and monitoring for enterprise NLP. Lang: Python. ⭐️ 1351. Author: @recogn_ai
#MachineLearning
https://t.co/Utorh0TXzq
github.com
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets - argilla-io/argilla
Get started with NLP using custom datasets. Create and label datasets for text classification, token classification, and text generation: https://t.co/VlMC8wkqHB
#python #opensource #datascience
Don't have a lot of time to annotate data? SetFit + Rubrix, few-shot classification with custom data 🤓 https://t.co/U1FEM5zn9j
#nlproc #datascience #opensource
⚡ New release 0.18.0
> Better token classification validation
> Delete records by id & query for better dataset management
> New tutorials!
Thanks to our community contributors @AnkushChander, Tom Aarsen, & others https://t.co/FmBpyHdGKk
#python #nlproc #opensource
SetFit: Efficient few-shot learning with Sentence Transformers. So exciting! Train robust models with very few examples, with fast training, fast inference, and results comparable to or better than LLMs and prompt-based methods. https://t.co/sbFh3EcSK3
#python #opensource #NLProc
Active learning for text classification with @rubrixml and the wonderful small-text library by @webis_de. Learn how to build a custom active learning loop and teach a 🤗 transformers model: https://t.co/iux9l0rp7M
#python #opensource #NLProc
Want to analyze prediction explanations from your Transformer models at the dataset level? A new tutorial using SHAP and Transformers Interpret! https://t.co/3wP3B0MaCK
#python #opensource #xai
humap: Hierarchical Uniform Manifold Approximation and Projection. A very cool method and library by @EstecioJunior. It reduces the visual burden when exploring clusters in large datasets and enables drill-down through hierarchical levels. https://t.co/zx0LNUCOTh
#python #opensource #umap
github.com
Hierarchical Uniform Manifold Approximation and Projection - wilsonjr/humap
Rubrix: the open-source framework for data-centric NLP. Build human-in-the-loop workflows for data annotation, monitoring, and review. https://t.co/DOrtloj95f Follow @rubrixml for updates. #python #nlp #opensource
What can we learn from model predictions vs. training data labels?
* Ambiguous examples
* (Some) wrong labels
* Model improvement patterns
A reproducible example using the @stanfordnlp sentiment treebank dataset & @rubrixml
https://t.co/fMRiIw2MB9
#python #opensource #NLProc
Weak supervision for multilabel text classification. Get instant statistics about your heuristics' coverage and precision with the @rubrixml UI. Define rules programmatically with Python. Tutorial: https://t.co/ovPVLLcyMT
#opensource #datacentricai #python
Every good model starts with good quality datasets. Iteration and collaboration are key ingredients to achieve this. Here's how you can iterate on data and models using the Hugging Face Hub. https://t.co/agDomR7bnM
#nlproc #datascience #opensource
Weak supervision for multilabel text classification? A step-by-step tutorial using @rubrixml
https://t.co/ovPVLLcyMT
#python #opensource #NLProc
rubrix.readthedocs.io
In this tutorial we use Rubrix and weak supervision to tackle two multi-label classification datasets: The first dataset is a curated version of GoEmotions, a dataset intended for multi-label emoti...
Fine-tuning a sentiment classifier starting with no labeled data with @rubrixml
https://t.co/mBFZrqA8jw Follow @rubrixml for more resources like this one. If you love NLP & open source, join our friendly community: https://t.co/O0jE08KGdX
#python #opensource #nlp #transformers
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
1️⃣ Noisy labels using Wikidata and gazetteers (distant labels)
2️⃣ Fine-tune RoBERTa for NER with distant labels
3️⃣ Self-training
https://t.co/q17JEAWcLl
https://t.co/Cp0Kfba9WN
#python #NLProc
Stanza by @stanfordnlp is powerful for NER. Want to see how well it performs with your data? 👇 https://t.co/fhSzDk90HP New to @rubrixml? https://t.co/DOrtloiBfH Join the community: https://t.co/O0jE08KGdX
#python #nlp #opensource #datascience