skrub

@skrub_data

Followers
298
Following
20
Media
26
Statuses
67

Prepping tables for machine learning

Joined April 2023
@skrub_data
skrub
20 days
RT @probabl_ai: With skore v0.10, you now have a data accessor in the EstimatorReport! It consists of a @skrub_data TableReport that allows…
0
3
0
@skrub_data
skrub
27 days
RT @probabl_ai: (Re)watch our session at @PyData Milan in March 2025 where we discussed the latest developments in the @scikit_learn ecosy…
0
7
0
@skrub_data
skrub
27 days
RT @probabl_ai: @PyData @scikit_learn @skrub_data Timeline:
0:00: Intro of PyData Milan
7:30: Presentations of speakers
9:25: What scikit-l…
0
3
0
@skrub_data
skrub
4 months
RT @probabl_ai: 🎤 Next week, our product engineer Marie Sacksick will be presenting how to extend scikit-learn with skore, but also with sk…
meetup.com
Dear PyLadies [💚](https://emojipedia.org/green-heart/)🐍 Our next **on-site** event is coming on the 20th of May featuring 𓆙 **Sarah Abderemane** from **Kraken** and **M
0
3
0
@skrub_data
skrub
5 months
RT @probabl_ai: For this recipe, you will need:
- 4 open source libraries,
- 3 vibrant colors,
- 2 enthusiastic speakers,
- 1 welcoming ho…
0
5
0
@skrub_data
skrub
5 months
RT @probabl_ai: @scikit_learn @skrub_data @glemaitre58 @MarieSacksick Thank you Luca Baggi for the invitation at PyData Milan! Check the fu…
0
2
0
@skrub_data
skrub
7 months
🎉⚡️ Release 0.5.1:
◼ Encode strings faster and better with StringEncoder!
StringEncoder applies a tf-idf vectorization followed by SVD to produce high-quality and FAST embeddings of textual and categorical features.
0
2
11
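The tf-idf-then-SVD recipe described in the 0.5.1 announcement can be sketched with plain scikit-learn. This is only an illustration of the technique, not skrub's StringEncoder implementation: the `embed_strings` helper, the character n-gram settings, and the sample strings are all assumptions made for the example.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline


def embed_strings(values, n_components=2, random_state=0):
    """Hypothetical helper: tf-idf over character n-grams, then truncated SVD."""
    pipeline = make_pipeline(
        # Character n-grams are an assumption here; they tolerate typos and variants.
        TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 4)),
        # Truncated SVD compresses the sparse tf-idf matrix into a few dense columns.
        TruncatedSVD(n_components=n_components, random_state=random_state),
    )
    return pipeline.fit_transform(values)


# Made-up sample values, for illustration only.
jobs = [
    "police officer",
    "police sergeant",
    "fire fighter",
    "firefighter II",
    "school teacher",
]
embeddings = embed_strings(jobs)
print(embeddings.shape)  # one row per string, n_components columns
```

The dense output can be fed to estimators that do not accept sparse input, which is one reason to apply SVD after the tf-idf step.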
@skrub_data
skrub
9 months
There is much more:
skrub.patch_display() adds the TableReport as the default representation for all dataframes.
skrub.column_associations checks which columns are linked.
Check out the changelog. 5/5
0
1
4
@skrub_data
skrub
9 months
Improved TableReport:
◼ tighter layout
◼ support for any script (any alphabet: حب माया) in the plots
◼ robust to outliers
It works without dependencies, in any HTML-based environment (@ProjectJupyter, @code, a simple web page). Check it out on 4/5
1
2
6
@skrub_data
skrub
9 months
Skrub can now easily drop columns with too many missing values. As always, the TableVectorizer is very handy for preparing dataframes, and it now comes with an option to drop those pesky columns.
1
1
3
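The idea of dropping columns with too many missing values can be sketched in plain pandas. This is not the TableVectorizer option itself: the `drop_mostly_missing` helper, its `max_null_fraction` threshold, and the sample data are hypothetical and serve only to show the logic.

```python
import pandas as pd


def drop_mostly_missing(df, max_null_fraction=0.5):
    """Hypothetical helper: keep columns whose share of missing values is at most the threshold."""
    # Fraction of missing entries per column (mean of a boolean mask).
    null_fraction = df.isna().mean()
    return df.loc[:, null_fraction <= max_null_fraction]


# Made-up table: "notes" is 75% missing, the other columns 25%.
df = pd.DataFrame(
    {
        "city": ["Paris", "Milan", None, "Lyon"],
        "notes": [None, None, None, "ok"],
        "price": [1.0, 2.5, 3.0, None],
    }
)
cleaned = drop_mostly_missing(df, max_null_fraction=0.5)
print(list(cleaned.columns))
```

With a threshold of 0.5, only the "notes" column is removed; the remaining columns can then be imputed or encoded as usual.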
@skrub_data
skrub
9 months
Easily combine deep learning (language models on @huggingface) for text entries with @scikit_learn gradient-boosted trees, for pipelines that predict well on dataframes of mixed types. Skrub ensures the language model is downloaded, cached, and picklable: everything for easy ops. 2/5
1
1
4
@skrub_data
skrub
9 months
🎉⚡️ Release 0.4:
◼ Easily use deep learning for text entries
◼ TableVectorizer can remove columns with too many missing values
◼ TableReport more robust and prettier
1/5
1
7
16
@skrub_data
skrub
9 months
RT @probabl_ai: Some ensemble models do not support sparse features, but there is a hashing trick (via the MinHashEncoder in skrub!) that t…
0
2
0
@skrub_data
skrub
9 months
Skrub is on bluesky 🦋. It's fun there.
1
1
4
@skrub_data
skrub
10 months
RT @MLJARofficial: @skrub_data is amazing for data exploration 😍
0
6
0
@skrub_data
skrub
10 months
Gaël is an old fart, but he shares some good ol' tricks worth watching, and skrub is in there.
@GaelVaroquaux
Gael Varoquaux 🦋
10 months
Less data wrangling, more machine learning! Watch the talk @dotConferences: 20 min of science, but entertaining (you'll tell me). 9/9
0
0
3
@skrub_data
skrub
10 months
RT @GaelVaroquaux: Less data wrangling, more machine learning! Watch the talk @dotConferences: 20 min of science, but entertaining (you'…
0
7
0
@skrub_data
skrub
10 months
RT @GaelVaroquaux: @skrub_data provides more to facilitate data wrangling. The TableReport is an interactive dataframe explorer. We're even…
0
2
0
@skrub_data
skrub
10 months
RT @GaelVaroquaux: For string columns, @skrub_data can use sub-string modeling to find latent categories. Soon, it…
0
1
0