Explore tweets tagged as #WebDatasets
@Suhail
Suhail
2 years
Alright Internet, help me figure out this puzzle with the elusive webdatasets. Why is my .txt file not showing up as a key?!
19
0
13
@promptcloud
PromptCloud
7 years
Get access to #travel, #ecommerce, #job related #webdatasets and much more at 30% discount only on #DataStock. Coupon Code: DS30 Login/Signup here: https://t.co/U1sx8HWb12 Hurry Up! Offer expires in two days. #data #internet #webdata #sale #offers RT
0
0
1
@im_ashishsinha5
DataSamurai
3 months
Webdatasets ftw, unbelievable throughput
0
0
0
@avikumart_
Avi Kumar Talaviya
3 years
Popular datasets from major organizations and across the categories like Social Media, eCommerce, Real-estate among others You can also request sample datasets relevant to your use case Try now by signing up using the link below👇 🔗 https://t.co/emRyWVWL6g
1
2
8
@actowizsolution
Actowiz Solutions
1 year
0
0
0
@Jenrola_odun
Jenrola_odun
2 months
After much sweat and stress I'm ready to accept that webdatasets is far from optimal for working with audio data.
1
0
1
@Suhail
Suhail
2 years
I will fund someone $25K to make webdatasets significantly better from docs to adding new features. Esp attacking things that drive devs crazy. You'd have to work on it for 20 hours/week. We will give you a list of things and you can make wds better for everyone in AI. DM.
8
10
109
@ChurchillMic
Michael Churchill
4 years
Still not clear to me on all of the magic under the hood, and how it relates to data sharding with libraries like webdatasets, in particular for datasets larger than local memory. Some additional guidance: https://t.co/6wVoa8jkgX https://t.co/ryWrgvnCB3
1
0
0
@dichika
石川木場郎
12 years
The Old Faithful data is available here: http://t.co/zpZ1aJreEW #TokyoR
0
0
0
@HaoliYin
Haoli Yin
1 year
If this does what I think it does then this solves one of the most annoying problems when using WebDatasets
@StasBekman
Stas Bekman
1 year
Yay! @huggingface datasets==2.20.0 added IterableDataset checkpointing support via torchdata.stateful_dataloader.StatefulDataLoader So instead of figuring out how to rewind the DL on resume, it can now be restored from a checkpoint! This is a super-useful feature: Doc:
2
0
13
@code_star
Cody Blakeney
4 years
Has anyone played around with AIStore or Webdatasets for PyTorch? I’m really tempted to convert some outdated servers to be AIStore nodes for my lab at Texas State. https://t.co/V7DvKODaRN https://t.co/z1B4WGXHpo
1
0
0
@jpclap
Jakub Piotr Cłapa
2 months
@TheZachMueller WebDatasets or something else?
1
0
0
@ChurchillMic
Michael Churchill
1 year
@vikhyatk I dunno, webdatasets, FFCV, Streaming from Mosaic, all of them were just too quirky to get right with DDP.
1
0
1
@lhoestq
Quentin Lhoest 🤗
2 years
There are plenty of cool WebDatasets on HF already: Imagenet-1K https://t.co/TGMAR80rX0 CC12M https://t.co/PgHtHLh3yq
1
0
5
@borisdayma
Boris Dayma 🖍️
2 years
@SanhEstPasMoi @huggingface At that resolution your images start getting a lot of space so it can be more difficult and effective to handle. You can solve the difficulty part with webdatasets or https://t.co/Y9RdbGwpzW but your storage/egress cost is roughly proportional to number of pixels...
1
0
1
@rom1504
Romain Beaumont
3 years
@a_e_roberts Cool but does it work for big datasets eg ones stored with webdatasets ?
1
0
5
@ChurchillMic
Michael Churchill
3 years
@karankjariwala @abhi_venigalla @MosaicML Thanks, I’ll try it out. Anyone tried it with PyTorch Lightning distributed training? Using Webdatasets it was kind of a headache
1
0
0
@ChurchillMic
Michael Churchill
3 months
@crisbodnar Why did they kill it? Something like webdatasets I thought doesn't fully address
1
0
0
@andimarafioti
Andi Marafioti
2 months
@girkosh You could easily restructure them as webdatasets 🫣
0
0
1