Explore tweets tagged as #Datasets
Exciting news for #QGIS users! The "Google Earth Engine Plugin for QGIS" is now updated with new no-code tools that allow you to download and use #EarthEngine datasets in QGIS easily. Check out my newly contributed tutorials for the latest plugin (1/n) 👇
🌎👩‍🔬 For 15+ years, biology has accumulated petabytes (million gigabytes) of 🧬 DNA sequencing data 🧬 from the far reaches of our planet. 🦠🍄🌵 Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open. https://t.co/dDBtAjfdYL
Data cleaning is where true #DataScience begins. From handling missing values to filtering, aggregating, and merging datasets: these Python commands are your go-to for making raw data analysis-ready! #Python #Pandas #DataAnalytics #MachineLearning #BigData #EDA
📚 In summer 2023, my book Causal Analysis was published with @mitpress. Just two years later😉 I’m very happy to share that the lecture slides are now freely available in both PDF and LaTeX (as zip files), along with the datasets and R/Python code: 👉 https://t.co/VfahR3aqVR
I just completed the Data Engineer in Python track on @DataCamp and built my first ETL pipeline for a retail dataset alongside!🥳 You can check out the project using this link: https://t.co/iuER47CGke If you're also transitioning into DE, let's connectttt☺️
TOP 20 Indonesian Crypto Influencers (2025). Based on the last 30 days, since @grok has a hard time processing too many datasets.
Excited to announce that the 1st paper from my postdoc is now out in @CurrentBiology. Using a large dataset of 3D-preserved fossils, we explore the diversification of jaws in early bony fishes. 1/15 https://t.co/qUteZPnrRr
Guysssss, I've finished making the Srimad Bhagavad Gita Dataset. Do use it to make something beneficial for mankind, suggest any ideas that could be implemented, and tell me if you find any mistakes I might have made. Om Namo Bhagavate Vasudevaya 🙏🙏
Today, we are releasing FineVision, a huge open-source dataset for training state-of-the-art Vision-Language Models:
> 17.3M images
> 24.3M samples
> 88.9M turns
> 9.5B answer tokens
Here are my favourite findings:
Introducing MultiCaRe, open-source, multimodal clinical case datasets on @HuggingFace by @OpenMed_AI Community. Public and ready for load_dataset.
Images: 160K+ figures/subimages
Cases: 85K de-identified narratives + demographics
Articles: 85K metadata + abstracts
🧵 (1/7)
Free playlist of 23 hands-on Python Pandas project tutorials including e-commerce analysis, movie datasets, health data, and building web apps with Streamlit. Perfect for building a strong data analysis portfolio with real-world case studies.
Fuck it. Today, we open source FineVision: the finest curation of datasets for VLMs, over 200 sources!
> 20% improvement across 10 benchmarks
> 17M unique images
> 10B answer tokens
> New capabilities: GUI navigation, pointing, counting
FineVision 10x’s open-source VLMs.
Geospatial traffic data is incredibly tough to find. To make things easier for you, I've compiled a comprehensive list of traffic and mobility datasets:
🚨🇲🇽 Alleged Sale of 23,000 Mexican Credit Card Records
A known threat actor has allegedly listed a dataset of 23,000 credit card records from Mexico, advertised with ~70% validity.
📌 Key Details
• Threat Actor: Mexicnon
• Network: Dark Web
• Format: Fullz (CC, Exp, CVV,
🚨 Real footage showing AI companies trying to remove personal data from the AI training dataset to avoid GDPR compliance. Watch:
A Chinese-led international team has developed EyeFM, an AI system trained on 14.5 million ocular images and paired clinical texts from global, multi-ethnic datasets. As the world's first multimodal vision–language eye imaging foundation model, it has demonstrated how AI can soon