Explore tweets tagged as #Data_Curation
🎉 Excited to launch this challenge! 🎉 Over a year of data collection, curation, and annotation that we undertook to produce a first-of-its-kind dataset. Help us build speech models that understand 2-5 year olds. $120k in prizes and huge impact! https://t.co/8s7SAEXtZ5
0
1
1
In our recent fireside chat, John McDonald of Teleflex highlights how collaboration across data, analytics, and business stakeholders builds stronger, regulator-ready evidence. Quality data curation = regulatory strength.
0
0
0
GM and happy New Year's Eve, friends! Today, most maps are static. They are snapshots of past moments. We need a living map. I checked out the dagama_world Post & Earn model; it rewards constant curation. It keeps the data fresh. It is a breathing ecosystem we can thrive with!
3
14
14
Last week was my last day at Hugging Face 🤗. I’m leaving with a heart full of gratitude for the friends and colleagues I’ve met here ❤️ I’m incredibly proud of what we built together: • FineWeb: The dataset that sparked a new wave of interest in data curation •
7
1
70
Time to propel open LLM training data curation to the next level. Releasing propella-1: small multilingual LLMs that annotate text documents for dataset curation at scale. 🧵👇
1
1
8
At the ‘Conclave on AI for Science’, Prof. Ajit Kembhavi of the Inter-University Centre for Astronomy and Astrophysics shared his perspective on how humankind’s approaches to data curation and analysis have evolved since the 1980s, with AI emerging as the latest frontier of
0
2
6
Great data curation isn't just for training! We @datologyai just released DatBench, a refined VLM eval suite with a simple motivation: VLM evals are broken. VLM evals are noisy, often measure the wrong thing, and expensive, often consuming ~20% of train compute. No longer!
5
17
119
Fake reviews and influencer bias have broken global discovery. @dagama_world reverses that by tying every recommendation to verifiable actions, not clout. A community-governed curation layer rewards accuracy, and as data compounds, the map self-corrects.
0
23
24
Machine-assisted curation requires guardrails to prevent feedback loops from reinforcing biased data. Separation between training signals and reward distribution allows @dagama_world to preserve model neutrality over time. $DGMA
0
30
33
Discovering ionic liquids beyond the known: conditional generation meets clever data curation Ionic liquids are salts that melt below 100°C—sometimes well below room temperature. This simple property unlocks remarkable applications: greener solvents, better batteries, CO₂
2
5
30
In the current digital economy, your location data is a silent commodity sold to the highest bidder, while your favorite spots are buried under paid ads. @dagama_world is shifting this power dynamic. By merging blockchain transparency with AI curation, we are reclaiming the map
0
17
19
📄 New paper published in Microbial Genomics: The MPox Contextual Data Specification Package — a data curation toolkit supporting collaborative pathogen genomic surveillance and consistent contextual metadata. 🔗 Read here: https://t.co/J1S1hzS3H2
#Mpox #GenomicSurveillance
0
0
5
Dominant location platforms operate as natural monopolies on data exhaust, distorting allocation in real-world commerce through opaque curation. @dagama_world's current Season 2 architecture on Galxe fragments this monopoly by making presence data composable and
0
14
15
What Matters in Data Curation for Multimodal Reasoning? Insights from the DCVLR Challenge Yosub Shin, Michael Buriek, Boris Sobolev, Pavel Bushuyeu, Vikas Kumar, Haoyang Xu, Samuel Watson, Igor Molybog https://t.co/iUvwfb7AoP [cs.AI]
0
0
0
top 5 ai narratives in 2026 1 prediction markets > ai reality probabilities outcomes 2 ai agents > autonomous execution trading defi 3 ai infra > compute inference data layers 4 ai infofi/socialfi > attention signals curation trust 5 ai desci > science with ai reaches I’m
14
3
32
ToolGym: an Open-world Tool-using Environment for Scalable Agent Testing and Data Curation Ziqiao Xi, Shuang Liang, Qi Liu, Jiaqing Zhang, Letian Peng, Fang Nan, Meshal Nayim, Tianhui Zhang, Rishika Mundada, Lianhui Qin, Biwei Huang, Kun Zhou https://t.co/pxiy2PdmOP [cs.AI]
0
0
0
Generative AI not only relies on scraping high-quality copyrighted works, it also relies on data curation to reformat, augment, or label the data at scale. That requires a lot of human labor, which AI companies are exploiting, with potential human rights violations.
@ChombaBupe Are the traumatized workers in the room with you? So convenient that things are happening, but they’re invisible, so we have to take your word for it, and we can’t question your motives.
33
709
2K