Explore tweets tagged as #Data_Curation
@pjbull
Peter Bull
2 hours
🎉 Excited to launch this challenge! 🎉 Over a year of data collection, curation, and annotation that we undertook to produce a first-of-its-kind dataset. Help us build speech models that understand 2-5 year olds. $120k in prizes and huge impact! https://t.co/8s7SAEXtZ5
0
1
1
@caresyntax
Caresyntax
23 days
In our recent fireside chat, John McDonald of Teleflex highlights how collaboration across data, analytics, and business stakeholders builds stronger, regulator-ready evidence. Quality data curation = regulatory strength.
0
0
0
@Dreytyy
Reeyhs
1 month
Gm and happy New Year's Eve, friends. Today, most maps are static. They are snapshots of past moments. We need a living map. I checked out the dagama_world Post & Earn model; it rewards constant curation. It keeps the data fresh. It is a breathing ecosystem we can thrive with!
3
14
14
@HKydlicek
Hynek Kydlíček
13 days
Last week was my last day at Hugging Face 🤗. I’m leaving with a heart full of gratitude for the friends and colleagues I’ve met here ❤️ I’m incredibly proud of what we built together: • FineWeb: The dataset that sparked a new wave of interest in data curation •
7
1
70
@maxidahl
Max Idahl
21 days
Time to propel open LLM training data curation to the next level. Releasing propella-1: small multilingual LLMs that annotate text documents for dataset curation at scale. 🧵👇
1
1
8
@_DigitalIndia
Digital India
26 days
At the ‘Conclave on AI for Science’, Prof. Ajit Kembhavi of the Inter-University Centre for Astronomy and Astrophysics shared his perspective on how humankind’s approaches to data curation and analysis have evolved since the 1980s, with AI emerging as the latest frontier of
0
2
6
@DagVaultRWA
DagVault
8 days
📢 Announcement: Launching DAG Industrial. The mission: Documenting #Kaspa as the backbone of the 4IR. Our method: Human curation + Synthetic synthesis via NotebookLM. High-density data. Zero filler. The future of industrial media is here. 🎙️🤖 #KAS #BlockDAG #AI #4IR
3
19
82
@arimorcos
Ari Morcos
29 days
Great data curation isn't just for training! We @datologyai just released DatBench, a refined VLM eval suite with a simple motivation: VLM evals are broken. VLM evals are noisy, often measure the wrong thing, and expensive, often consuming ~20% of train compute. No longer!
5
17
119
@MaVoFree
Mavo
2 months
Fake reviews and influencer bias have broken global discovery. @dagama_world reverses that by tying every recommendation to verifiable actions, not clout. A community-governed curation layer rewards accuracy and as data compounds, the map self-corrects.
0
23
24
@cfcornerstone7
KING DAVID
1 month
Machine-assisted curation requires guardrails to prevent feedback loops from reinforcing biased data. Separation between training signals and reward distribution allows @dagama_world to preserve model neutrality over time. $DGMA
0
30
33
@bravo_abad
Jorge Bravo Abad
23 days
Discovering ionic liquids beyond the known: conditional generation meets clever data curation Ionic liquids are salts that melt below 100°C—sometimes well below room temperature. This simple property unlocks remarkable applications: greener solvents, better batteries, CO₂
2
5
30
@REMTECH_002
Re〽️tech
1 month
In the current digital economy, your location data is a silent commodity sold to the highest bidder, while your favorite spots are buried under paid ads. @dagama_world is shifting this power dynamic. By merging blockchain transparency with AI curation, we are reclaiming the map
0
17
19
@pha4ge
Public Health Alliance for Genomic Epidemiology
14 hours
📄 New paper published in Microbial Genomics: The MPox Contextual Data Specification Package — a data curation toolkit supporting collaborative pathogen genomic surveillance and consistent contextual metadata. 🔗 Read here: https://t.co/J1S1hzS3H2 #Mpox #GenomicSurveillance
0
0
5
@queen_cryp24988
DebCryptoQueen 👑
1 month
Dominant location platforms operate as natural monopolies on data exhaust, distorting allocation in real-world commerce through opaque curation. @dagama_world's current Season 2 architecture on Galxe fragments this monopoly by making presence data composable and
0
14
15
@SciFi
AI Papers
16 days
What Matters in Data Curation for Multimodal Reasoning? Insights from the DCVLR Challenge Yosub Shin, Michael Buriek, Boris Sobolev, Pavel Bushuyeu, Vikas Kumar, Haoyang Xu, Samuel Watson, Igor Molybog https://t.co/iUvwfb7AoP [𝚌𝚜.𝙰𝙸]
0
0
0
@KKrucenok
Kiryl
1 month
top 5 ai narratives in 2026 1 prediction markets > ai reality probabilities outcomes 2 ai agents > autonomous execution trading defi 3 ai infra > compute inference data layers 4 ai infofi/socialfi > attention signals curation trust 5 ai desci > science with ai reaches I’m
14
3
32
@SciFi
AI Papers
22 days
ToolGym: an Open-world Tool-using Environment for Scalable Agent Testing and Data Curation Ziqiao Xi, Shuang Liang, Qi Liu, Jiaqing Zhang, Letian Peng, Fang Nan, Meshal Nayim, Tianhui Zhang, Rishika Mundada, Lianhui Qin, Biwei Huang, Kun Zhou https://t.co/pxiy2PdmOP [𝚌𝚜.𝙰𝙸]
0
0
0
@ChombaBupe
Chomba Bupe
2 months
Generative AI not only relies on scraping high-quality copyrighted works, it also relies on data curation to reformat, augment, or label the data at scale. That requires a lot of human labor, which AI companies are exploiting with potential human rights violations.
@JeremyTColes
Jeremy Coles
2 months
@ChombaBupe Are the traumatized workers in the room with you? So convenient that things are happening, but they’re invisible, so we have to take your word for it, and we can’t question your motives.
33
709
2K
@realsanketp
sanket patel
1 month
Blogged: Is Data Curation the New Feature Engineering?
2
1
2