Hamidah Oderinwale

@didaoh

Followers: 669
Following: 4K
Media: 20
Statuses: 220

occasionally fails captchas. @vana @ifp @reboot_hq.

montreal
Joined March 2023
@didaoh
Hamidah Oderinwale
4 months
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
8
32
221
@didaoh
Hamidah Oderinwale
14 days
We’ll be presenting this paper at the NeurIPS RegML workshop! Looking forward to meeting people in town this week
@didaoh
Hamidah Oderinwale
4 months
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
0
1
10
@sighjith
Saeejith Nair
1 month
I built https://t.co/VJ7oHdvf7s, a new interface for arXiv.
As we enter an era of accelerated scientific discovery, we need better tools that augment human cognition to help us keep up.
Try it: visit papiers ai or swap arxiv -> papiers on any paper URL
@karpathy
Andrej Karpathy
5 months
I often rant about how 99% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human? It’s definitely not a pdf. There is huge space for an extremely valuable “research app” that figures this out.
105
215
2K
@taoburr
Tao Burga
1 month
How can we incentivize AI for national priorities? @sebkrier and @zhengdongwang explain how commissioning the creation of benchmarks and evaluations can steer AI development toward important outcomes that will otherwise be neglected. Benchmarks can have a large impact: When
@taoburr
Tao Burga
1 month
🚀The Launch Sequence book debut is in 11 days! Start the countdown: every day until then, I’ll post at least one short summary on each of the ideas in the book. Then we’ll start shipping the books to Congress. More details in-thread (1/3).
2
14
52
@annakaz
Anna Kazlauskas
2 months
New paper: A Research Agenda for the Economics of AI Training Data
We explore:
- Why data is hard to price (nonrival, heterogeneous)
- How markets evolved for previous assets
- Recent AI training data deals
- Data as distinct from labor or capital in the production function
@open_data_labs
Open Data Labs
2 months
Our new paper written by @didaoh and @annakaz outlines research priorities for AI data economics.
10
6
46
@didaoh
Hamidah Oderinwale
2 months
5/ Blog post: https://t.co/HtFlQ2uXvM
Full paper: https://t.co/ICArYm71Bw
We’re excited to tackle these questions from both engineering and theory. Reach out if you're exploring similar ideas!
arxiv.org
Despite data's central role in AI production, it remains the least understood input. As AI labs exhaust public data and turn to proprietary sources, with deals reaching hundreds of millions of...
0
0
5
@didaoh
Hamidah Oderinwale
2 months
4/ We can already see precursors to standardization such as model cards, dataset documentation, and provenance. Turning those into the basis for exchange will require shared metrics of value, reliability, and contribution.
1
0
4
@didaoh
Hamidah Oderinwale
2 months
3/ Across these units, we observe five mechanisms in today’s landscape: Per-unit licensing, aggregate access deals, service-based pricing, commissioning, and open commons. Each structure values differently, and all fall short of full value capture. Designing new mechanisms to
1
0
2
@didaoh
Hamidah Oderinwale
2 months
2/ Markets emerged once grading, verification, and pricing systems made assets tradable. We start by defining what’s actually being traded. From tokens to datasets to corpora, each represents a different level of composition, control, and pricing logic. This hierarchy gives
1
0
2
@didaoh
Hamidah Oderinwale
2 months
1/ Grain, oil, and equity began as messy local exchanges before standards were set. We’ve yet to see this for training data. Today’s data deals remain opaque and ad hoc, from News Corp with OpenAI ($250M+), to Reddit with Google ($60M per year), to HarperCollins with Microsoft
@open_data_labs
Open Data Labs
2 months
Our new paper written by @didaoh and @annakaz outlines research priorities for AI data economics.
6
4
24
@buxwal
Violet
2 months
1/ OPT OBSERVATORY
I’ve spent the past year creating *the most in-depth public resource* on how the US retains international students after they graduate. Today, @IFP is releasing never-before-seen data we obtained from ICE via FOIA. Check it out: https://t.co/La9FD8zN2j
21
99
356
@annakaz
Anna Kazlauskas
3 months
The first self-serve platform for user-owned data! Vana Playground is live. Explore structured datasets before running privacy-preserving jobs, accessing data that's usually locked behind walled gardens
@vana
vana
3 months
Introducing Vana Playground. A self-serve way to explore Vana's datasets. From the beginning, we’ve been laser-focused on building valuable datasets and commercializing them through our networks. This is the evolution: allowing anyone to see and use the data on Vana.
21
10
80
@didaoh
Hamidah Oderinwale
4 months
Thank you to @cosmos_inst and @TheFIREorg for their support! Looking forward to working on research tooling and more in the months ahead, started at @southpkcommons :)
@mbrendan1
Brendan McCord 🏛️ x 🤖
4 months
Announcing the first cohort of AI x Truth-Seeking grant recipients: Proud to build this partnership between @cosmos_inst and @TheFIREorg. From 300+ strong applications, we chose 27 builders to pilot approaches for AI to strengthen open inquiry and intellectual freedom. Our
2
2
31
@vincentweisser
Vincent Weisser
4 months
Super excited to power this $1m truth seeking AI grant initiative by @cosmos_inst and @TheFIREorg with @primeintellect compute - apply now for the next cohort 🫡
@mbrendan1
Brendan McCord 🏛️ x 🤖
4 months
Announcing the first cohort of AI x Truth-Seeking grant recipients: Proud to build this partnership between @cosmos_inst and @TheFIREorg. From 300+ strong applications, we chose 27 builders to pilot approaches for AI to strengthen open inquiry and intellectual freedom. Our
4
11
85
@wissam_antoun
Wissam Antoun
4 months
Surprised to see our (@fadybaly) Arabic BERT model from 4 years ago among the top 10 most fine-tuned models on the @huggingface hub. It now has ~9M total downloads, with ~600K monthly. Thread/Paper: https://t.co/5CNnj2fUWd
@ClementDelangue
clem 🤗
4 months
Fun to think about open-source models and their variants as families from an evolutionary biology standpoint and analyze "genetic similarity and mutation of traits over model families". These are the 2,500th, 250th, 50th and 25th largest families on @huggingface:
3
6
23
@ShayneRedford
Shayne Longpre
4 months
Really incredible work by @BenDLaufer and @didaoh, understanding ecosystem-wide shifts in AI! These model relationship graphs remind me of the social media network analysis field. There are so many evolving, branching uses of AI systems most ppl don't realize.
@didaoh
Hamidah Oderinwale
4 months
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
2
3
17
@BenDLaufer
Benjamin Laufer
4 months
1/10. In a new paper with @didaoh and Jon Kleinberg, we mapped the family trees of 1.86 million AI models on Hugging Face — the largest open-model ecosystem in the world. AI evolution looks kind of like biology, but with some strange twists. 🧬🤖
4
9
51
@didaoh
Hamidah Oderinwale
4 months
Excited to see this out! Great thread from Ben on the dataset and the ecological analogies he developed for this project :)
@BenDLaufer
Benjamin Laufer
4 months
1/10. In a new paper with @didaoh and Jon Kleinberg, we mapped the family trees of 1.86 million AI models on Hugging Face — the largest open-model ecosystem in the world. AI evolution looks kind of like biology, but with some strange twists. 🧬🤖
0
1
7
@ClementDelangue
clem 🤗
4 months
Fun to think about open-source models and their variants as families from an evolutionary biology standpoint and analyze "genetic similarity and mutation of traits over model families". These are the 2,500th, 250th, 50th and 25th largest families on @huggingface:
@didaoh
Hamidah Oderinwale
4 months
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
10
19
107
@gopalkraman
Gopal
4 months
super interesting research by SPC member @didaoh and @BenDLaufer
@didaoh
Hamidah Oderinwale
4 months
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
0
2
4