Hamidah Oderinwale @didaoh X Profile

Hamidah Oderinwale

@didaoh

Followers

669

Following

4K

Media

20

Statuses

220

occasionally fails captchas. @vana @ifp @reboot_hq.

https://t.co/2lObROLgSC

montreal

Joined March 2023

Hamidah Oderinwale

@didaoh

4 months

1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

8

32

221

Hamidah Oderinwale

@didaoh

14 days

We’ll be presenting this paper at the NeurIPS RegML workshop! Looking forward to meeting people in town this week

Hamidah Oderinwale

@didaoh

4 months

1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

0

1

10

Saeejith Nair

@sighjith

1 month

I built https://t.co/VJ7oHdvf7s, a new interface for arXiv. As we enter an era of accelerated scientific discovery, we need better tools that augment human cognition to help us keep up. Try it: visit papiers ai or swap arxiv -> papiers on any paper URL.

Andrej Karpathy

@karpathy

5 months

I often rant about how 99% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human? It’s definitely not a pdf. There is huge space for an extremely valuable “research app” that figures this out.

105

215

2K

Tao Burga

@taoburr

1 month

How can we incentivize AI for national priorities? @sebkrier and @zhengdongwang explain how commissioning the creation of benchmarks and evaluations can steer AI development toward important outcomes that will otherwise be neglected. Benchmarks can have a large impact: When

Tao Burga

@taoburr

1 month

🚀The Launch Sequence book debut is in 11 days! Start the countdown: every day until then, I’ll post at least one short summary on each of the ideas in the book. Then we’ll start shipping the books to Congress. More details in-thread (1/3).

2

14

52

Anna Kazlauskas

@annakaz

2 months

New paper: A Research Agenda for the Economics of AI Training Data
We explore:
- Why data is hard to price (nonrival, heterogeneous)
- How markets evolved for previous assets
- Recent AI training data deals
- Data as distinct from labor or capital in the production function

Open Data Labs

@open_data_labs

2 months

Our new paper written by @didaoh and @annakaz outlines research priorities for AI data economics.

10

6

46

Hamidah Oderinwale

@didaoh

2 months

5/ Blog post: https://t.co/HtFlQ2uXvM
Full paper: https://t.co/ICArYm71Bw
We’re excited to tackle these questions from both engineering and theory. Reach out if you're exploring similar ideas!

arxiv.org

Despite data's central role in AI production, it remains the least understood input. As AI labs exhaust public data and turn to proprietary sources, with deals reaching hundreds of millions of...

0

5

Hamidah Oderinwale

@didaoh

2 months

4/ We can already see precursors to standardization such as model cards, dataset documentation, and provenance. Turning those into the basis for exchange will require shared metrics of value, reliability, and contribution.

1

0

4

Hamidah Oderinwale

@didaoh

2 months

3/ Across these units, we observe five mechanisms in today’s landscape: Per-unit licensing, aggregate access deals, service-based pricing, commissioning, and open commons. Each structure values differently, and all fall short of full value capture. Designing new mechanisms to

1

0

2

Hamidah Oderinwale

@didaoh

2 months

2/ Markets emerged once grading, verification, and pricing systems made assets tradable. We start by defining what’s actually being traded. From tokens to datasets to corpora, each represents a different level of composition, control, and pricing logic. This hierarchy gives

1

0

2

Hamidah Oderinwale

@didaoh

2 months

1/ Grain, oil, and equity began as messy local exchanges before standards were set. We’ve yet to see this for training data. Today’s data deals remain opaque and ad hoc, from News Corp with OpenAI ($250M+), to Reddit with Google ($60M per year), to HarperCollins with Microsoft

Open Data Labs

@open_data_labs

2 months

Our new paper written by @didaoh and @annakaz outlines research priorities for AI data economics.

6

4

24

Violet

@buxwal

2 months

1/ OPT OBSERVATORY I’ve spent the past year creating *the most in-depth public resource* on how the US retains international students after they graduate. Today, @IFP is releasing never-before-seen data we obtained from ICE via FOIA. Check it out: https://t.co/La9FD8zN2j

21

99

356

Anna Kazlauskas

@annakaz

3 months

The first self-serve platform for user-owned data! Vana Playground is live. Explore structured datasets before running privacy-preserving jobs, accessing data that's usually locked behind walled gardens

vana

@vana

3 months

Introducing Vana Playground. A self-serve way to explore Vana's datasets. From the beginning, we’ve been laser-focused on building valuable datasets and commercializing them through our networks. This is the evolution: allowing anyone to see and use the data on Vana.

21

10

80

Hamidah Oderinwale

@didaoh

4 months

Thank you to @cosmos_inst and @TheFIREorg for their support! Looking forward to working on research tooling and more in the months ahead, started at @southpkcommons :)

Brendan McCord 🏛️ x 🤖

@mbrendan1

4 months

Announcing the first cohort of AI x Truth-Seeking grant recipients: Proud to build this partnership between @cosmos_inst and @TheFIREorg. From 300+ strong applications, we chose 27 builders to pilot approaches for AI to strengthen open inquiry and intellectual freedom. Our

2

31

Vincent Weisser

@vincentweisser

4 months

Super excited to power this $1m truth seeking AI grant initiative by @cosmos_inst and @TheFIREorg with @primeintellect compute - apply now for the next cohort 🫡

Brendan McCord 🏛️ x 🤖

@mbrendan1

4 months

Announcing the first cohort of AI x Truth-Seeking grant recipients: Proud to build this partnership between @cosmos_inst and @TheFIREorg. From 300+ strong applications, we chose 27 builders to pilot approaches for AI to strengthen open inquiry and intellectual freedom. Our

4

11

85

Wissam Antoun

@wissam_antoun

4 months

Surprised to see our (@fadybaly) Arabic BERT model from 4 years ago among the top 10 most fine-tuned models on the @huggingface hub. It now has ~9M total downloads, with ~600K monthly. Thread/Paper: https://t.co/5CNnj2fUWd

clem 🤗

@ClementDelangue

4 months

Fun to think about open-source models and their variants as families from an evolutionary biology standpoint and analyze "genetic similarity and mutation of traits over model families". These are the 2,500th, 250th, 50th and 25th largest families on @huggingface:

3

6

23

Shayne Longpre

@ShayneRedford

4 months

Really incredible work by @BenDLaufer and @didaoh, understanding ecosystem-wide shifts in AI! These model relationship graphs remind me of the social media network analysis field. There are so many evolving, branching uses of AI systems most ppl don't realize.

Hamidah Oderinwale

@didaoh

4 months

1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

2

3

17

Benjamin Laufer

@BenDLaufer

4 months

1/10. In a new paper with @didaoh and Jon Kleinberg, we mapped the family trees of 1.86 million AI models on Hugging Face — the largest open-model ecosystem in the world. AI evolution looks kind of like biology, but with some strange twists. 🧬🤖

4

9

51

Hamidah Oderinwale

@didaoh

4 months

Excited to see this out! Great thread from Ben on the dataset and the ecological analogies he developed for this project :)

Benjamin Laufer

@BenDLaufer

4 months

1/10. In a new paper with @didaoh and Jon Kleinberg, we mapped the family trees of 1.86 million AI models on Hugging Face — the largest open-model ecosystem in the world. AI evolution looks kind of like biology, but with some strange twists. 🧬🤖

0

1

7

clem 🤗

@ClementDelangue

4 months

Fun to think about open-source models and their variants as families from an evolutionary biology standpoint and analyze "genetic similarity and mutation of traits over model families". These are the 2,500th, 250th, 50th and 25th largest families on @huggingface:

Hamidah Oderinwale

@didaoh

4 months

1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

10

19

107

Gopal

@gopalkraman

4 months

super interesting research by SPC member @didaoh and @BenDLaufer

Hamidah Oderinwale

@didaoh

4 months

1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

0

2

4