racheldias Profile
racheldias

@rachelds__

Followers
36
Following
57
Media
0
Statuses
44

Joined May 2024
@emollick
Ethan Mollick
2 months
After reading it, this does seem like a big deal. Industry experts outlined important, real-world, hard tasks for AI to do. Other experts were asked to do the tasks themselves & yet others graded human & AI output. Models approached parity with humans & AI is getting better fast.
36
138
1K
@patrickrchao
Patrick Chao
2 months
This is one of the craziest graphs I've ever seen! AI models went from dragging humans down (gpt-4o) to breaking past the human baseline. gpt-5 delivers ~1.6× efficiency in both speed and cost 📈
@OpenAI
OpenAI
2 months
Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. https://t.co/uKPPDldVNS
0
1
7
@sama
Sam Altman
2 months
very important work on a new eval
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
295
219
2K
@tejalpatwardhan
Tejal Patwardhan
2 months
@AchyutaBot wake up chat new eval just dropped
1
1
4
@gracejkim9
Grace Kim
2 months
it’s wild how incredible olivia is tbh 👀
@OliviaGWatkins2
Olivia Grace Watkins
2 months
It’s wild how much people’s AI progress forecasts differ even a few years out. We need hard, realistic evals to bridge the gap with concrete evidence and measurable trends. Excited to share GDPval, an eval measuring performance on real, economically valuable white-collar tasks!
3
2
15
@tejalpatwardhan
Tejal Patwardhan
2 months
@simonpfish when he accidentally downloaded 1K GDPval tasks onto his local machine last night. he built an entire eval site for you! evals [dot] openai [dot] com
5
4
120
@OliviaGWatkins2
Olivia Grace Watkins
2 months
It’s wild how much people’s AI progress forecasts differ even a few years out. We need hard, realistic evals to bridge the gap with concrete evidence and measurable trends. Excited to share GDPval, an eval measuring performance on real, economically valuable white-collar tasks!
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
3
4
22
@tejalpatwardhan
Tejal Patwardhan
2 months
this plot is wild right??
@tejalpatwardhan
Tejal Patwardhan
2 months
We also find that, when paired with human oversight, models have the potential to complete work tasks much faster and cheaper than humans alone.
16
14
253
@kliu128
Kevin Liu
2 months
GDPval tests models on 1,320 well-specified tasks from 44 real knowledge-work occupations, written by experts with an average of 14 years of experience in their field.
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
1
2
12
@phoebethacker
Phoebe Thacker
2 months
Super excited to share what the team has been cooking... Introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. https://t.co/8MTK5bLJVX
openai.com
We’re introducing GDPval, a new evaluation that measures model performance on economically valuable, real-world tasks across 44 occupations.
1
1
22
@michelelwang
Michele Wang
2 months
@gracejkim9 DREAM TEAMMMMM!!! ❣️❣️❣️
0
1
2
@gracejkim9
Grace Kim
2 months
@michelelwang DREAM FREAKING TEAM ‼️‼️‼️
1
1
4
@rachelds__
racheldias
2 months
Dare I quote @gracejkim9 and say that this is in fact HUUUGGGEEE for the program! Dream team doing important work !
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
2
1
11
@michelelwang
Michele Wang
2 months
so excited for GDPval 🚀 our team's first eval measuring frontier models not just on raw intelligence, but on their ability to deliver real professional work across 44 jobs: covering Excel spreadsheets, docs, PDFs, audio and video files, CAD, and more!
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
4
8
47
@gracejkim9
Grace Kim
2 months
our team’s home on the internet!!
@simonpfish
Simón
2 months
Go to https://t.co/ggydl5W0C1 and see all the awesome work our team has been getting out there!
0
1
22
@michelelwang
Michele Wang
2 months
@gracejkim9 DREAM TEAM!!!!!
0
1
5
@gracejkim9
Grace Kim
2 months
so excited to release gdpval today with the most incredible team!!
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
3
2
19
@simonpfish
Simón
2 months
Go to https://t.co/ggydl5W0C1 and see all the awesome work our team has been getting out there!
7
11
136
@kevinweil
Kevin Weil 🇺🇸
2 months
@tejalpatwardhan See @tejalpatwardhan's post for a great in-depth look at GDPval:
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
0
1
23
@kevinweil
Kevin Weil 🇺🇸
2 months
💥 Announcing GDPval, a new eval that measures model performance on economically valuable, real-world tasks across 44 occupations.
9
18
366