Prolific
@Prolific
Followers: 13K
Following: 7K
Media: 831
Statuses: 8K
The ultimate human data platform to power world-changing AI and research. For help 👉 https://t.co/VhihEF8hXx / https://t.co/SsP4j9VdBR
London | New York City
Joined April 2014
Introducing HUMAINE: the LLM benchmark that puts real human experience first 🎯 21,352 human evaluators. 27 models. 22 demographic groups. 5 evaluation dimensions. In partnership with @huggingface. See insights below 🧵
We discussed: → What “good” human data looks like and how participant expertise shapes it → Behaviors we still can’t evaluate well with current rater pools → An ideal eval stack across sourcing, task design, QC, BPOs, and tooling And more. Thanks to all who joined us!
Last week we hosted an exclusive London dinner for leaders building the future of AI 🇬🇧 With experts from @AnthropicAI, @Salesforce, @awscloud and others, we explored how to align models with our intentions and design effective eval workflows. CEO @Phelimb opened the event.
AI at the Thanksgiving dinner? Here’s what the US public think 🦃 We ran a themed survey asking about AI in family traditions, part of which we used to demo rapid human data collection via Prolific at @DeepLearningAI's #AIDev25. @ana_in_sf talks through results. Happy holidays!
Leading human-aligned models right now: → Gemini-3-Pro, @GoogleDeepMind → Gemini-2.5-Pro, @GoogleDeepMind → DeepSeek-V3-0324, @deepseek_ai → Magistral-Medium-2506, @MistralAI Follow on @huggingface 📊 https://t.co/x1wLUxWuE0
huggingface.co
🔝 Gemini 3 Pro by @GoogleDeepMind has surpassed ALL frontier models on the human-centered HUMAINE benchmark for LLMs. The model scores 19.74 (+0.49 over 2.5 Pro) with an 81.9% probability of ranking #1 after repeated evaluation. It leads in 4 of 5 evaluation dimensions.
Huge props to all organizers, sponsors, and partners. Check out how to build, manage, and scale AI data annotation and evaluation tasks with Prolific’s AI Task Builder here ⬇️ https://t.co/ENwUMxLcIU
github.com
Example for building, managing, and scaling AI data annotation and evaluation tasks with Prolific's AI Task Builder - prolific-oss/prolific_ai_task_demo
We had live product demos of the Prolific API, HUMAINE benchmark, and an RLHF example via AI Task Builder. A packed booth and swag cleared out. Enjoyed chats with @Amazon, @Google, @MistralAI, and more integrating human data into their AI workflows. And great seeing @AndrewYNg!
Thank you #AIDev25 x NYC 🗽 It was a pleasure sponsoring the @DeepLearningAI conference in the Big Apple this month. 3,000+ leading developers, builders, and researchers connecting over AI innovation.
According to Prolific’s new “Humaine” study, users’ favorite is no longer ChatGPT but Gemini 2.5 Pro. Top 5: Gemini 2.5 Pro, DeepSeek v3, Mistral “Magistral Medium”, Grok 4 and Grok 3. In the AI race, it’s less about raw benchmarks and more about being human-friendly.
Scaling human eval for emotional AI shouldn’t mean sacrificing quality. One AI leader used Prolific to source 7,200+ multilingual evaluators and collect ~100k submissions in 3 months, with cultural-nuance screening and real-time QC. Full story → https://t.co/334qfzGXip
⚡️ @ana_in_sf is on a roll! At our second hosted SF AI meetup, @InflectionAI went "Beyond the Benchmark," digging into how human insight shapes post-training, alignment, evals, and the next wave of emotionally aware models.
When AI becomes less deterministic, how do we measure “goodness” if we’ve never agreed on what “good” means? Prolific's Enzo Blindow and Sara Saab joined @MLStreetTalk to explore why true alignment starts with understanding human values.
Just when @Wellesley's Wednesday Bushong thought their data collection was doomed by bots, FindingFive integrated with Prolific 🤝 This enabled them to collect reliable human data for speech perception research over 100x faster. 📺 Watch in full: https://t.co/6a3c0F4JY4
Andrew Ng kicked off AI Dev 25 x NYC by explaining why AI continues to accelerate: coding is getting faster, teams can prototype far more quickly, and the real bottleneck is now gathering user feedback. He closed by encouraging attendees to connect, collaborate, and build
AI Dev 25 x NYC is underway after @AndrewYNg's insightful opener. Look forward to connecting with many who are building responsibly with human data! #AIDev25
Guess who's arrived at #AIDev25? 🗽 Come meet us on the first floor by the main stage at the @DeepLearningAI AI Dev Conference to talk about HITL workflows.
November updates are live 🔥 Work faster without compromising on quality with the latest releases, from major AI Task Builder upgrades to expanded specialist participant pools. See what's new → https://t.co/xucd7ITLgM
See you soon New York! 🗽🍎🌃 Heading to AI Dev 25 x NYC by @DeepLearningAI with @Prolific 💙 We’ll be there showing you how to collect real human feedback for AI models in minutes, not weeks 🏃‍♀️💨 Are you also attending? Send me a DM, would love to meet up! 🤖💬 #AIDev25
🏆 The grand prize winner was MemoryX, an AI assistant turning everyday moments into actionable items. Tested with real multimodal data via a Prolific study: 1-min participant videos from diverse environments to refine it. DM @ana_in_sf to learn how we help power multimodal AI.