Sepal AI
@sepal_ai
Followers: 76
Following: 6
Media: 3
Statuses: 19
We are a data research company focused on building new datasets to evaluate and power frontier model capabilities
Joined August 2024
If you’re working on LLMs / Agents and need high-quality human input fast, let’s talk
0
0
0
This enables scaled recruitment for agentic and reasoning data pipelines for models and agents, making it easier than ever to hire the top experts across 12+ domains from Sepal's Global Talent Network.
1
0
0
Now you can:
– Instantly screen hundreds of candidates
– Run personalized voice interviews that feel human
– Deliver structured transcripts and insights in minutes
1
0
0
We launched the revamped Sepal AI voice agent for interviews, and it’s already cutting recruiting cycle times by up to 10x.
3
7
8
Introducing the new Sepal AI Global Talent Search Engine - a powerful solution that we use to expedite the recruitment of experts for training, evaluations, and model testing. This platform is one of the things that enables our customers to maintain a leading edge in scaled human
4
1
6
If you are interested in systematically studying risk, building a custom benchmark, or accessing the Sepal AI talent grid, please contact us 🤝 https://t.co/h2ZQQpcaLS
sepalai.com
Join leading AI projects and get rewarded for your expertise. Collaborate with top research labs, shape the future of AI, and earn for your contributions.
0
0
0
This type of pre-launch testing, coupled with a thoughtful risk framework like Anthropic’s ASL or OpenAI’s Preparedness Framework, is an extremely important investment for anyone building AGI to consider as model performance continues to improve 📈
1
0
0
At Sepal AI, we’ve built a capability for recruiting hundreds of participants across skills and education levels, securely vetting and onboarding them, and administering arbitrary tasks to support work like uplift trials 🛠️
1
0
0
Why It Matters: As AI capabilities grow, standard approaches to evaluation don’t capture sudden leaps in real-world risk. As a result, novel forms of studying models are essential to reveal how they magnify human actions in sensitive domains 🔍
1
0
0
🚨 The result of Claude’s Uplift Trial on Sepal AI: Claude 4 breached the ASL-3 risk threshold. In our biorisk trial, Opus 4 delivered a 2.53× boost in participant scores, surpassing the safety cutoff and triggering added guardrails
1
0
0
In an Uplift Trial, cohorts of participants with varying skill and education levels tackle identical high-risk tasks. Some have access to Claude 4, some don’t. For example, 50 college grads and 50 PhDs in virology plan the synthesis of a hypothetical bioweapon 🧪
1
0
1
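As a rough illustration of how an uplift figure could be computed from a trial like the one described above, here is a minimal Python sketch. It assumes uplift is the ratio of the mean score of the cohort with model access to the mean score of the cohort without it; the posts do not state the exact formula used, and the function name and all scores below are hypothetical.

```python
from statistics import mean


def uplift_ratio(treatment_scores, control_scores):
    """Ratio of mean treatment score to mean control score.

    Assumption: uplift is defined as a ratio of cohort means. The
    actual scoring method used in the trials is not given in the source.
    """
    control_mean = mean(control_scores)
    if control_mean == 0:
        raise ValueError("control mean is zero; the ratio is undefined")
    return mean(treatment_scores) / control_mean


# Hypothetical task scores, invented purely for illustration.
control = [10, 12, 8, 10]      # participants without model access
treatment = [20, 25, 15, 20]   # participants with model access

print(f"uplift: {uplift_ratio(treatment, control):.2f}x")
```

Under this definition, a ratio above 1 means the cohort with model access scored higher on average; a real analysis would also need to account for cohort size and variance.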
To quantify the risks that a model introduces into the real world, we need to understand how it impacts user behavior and capability. Benchmarks and vibe checks cannot discern this. That's why leading AI labs conduct uplift trials 🧠
1
0
1
🚨 @AnthropicAI’s Claude 4 breached the ASL-3 risk threshold 🚨, delivering a 2.53× boost in participant capabilities on key tasks in a set of Uplift Trials supported by Sepal AI. Here’s why it matters 🧵
1
2
6
Congratulations to @AnthropicAI on the launch of Claude 4! Sepal is honored to be featured on the model card as a partner supporting Anthropic’s important work leading in AI safety.
0
1
3
Congrats to Anthropic on their launch of 3.7 Sonnet -- we're grateful to be partners in testing ahead of launch! https://t.co/QE4O0nnzTI
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
0
0
3
RT @ycombinator: 🌱@Sepal_AI (YC S24) provides frontier data and tooling for AI engineering & research teams to confidently fine-tune, evalu…
0
1
0