Scale AI

@scale_AI

Followers
67K
Following
1K
Media
508
Statuses
2K

To make the best models, you need the best data.

Joined July 2016
@scale_AI
Scale AI
3 hours
RT @calvincbzhang: New @scale_AI research in collaboration with @AnthropicAI introduces SHADE-Arena, a benchmark to test for AI sabotage. S…
0
6
0
@scale_AI
Scale AI
5 days
Proud to support American innovation, today and every day. 🇺🇸
4
5
137
@scale_AI
Scale AI
13 days
@TIME Learn more:
1
2
18
@scale_AI
Scale AI
13 days
We've been at the center of nearly every AI advancement for a decade. 2025 is no different. Proud to be included on @TIME's list of the 100 Most Influential Companies.
8
10
51
@scale_AI
Scale AI
13 days
View the leaderboard 🔗
0
0
7
@scale_AI
Scale AI
13 days
Developed with guidance from our expert red-teamers, FORTRESS sets a new standard for evaluating LLMs. Especially where the stakes are highest.
1
0
10
@scale_AI
Scale AI
13 days
As AI capabilities grow, so do the risks. FORTRESS is the first benchmark designed to evaluate both sides of the dual-use national security spectrum: ⚠️ Can models be misused for harm? 🛑 Are they over-restricting safe, critical information?
1
0
5
@scale_AI
Scale AI
13 days
Introducing FORTRESS. Our newest benchmark built to evaluate AI models where it matters most: national security and public safety.
4
12
60
@scale_AI
Scale AI
16 days
RT @SeanHendryx: What will the learning environments of the future look like that train artificial super intelligence? In recent work at @s…
0
29
0
@scale_AI
Scale AI
21 days
RT @jdroege: We're just getting started.
0
7
0
@scale_AI
Scale AI
26 days
New era loading.
68
85
1K
@scale_AI
Scale AI
26 days
RT @jdroege: 🤝💜
0
21
0
@scale_AI
Scale AI
29 days
RT @summeryue0: 🔍 SEAL and Red Team at @scale_ai present a position paper outlining what we’ve learned from red teaming LLMs so far—what ma….
0
21
0
@scale_AI
Scale AI
1 month
🚨 Gemini-2.5 Pro in preview just dropped on SEAL Leaderboards — ranked #1 on our benchmarks measuring expert reasoning and visual understanding.
4
12
60
@scale_AI
Scale AI
1 month
The latest episode of Human in the Loop from Scale unpacks how to red team for real-world enterprise risk: model drift, over-restrictive guardrails, and agentic AI failures.
1
1
17
@scale_AI
Scale AI
1 month
The enterprise AI threat landscape is evolving rapidly, but with red-teaming, so can your defenses.
4
5
26
@scale_AI
Scale AI
1 month
See the full results 🔗
1
2
11
@scale_AI
Scale AI
1 month
DeepSeek-R1's latest upgrade is now live on SEAL Leaderboards — outperforming all other open-source models on Humanity's Last Exam (Text Only).
5
10
32
@scale_AI
Scale AI
2 months
Full episode
0
2
10