Scale AI @scale_AI X Profile

Scale AI

@scale_AI

Followers

72K

Following

2K

Media

563

Statuses

2K

making AI work

https://t.co/ECE0dHmBSO

Joined July 2016

Scale AI

@scale_AI

3 days

Learn more:

scale.com

MCP-Atlas is now open source: a benchmark measuring how AI agents use real tools across complex, multi-step tasks.

0

1

Scale AI

@scale_AI

3 days

We recently introduced MCP-Atlas, a benchmark for evaluating how well LLMs handle tool use via the Model Context Protocol. Even top models failed nearly half of realistic multi-tool tasks. Today, we’re open-sourcing the benchmark so you can measure performance yourself.

1

5

21

Scale AI

@scale_AI

4 days

🔗 See how models stack up:

scale.com

Explore the SEAL leaderboard with expert-driven LLM benchmarks and updated AI model leaderboards, ranking top models across coding, reasoning and more.

0

5

Scale AI

@scale_AI

4 days

Speech isn’t just text read out loud. 💬 Real conversations are dynamic, full of interruptions, and context-rich — and benchmarks should match. Introducing Audio MultiChallenge (Audio MC), the first benchmark built to test how well native Speech-to-Speech models handle real

2

22

Scale AI

@scale_AI

4 days

Major drop today by @GoogleAI! ⚡️ Gemini 3 Flash scored 🥈 on MCP-Atlas and is tracking strong on Humanity’s Last Exam.

Logan Kilpatrick

@OfficialLoganK

5 days

Introducing Gemini 3 Flash, our frontier intelligence model, available at scale for everyone. It excels at coding, tool calling, and is stronger than 2.5 Pro across most metrics!! ⚡️ Available in the API at $0.50 in / 1M tokens and $3.00 out / 1M tokens across.

2

23

Greg Brockman

@gdb

5 days

GPT-5 Pro for very hard problems:

Scale AI

@scale_AI

6 days

GPT-5 Pro by @OpenAI is the Best Reasoning Model of 2025. 🏆 Calculated across SEAL’s reasoning leaderboards, GPT-5 Pro was the best at answering complicated questions, explaining its thinking, and solving multi-step problems.

24

19

425

Alex Heath

@alexeheath

5 days

talked to Scale's head of research about creating the Oscars for AI

sources.news

Scale's head of research: “Evaluation is falling behind the development of model capabilities.”

1

5

21

Scale AI

@scale_AI

5 days

GPT-5 Chat by @OpenAI and Claude Sonnet 4.5 by @AnthropicAI are the People’s Favorite Models of 2025. 🏆 Determined by performance on SEAL Showdown, where real users pick the better response in head-to-head comparisons, GPT-5 Chat and Sonnet 4.5 were the big winners.

0

2

18

Scale AI

@scale_AI

5 days

Claude Opus 4.5 by @AnthropicAI is the Best Agentic Model of 2025. 🏆 Across leaderboards that test models on ambiguous tasks — like multi-step projects and debugging — Opus 4.5 was the top performer.

1

5

24

Scale AI

@scale_AI

5 days

Gemini 3 by @GoogleAI is the Best Multimodal Model of 2025. 🏆 When evaluating which models are best at understanding images alongside text, Gemini 3 took the top spot.

0

2

16

Scale AI

@scale_AI

6 days

Claude Sonnet 4.5 by @AnthropicAI is the Best Safety Model of 2025. 🏆 Measuring across all safety evaluations, Sonnet 4.5 excelled at staying consistent, following safety guidelines, and avoiding unsafe outputs, even when under pressure.

1

2

21

Scale AI

@scale_AI

6 days

GPT-5 Pro by @OpenAI is the Best Reasoning Model of 2025. 🏆 Calculated across SEAL’s reasoning leaderboards, GPT-5 Pro was the best at answering complicated questions, explaining its thinking, and solving multi-step problems.

3

10

93

Scale AI

@scale_AI

6 days

Gemini 3 by @GoogleAI is the Best Composite Performance Model of 2025. 🏆 The model was the top performer across all of the SEAL Leaderboards in 2025.

1

3

26

Scale AI

@scale_AI

6 days

See the full list of winners:

scale.com

Which models ruled 2025? Based on 450+ evals, see who topped the charts in the inaugural SEAL Models of the Year Awards.

0

2

7

Scale AI

@scale_AI

6 days

Introducing Scale’s Model of the Year Awards. 🏆 These awards, based entirely on SEAL Leaderboard performance, celebrate the best models across six major categories.

2

4

24

Scale AI

@scale_AI

6 days

Hundreds of models stand before us, but we only have six photos in our hands. Tune in tomorrow, December 16th, to see who will be crowned Scale’s Next Top AI Models of 2025.

1

10

29