White Circle
@whitecircle_ai
Followers: 122 · Following: 57 · Media: 3 · Statuses: 8
Moving fast and breaking things, automatically
Paris
Joined February 2025
3/ This is why we’re opening the waitlist for two new SOTA moderation models:
– whitecircle-policy-guard-small
– whitecircle-policy-guard-zero
Join the waitlist at https://t.co/ktzwX3rkCc or reach out at hi@whitecircle.ai
2/ ⚪️ CircleGuardBench includes models from OpenAI, Anthropic, Mistral, DeepMind, and others. Most were either too slow for real-time moderation, too easy to bypass, or both.
1/ Introducing ⚪️ CircleGuardBench, a new benchmark for evaluating AI moderation models. Here’s why it’s cool:
– Tests harm detection, jailbreak resistance, false positives, and latency
– Covers 17 real-world harm categories
– First benchmark designed for production-level