Irregular
@Irregular
Followers: 576 · Following: 1 · Media: 17 · Statuses: 38
Frontier AI Security
Joined April 2024
We are Irregular (formerly Pattern Labs). We’re building the first frontier AI security lab, starting with defenses for the next generation of threats.
8 replies · 5 reposts · 39 likes
AI-on-AI persuasion. Autonomous agents convincing each other to stop working... @Irregular co-founder @dan_lahav joined us on Training Data to discuss how AI-on-AI behavior could redefine what “security incident” means.
4 replies · 1 repost · 8 likes
Listen to the full episode: Apple: https://t.co/SOhCtEYGGM Spotify: https://t.co/v1xD1I3Ni5 YouTube:
0 replies · 0 reposts · 1 like
Many people warn about AI cybersecurity risk in the abstract. Very few have the on-the-ground practitioner point of view of @Irregular, a frontier AI security lab working with many of the top foundation model companies @AnthropicAI @OpenAI @GoogleDeepMind. @dan_lahav, Founder/CEO of
2 replies · 3 reposts · 22 likes
We evaluated Claude Sonnet 4.5's cybersecurity capabilities for @Anthropic using challenges significantly harder than public benchmarks. The model improved across the board, solving new challenges previous models couldn't. But it still struggles with complex, multi-step
0 replies · 5 reposts · 26 likes
Today I’m launching @Irregular (formerly Pattern Labs) with my friend and co-founder Omer Nevo: Irregular is the first frontier security lab. Our mission: protect the world in the era of increasingly capable and sophisticated AI systems.
48 replies · 48 reposts · 390 likes
5/5 It’s early—but moments like this hint at where cybersecurity agents could be headed. Deeper dive coming soon. For now, you can read more in our blog at https://t.co/t25r2sMWPc and the system card at
irregular.com
Irregular is the first frontier security lab with the mission of protecting the world in the time of increasingly capable and sophisticated AI systems.
0 replies · 0 reposts · 9 likes
4/5 What stood out wasn’t just the outcome; it was the reasoning. GPT-5 didn’t stumble into the answer. It reasoned through it. Here’s a lightly edited excerpt from the transcript:
1 reply · 0 reposts · 5 likes
3/5 It could. And did. GPT-5 scanned the network, gathered intel, and mapped out a non-trivial, multi-step plan to move through the environment, demonstrating an impressive degree of coherence and precision.
1 reply · 0 reposts · 3 likes
2/5 In that scenario, we dropped GPT-5 into a simulated network built with real vulnerabilities and multi-step challenges designed to mimic the steps an attacker would need to take. It had minimal starting info. The test: could it figure out what to do next?
1 reply · 0 reposts · 3 likes
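The thread doesn’t show Irregular’s harness, but the setup it describes (minimal starting info, iterate until the objective is reached) maps onto a standard agent-eval loop. Here is a minimal sketch in Python, where `model`, `network`, and the flag-based scoring are all illustrative assumptions rather than anything from the thread:

```python
# Illustrative sketch only -- nothing here comes from Irregular's tweets.
# `model` (any chat-model client) and `network` (any sandboxed environment
# exposing a shell) are hypothetical stand-ins.
MAX_STEPS = 50

def run_challenge(model, network, flag):
    """Drop a model into a simulated network with minimal starting info,
    then loop: the model proposes a command, the sandbox returns output."""
    history = [{
        "role": "system",
        "content": "You have shell access on one host of an internal "
                   "network. Reach the objective; decide what to do next.",
    }]
    for step in range(1, MAX_STEPS + 1):
        command = model.chat(history)           # next action proposed by the model
        history.append({"role": "assistant", "content": command})
        output = network.run_shell(command)     # executed only inside the sandbox
        history.append({"role": "user", "content": output})
        if flag in output:                      # solved: the challenge flag surfaced
            return {"solved": True, "steps": step}
    return {"solved": False, "steps": MAX_STEPS}
```

The sandbox boundary is the important design choice: every command runs inside the simulated network, never against real hosts.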
1/5 We partnered with @OpenAI to evaluate GPT-5’s cybersecurity capabilities ahead of launch: testing how it handles offensive cybersecurity scenarios. GPT-5 showed meaningful progress over past models, sometimes finding surprising and creative solutions. One example stood out👇
2 replies · 4 reposts · 37 likes
4/4 Despite improvements, significant limitations persist:
• Models occasionally lose coherent planning when facing unexpected obstacles (e.g., Sonnet 4 entirely abandoned its plan after getting the Windows temp folder path wrong)
• Limited understanding of network
0 replies · 0 reposts · 5 likes
3/4 The evolution across model generations can be showcased with a specific example, discovering a web app cookie vulnerability:
• Claude 3.7: <1% success, fixated on single files, attempted irrelevant SQL injection
• Sonnet 4: Rare successes, discovery through trial & error
1 reply · 0 reposts · 4 likes
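The thread doesn’t say which cookie flaw the challenge used, so the sketch below is only an illustration of the class: a hypothetical Flask app that trusts an unsigned, client-controlled cookie for authorization.

```python
# Hypothetical example of the bug *class* -- the thread doesn't specify the
# actual vulnerability. The server below trusts an unsigned, attacker-
# controlled cookie for authorization.
from flask import Flask, request

app = Flask(__name__)

@app.route("/admin")
def admin_panel():
    # VULNERABLE: anyone can send "Cookie: role=admin" and pass this check,
    # because the cookie value is entirely client-controlled.
    if request.cookies.get("role") == "admin":
        return "secret admin data"
    return "forbidden", 403

# Safer pattern: derive the role server-side from an authenticated, signed
# session (e.g., flask.session with a secret key), never from raw cookies.
```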
2/4 As an example, in one Domain Controller access challenge, Opus 4 deviated from standard procedures. Instead of using credential dumping and Mimikatz, it:
• Identified active processes of the domain admin
• Stole an access token from them
• Created scheduled tasks using the
1 reply · 0 reposts · 2 likes
Claude 4's cybersecurity skills are rising, revealing substantial progress in how AI systems approach security challenges. We collaborated with Anthropic to test the new Claude models across 48 challenges: web exploitation, cryptography, binary exploitation, reverse engineering,
1 reply · 1 repost · 10 likes
New research with @AnthropicAI: Confidential Inference Systems. Confidential computing enhances AI data privacy and model weight security with hardware-based isolation and protection. Our whitepaper explores the design principles and security risks for confidential AI systems.
9 replies · 17 reposts · 128 likes
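The whitepaper itself isn’t excerpted here; as a conceptual sketch of the core confidential-inference idea (release model-weight keys only to an enclave whose attested measurement matches an approved build), with every interface name below a hypothetical stand-in for a real TEE attestation SDK:

```python
# Conceptual sketch; every interface here (get_attestation_quote,
# quote.measurement, wrap_key_for, the supplied verifier) is a hypothetical
# stand-in for a real TEE attestation SDK.
import hmac

# Placeholder measurement; in practice this would be the hash of the
# approved inference image, pinned ahead of time.
EXPECTED_MEASUREMENT = bytes(48)

def release_model_key(enclave, key_escrow, verify_quote_signature):
    """Hand the weight-decryption key to an enclave only after verifying a
    hardware-signed attestation of exactly what code it is running."""
    quote = enclave.get_attestation_quote()              # signed by the TEE vendor
    if not verify_quote_signature(quote):                # caller supplies a verifier
        raise PermissionError("attestation signature invalid")
    if not hmac.compare_digest(quote.measurement, EXPECTED_MEASUREMENT):
        raise PermissionError("enclave is running unapproved code")
    return key_escrow.wrap_key_for(quote.public_key)     # key encrypted to the enclave
```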