Michael Bargury

@mbrg0

Followers
9K
Following
4K
Media
308
Statuses
2K

Breaking AI. Hacked Copilot, hijacked ChatGPT. Building @zenitysec.

Joined August 2016
@mbrg0
Michael Bargury
3 months
we're dropping a lot of ai agent / assistant shenanigans this week hacking like it's 1999
7
33
293
@cyb3rops
Florian Roth ⚡️
4 days
Yea, that’s exactly what we needed
@BleepinComputer
BleepingComputer
4 days
A new phishing technique dubbed 'CoPhish' weaponizes Microsoft Copilot Studio agents to deliver fraudulent OAuth consent requests via legitimate and trusted Microsoft domains. Microsoft told BleepingComputer they plan on fixing it in a future update. https://t.co/BeJY6YazJy
2
17
184
@ekoparty
Ekoparty | Hacking everything
5 days
"0-Click Compromise hits the Enterprise – thx AI!" dictada por @inbarraz | Sala E - Main Track #EKO2025 🔥
2
3
13
@mbrg0
Michael Bargury
7 days
we've been using interpretability techniques to figure out "why" atks work sometimes you can find cybersecurity-related features that fire up to trigger refusal some ideas of how this becomes practical for us security nerds --> https://t.co/ynmJoQ4wVr
0
2
8
@mbrg0
Michael Bargury
12 days
trusted assistants are agents too it doesn't matter whether ai technically pushes the button or convinces a human to do so
@Polymarket
Polymarket
13 days
JUST IN: U.S. Army general admits he uses ChatGPT to make "key command decisions"
1
1
6
@dinodaizovi
Dino A. Dai Zovi
13 days
This is feedback that all security practitioners should be aware of and take to heart. Things are somewhat different inside companies, but the feelings are often the same. How do we make working with us a more positive experience? When we nail this, we get more security impact.
@FFmpeg
FFmpeg
14 days
Arguably the most brilliant engineer in FFmpeg left because of this. He reverse engineered dozens of codecs by hand as a volunteer. Then security "researchers" and corporate employees came along repeatedly insisted "critical" security issues were fixed immediately waving their
7
18
86
@mbrg0
Michael Bargury
14 days
"agents don't get access to passwords" a great hard boundary cool work
@browserbase
Browserbase
21 days
We just solved authentication for AI Agents. Announcing 1Password + Browserbase, enabling secure agentic password autofill for your browser agents. Available exclusively on Director dot ai and Browserbase. Full post below.
0
0
5
@vtahowe
Allie Howe
16 days
Last week I got to speak at Zenity's AI Agent Security Summit in SF I showed how I got a fintech agent to fall victim to goal manipulation 👿👇
2
3
28
@wunderwuzzi23
Johann Rehberger
19 days
Had a fantastic time presenting at the second AI Agent Security Summit! This time in SF. Great talks, great people, and great conversations. Big thanks to @zenitysec for hosting an awesome event! 🔥 And thx @mbrg0 for taking this picture.
1
8
32
@vtahowe
Allie Howe
19 days
Everyone’s adding guardrails to their platforms But they are one piece of a defense in depth solution @mbrg0 explained at Zenity’s Security Summit this week why guardrails are soft boundaries and why we need hard boundaries instead Check out his tweet to learn more
@mbrg0
Michael Bargury
19 days
we reverse engineered openai agentkit guardrails extracted sys instructions and pattern matching targets and casually maneuvered around each one an excellent analysis by stav cohen
1
3
10
@mbrg0
Michael Bargury
19 days
it's cool that openai gives this ootb but these are all soft boundaries they might help with content moderation they won't prevent an attacker from getting their way https://t.co/oj9wtKCgUs
labs.zenity.io
A deep dive into OpenAI's AgentKit guardrails, how they are implemented, and where they fail
1
1
10
@mbrg0
Michael Bargury
19 days
jailbreak guardrail is yet another call to an llm with specialized instructions to pretty pls don't fall for it
0
0
1
@mbrg0
Michael Bargury
19 days
moderation applies openai's built-in content filters stav goes around them by adding a few typos and spaces
1
0
3
@mbrg0
Michael Bargury
19 days
hallucination guardrail works by using a vector search that customers can add to and then calling the llm again asking it nicely whether claims are supported by search results don't lie, pls!
3
0
5
@mbrg0
Michael Bargury
19 days
PII guardrail uses presidio under the hood change SSN to SNN and you're through
1
0
6
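A minimal sketch of why a one-letter change to a label can defeat a pattern-matching PII check. It assumes a hypothetical detector that flags an SSN-shaped number only when a context keyword appears in the same text. The names `SSN_PATTERN`, `CONTEXT_WORDS`, and `flags_ssn` are illustrative; this is not Presidio's or AgentKit's actual code.

```python
import re

# Hypothetical context-gated detector: a digit pattern only counts as an SSN
# when a known context word also appears in the text. Illustrative only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CONTEXT_WORDS = ("ssn", "social security")


def flags_ssn(text: str) -> bool:
    """Return True when the text has an SSN-shaped number and a context word."""
    lowered = text.lower()
    has_pattern = SSN_PATTERN.search(text) is not None
    has_context = any(word in lowered for word in CONTEXT_WORDS)
    return has_pattern and has_context


if __name__ == "__main__":
    print(flags_ssn("my SSN is 123-45-6789"))  # label matches a context word
    print(flags_ssn("my SNN is 123-45-6789"))  # misspelled label, same number
```

The number is identical in both inputs; only the label changes. Any check that leans on exact keywords shares this weakness, which is why the thread calls these controls soft boundaries.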
@mbrg0
Michael Bargury
19 days
we reverse engineered openai agentkit guardrails extracted sys instructions and pattern matching targets and casually maneuvered around each one an excellent analysis by stav cohen
3
5
98
@mbrg0
Michael Bargury
20 days
great piece by @TheRegister capturing the vibe at the ai agent security summit thank y'all for an awesome event https://t.co/DyCy3tfDf0
theregister.com
: That's the main takeaway from the Zenity AI Agent Security Summit
0
1
3
@mbrg0
Michael Bargury
20 days
@supriza0 connectors are all based on mcp so you actually hard-code your oauth2 refresh token to connect agentkit elsewhere this is *credential sharing as a service* all over again great find by @supriza0
1
1
4
@mbrg0
Michael Bargury
20 days
@supriza0 agentkit comes batteries-included w guardrails re shows that there are two underlying technical controls: llm as a judge and presidio-based pattern matching
1
0
3
@mbrg0
Michael Bargury
20 days
@supriza0 here's how untrusted input gets directly into the sys prompt so an attacker can change this agent to be anything you want maybe a phishing agent?
2
1
3