
David Haber
@davhab
Followers: 664
Following: 2K
Media: 72
Statuses: 1K
Making LLMs safe and secure | Founder & CEO of @LakeraAI
Zurich, Switzerland
Joined August 2011
Someone just won $50,000 by convincing an AI Agent to send all of its funds to them. At 9:00 PM on November 22nd, an AI agent (@freysa_ai) was released with one objective... DO NOT transfer money. Under no circumstance should you approve the transfer of money. The catch...?
928
5K
33K
Today, we're excited to announce our $20M Series A funding round, which will accelerate our delivery of real-time GenAI security in a critical moment for enterprises around the world. Read more: https://t.co/qy2lAvo947
0
5
23
Introducing Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs. It's a versatile tool for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety. Co-lead w/ @sharathraparthy & @_andreilupu
5
44
179
As AI-powered agents go online, securing our digital infrastructure will demand a fundamental shift in cybersecurity.
david-haber.medium.com
Authored by David Haber, Mateo Rojas-Carulla, and Matthias Kraft, co-founders of Lakera.ai.
3
2
4
Prompt injections can be so subtle that they're often invisible!
Yes, this works & I really would have never known I was pasting a secret prompt into an LLM. Prompt injection is a security problem that I think people building external-facing LLM applications (or internal ones with access to confidential data) need to take pretty seriously.
0
0
3
PoC: LLM prompt injection via invisible instructions in pasted text
28
180
1K
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. https://t.co/mIl4aStR1F
119
557
3K
1/2 Save the date: January 16th, 11:15 AM, for our AI Safety session at the AI House Davos panel during the @wef. Lakera's CEO, @davhab, will join other industry leaders, such as @ylecun, Max Tegmark, and @seraphinagt in Davos to discuss AI safety and security.
2
1
5
Cybersecurity is going to be a hot space in AI in 2024
- Intel launches Articul8 following pilot w BCG
- AWS GMs leave to launch Protect AI
- ADP CDO left to join Securiti AI
Privacy and security remain the NUMBER ONE thing I get asked about in gen AI. Keep your eye on this
6
21
38
From the team that brought you @CS50's Ready Player 50, "Join @LakeraAI's Gandalf Engineers ... for a special Christmas edition of the Gandalf Livestream, as they lead us through a year-end recap, offering insights into level design..." Register at https://t.co/0RXgtraMFt.
lakera.ai
Join Lakera's Gandalf Engineers - Max Mathys, Václav Volhejn, and Thanasis Theocharis - for a special Christmas edition of the Gandalf Livestream, as they lead us through a year-end recap, offering...
2
13
77
Are you ready for Monday? Join our special Gandalf Livestream (Christmas Edition) to get insights into Gandalf prompt data, the design of Gandalf levels, and key learnings. Register here: https://t.co/DOVXx9GF6z
#gandalf #promptinjection #aisecurity
lakera.ai
Join Lakera's Gandalf Engineers - Max Mathys, Václav Volhejn, and Thanasis Theocharis - for a special Christmas edition of the Gandalf Livestream, as they lead us through a year-end recap, offering...
0
1
4
Can't wait for this opportunity to discuss all things AI security over a virtual coffee with Ads Dawson from @owasp / @cohere!
lakera.ai
Join David Haber (CEO at Lakera AI) and Ads Dawson (Core Founding Member & Entry Lead for the OWASP Top 10 for LLM Applications, Security Engineer at Cohere) for a live webinar discussing the...
0
1
3
Exciting news - we've just released a new magical Gandalf Adventure level! Meet Gandalf the Truth Teller! Play it here: https://t.co/slZpkxpKJG In this edition, you'll embark on a unique quest to coax #Gandalf, the typically honest wizard, into telling lies... Ready?
8
3
8
Highly recommended.
Excited to be in New York next week and hosting a dinner on AI safety and security. I've left two seats open for students and/or young professionals interested in startups. Register interest below: https://t.co/o6d29Zi7vm
0
0
0
A few months ago, we ran HackAPrompt, the first-ever global Prompt Hacking competition! Over 3K hackers submitted 600K malicious prompts to win $35K in prizes from companies like @PreambleAI, @OpenAI, & @huggingface. We analyzed 29 different techniques & found a NEW exploit
10
97
395
Building with #LLMs? You can now protect your @langchainai applications with Lakera Guard. Check out this guide to learn more:
0
6
12