
David Lindner
@davlindner
Followers
2K
Following
387
Media
32
Statuses
159
Making AI safer @GoogleDeepMind
London, UK
Joined April 2012
View our full paper here: View steganography transcripts here: GitHub: Work done with Artur Zolkowski, @kei_ng1, @rmcc_11, and @zimmerrol during @MATSprogram.
@rmcc_11 @zimmerrol Models encoding hidden information might make safety mitigations such as CoT monitoring much more difficult. We evaluate frontier models’ ability to evade monitors by sending encoded messages or performing encoded reasoning about a task. Read more:
Had a great conversation with Daniel about our MONA paper. We got into many fun technical details but also covered the big picture and how this method could be useful for building safe AGI. Thanks for having me on!
RT @RyanPGreenblatt: IMO, this isn't much of an update against CoT monitoring hopes. They show unfaithfulness when the reasoning is minima…
RT @NeelNanda5: Agreed that people aren't paying enough attention to our course - anyone seeing this tweet can help change that!
RT @NeelNanda5: I'm very excited that GDM's AGI Safety & Security Approach is out! I'm very happy with how the interp section came out. I'm…
Very glad to share this giant paper outlining our technical approach to AGI safety and security at GDM! No time to read 145 pages? Check out the 10-page extended abstract at the beginning of the paper.
Excited to share @GoogleDeepMind's AGI safety and security strategy to tackle risks like misuse and misalignment. Rather than high-level principles, this 145-page paper outlines a concrete, defense-in-depth technical approach: proactively evaluating & restricting dangerous
RT @rohinmshah: Just released GDM's 100+ page approach to AGI safety & security! (Don't worry, there's a 10 page summary.) AGI will be tra…
RT @GoogleDeepMind: AGI could revolutionize many fields - from healthcare to education - but it's crucial that it's developed responsibly…
RT @jenner_erik: My colleagues @emmons_scott @davlindner and I will be mentoring a research stream on AI control/monitoring and other topic…
Consider applying for MATS if you're interested in working on an AI alignment research project this summer! I'm a mentor, as are many of my colleagues at DeepMind.
@MATSprogram Summer 2025 applications close Apr 18! Come help advance the fields of AI alignment, security, and governance with mentors including @NeelNanda5 @EthanJPerez @OwainEvans_UK @EvanHub @bshlgrs @dawnsongtweets @DavidSKrueger @RichardMCNgo and more!