
Lewis Ho
@_lewisho
Followers: 277
Following: 553
Media: 0
Statuses: 40
Research Scientist at Google DeepMind
Joined June 2017
We have updated the Gemini 2.5 Pro model card with results from our FSF evaluations. These continue to be critical for helping us understand how to keep our systems safe amidst the dizzyingly impressive capability improvements.
Per our Frontier Safety Framework, we continue to test our models for critical capabilities. Here’s the updated model card for Gemini 2.5 Pro with frontier safety evaluations + explanation of how our safety buffer / alert thresholds approach applies to 2.0, 2.5, and what’s coming.
RT @rohinmshah: Just released GDM’s 100+ page approach to AGI safety & security! (Don’t worry, there’s a 10-page summary.) AGI will be tra….
RT @Atul_Gawande: Yesterday, Rubio terminated 5800 USAID contracts – more than 90% of its foreign aid programs – in defiance of the courts.….
RT @AllanDafoe: Thanks Rob for a great conversation about important topics: why technology drives history, and the rare opportunity of stee….
RT @rohinmshah: We're hiring! Join an elite team that sets an AGI safety approach for all of Google -- both through development and impleme….
RT @vkrakovna: We are excited to release a short course on AGI safety! The course offers a concise and accessible introduction to AI alignm….
deepmindsafetyresearch.medium.com
We are excited to release a short course on AGI safety for students, researchers and professionals interested in this topic. The course…
We updated our framework to include a section addressing deceptive alignment/loss of control risks. There's much more work to be done in understanding and building out an approach to these risks that scales to AGI and beyond: please consider joining our team! Link in the comment.
As we make progress towards AGI, developing AI needs to be both innovative and safe ⚖️. To help ensure this, we’ve made updates to our Frontier Safety Framework - our set of protocols to help us stay ahead of possible severe risks. Find out more →
RT @AllanDafoe: I'm proud of GoogleDeepMind/Google's v2 update to our Frontier Safety Framework. We were the first major tech company to pr….
deepmind.google
Our next iteration of the FSF sets out stronger security protocols on the path to AGI
RT @davlindner: New Google DeepMind safety paper! LLM agents are coming – how do we stop them finding complex plans to hack the reward?. Ou….
RT @ChrisPainterYup: We thought it would be helpful to have all of the similar themes/components from each of Deepmind's Frontier Safety Fr….
RT @AllanDafoe: We are hiring! Google DeepMind's Frontier Safety and Governance team is dedicated to mitigating frontier AI risks; we work….
job-boards.greenhouse.io
RT @sarah_cogan: Curious about how we evaluate dangerous capabilities at @GoogleDeepMind? 🤔 The Frontier Safety team just open-sourced reso….
github.com
Contribute to google-deepmind/dangerous-capability-evaluations development by creating an account on GitHub.
GDM's 1st step towards the ambitious ideals of responsible scaling, these being: identifying AI capabilities that pose severe risk, using evals to detect such capabilities, preparing and articulating mitigation plans, and involving external parties in the process as appropriate.
Introducing our Frontier Safety Framework: a set of protocols designed to identify & mitigate potential harms related to future AI systems - and put in place mechanisms to detect them. @AncaDianaDragan, @AllanDafoe and Helen King explain more ↓