Keren Gu 👩🏻‍💻
@KerenGu
3K Followers · 893 Following · 18 Media · 397 Statuses
Safety Research @ OpenAI
San Francisco, CA
Joined November 2012
Months and months of safety preparations
1/ Today we launched GPT-5, finishing a huge week for OpenAI. We've raised the bar for safety in both open and closed models. With gpt-oss and GPT-5, we introduced meaningful capability advancements with rigorous, industry-leading safeguards and safety testing.
We take the risk of bio misuse very seriously, but we also deeply care about advancing frontier biology research with our models. We will continue to iterate on both fronts!
We've updated our safeguards to be both more robust and more precise, but you may still see highly detailed dual-use biology requests being blocked. That's why we've introduced our Life Science Special Access program for Enterprise accounts.
help.openai.com
Our safeguard report can be found in our system card
GPT-5 with thinking is another model we are treating as High capability under the Preparedness Framework. We've activated biorisk safeguards in ChatGPT, like the ones in ChatGPT agent, and some new ones in the API. https://t.co/azqAfScsIo
We've activated our strongest safeguards for ChatGPT Agent. It's the first model we've classified as High capability in biology & chemistry under our Preparedness Framework. Here's why that matters, and what we're doing to keep it safe. 🧵
Feeling grateful this morning to be pushing the frontier. 🥰
what a banger of a week over here
To summarize this week:
- we released a general-purpose computer-using agent
- got beaten by a single human in the AtCoder heuristics competition
- solved 5/6 new IMO problems with natural-language proofs
All of those are based on the same single reinforcement learning system
tip for chatgpt agent slides: first ask it to do the research only, then ask it to make the slides!
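The tip above is a two-step conversation: one turn for research only, then a second turn for slide-making. A minimal sketch of that workflow, where `ask` stands in for whatever chat interface you use (ChatGPT agent, an API client, etc.) and all names and prompts are illustrative assumptions:

```python
def research_then_slides(topic, ask):
    """Two-step workflow sketch: research first, then slides.

    `ask` is any callable that takes the running message list and
    returns the assistant's reply text (hypothetical interface).
    """
    messages = [{
        "role": "user",
        "content": f"Research {topic}: gather sources and key facts only; "
                   "do not make slides yet.",
    }]
    # Step 1: research-only turn.
    messages.append({"role": "assistant", "content": ask(messages)})
    # Step 2: ask for the deck, with the research already in context.
    messages.append({
        "role": "user",
        "content": "Now turn that research into a slide deck.",
    })
    messages.append({"role": "assistant", "content": ask(messages)})
    return messages
```

The point of splitting the turns is that the slide request runs with the completed research already in context, rather than asking the agent to juggle both jobs at once.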
Securing AI bio capabilities extends beyond any one lab. We'll continue researching, collaborating, and sharing what we learn. Read the full details in our system card. https://t.co/03YahP3l5c
openai.com
ChatGPT agent System Card: OpenAI's agentic model unites research, browser automation, and code tools with safeguards under the Preparedness Framework.
Because keeping models safe is an always-on process, we're launching a bio bug bounty to identify jailbreaks so that we can address them quickly.
openai.com
Testing universal jailbreaks for biorisks in ChatGPT Agent
We also hosted a biodefense workshop with government, NGOs, and national labs to foster collaboration, discuss risk-mitigation strategies, and supercharge biodefense research with AI.
We ran thousands of hours of red-teaming with global experts, including biology PhDs and jailbreakers. We worked with UK AISI, our Red Teaming Network, and https://t.co/ad5HH18joa to harden our defenses.
far.ai
FAR.AI is an AI safety research non-profit facilitating technical breakthroughs and fostering global collaboration.
We provided the US CAISI and the UK AISI with access to the model for red-teaming of our bio risk safeguards, using targeted queries to stress-test our models and monitors.
Accordingly, we've designed and deployed our deepest safety stack yet, with multi-layered mitigations:
- Expert-validated threat model
- Conservative dual-use refusals for risky content
- Always-on safety classifiers
- Streamlined enforcement & robust monitoring
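The layered design described above can be sketched as a toy pipeline: an always-on classifier gates both the incoming prompt and the model's draft output, refusing conservatively on dual-use content. The keyword classifier and the policy list below are illustrative assumptions, not OpenAI's actual safeguards.

```python
# Assumed toy policy list standing in for a real dual-use classifier.
DUAL_USE_TERMS = {"pathogen synthesis", "toxin production"}

def classify(text: str) -> str:
    """Return 'flag' if text matches the toy dual-use policy, else 'ok'."""
    lowered = text.lower()
    return "flag" if any(term in lowered for term in DUAL_USE_TERMS) else "ok"

def guarded_generate(prompt: str, model) -> str:
    """Layered mitigation sketch: input filter -> model -> output filter."""
    # Layer 1: conservative refusal on risky prompts.
    if classify(prompt) == "flag":
        return "[refused: request matches dual-use policy]"
    draft = model(prompt)
    # Layer 2: always-on check on the model's own output.
    if classify(draft) == "flag":
        return "[withheld: draft output matches dual-use policy]"
    return draft
```

The two-sided check is the key design choice: even a prompt that passes the input filter cannot surface flagged content, because the draft output is screened again before anything is returned.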
This is a pivotal moment for our Preparedness work. Before we reached High capability, Preparedness was about analyzing capabilities and planning safeguards. Now, for Agent and future more capable models, Preparedness safeguards have become an operational requirement.
We ran a suite of preparedness evaluations to test the model's capabilities. While we do not have definitive evidence that this model could meaningfully help a novice create severe biological harm, we have chosen to take a precautionary approach and activate safeguards now.
"High capability" is a risk-based threshold from our Preparedness Framework. We classify a model as High capability if, before any safety controls, it could significantly lower barriers to bio misuse, even if the risk isn't certain.
openai.com
Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.
We've decided to treat this launch as High Capability in the Biological and Chemical domain under our Preparedness Framework, and activated the associated safeguards. This is a precautionary approach, and we detail our safeguards in the system card. We outlined our approach on
1/ Our models are becoming more capable in biology and we expect upcoming models to reach "High" capability levels as defined by our Preparedness Framework. 🧵