Keren Gu 🌱👩🏻‍💻

@KerenGu

Followers
3K
Following
893
Media
18
Statuses
397

Safety Research @ OpenAI

San Francisco, CA
Joined November 2012
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
Months and months of safety preparations
@JoHeidecke
Johannes Heidecke
4 months
1 / Today we launched gpt-5, finishing a huge week for OpenAI. We’ve raised the bar for safety in both open and closed models. With gpt-oss and gpt-5, we introduced meaningful capability advancements with rigorous, industry-leading safeguards and safety testing.
1
1
18
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
We take the risk of bio-misuse very seriously but also deeply care about advancing frontier biology research with our models. Will continue to iterate on both ends!
1
0
2
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
We've updated our safeguards to be both more robust and precise, but you may still see that highly detailed dual use biology requests are being blocked. That's why we've introduced our Life Science Special Access program to Enterprise accounts.
help.openai.com
2
0
1
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
Our safeguard report can be found in our system card
1
0
0
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
GPT-5 with thinking is another model we are treating as High capability under the preparedness framework. We've activated biorisk safeguards in ChatGPT, like the ones in ChatGPT agent, and some new ones in the API. https://t.co/azqAfScsIo
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We’ve activated our strongest safeguards for ChatGPT Agent. It’s the first model we’ve classified as High capability in biology & chemistry under our Preparedness Framework. Here’s why that matters–and what we’re doing to keep it safe. 🧵
1
3
28
@KerenGu
Keren Gu 🌱👩🏻‍💻
4 months
Feeling grateful this morning to be pushing the frontier. 🥰
2
0
15
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
what a banger of a week over here
@MillionInt
Jerry Tworek
5 months
To summarize this week:
- we released general purpose computer using agent
- got beaten by a single human in atcoder heuristics competition
- solved 5/6 new IMO problems with natural language proofs
All of those are based on the same single reinforcement learning system
1
0
23
@alexwei_
Alexander Wei
5 months
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition, the International Math Olympiad (IMO).
405
1K
7K
@isafulf
Isa Fulford
5 months
tip for chatgpt agent slides: first ask it to do the research only, then ask it to make the slides!
42
33
503
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
Securing AI bio capabilities extends beyond any one lab. We’ll continue researching, collaborating, and sharing what we learn. Read the full details in our detailed system card. https://t.co/03YahP3l5c
openai.com
ChatGPT agent System Card: OpenAI’s agentic model unites research, browser automation, and code tools with safeguards under the Preparedness Framework.
2
0
41
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
Because keeping models safe is an always-on process, we’re launching a bio bug bounty to identify jailbreaks so that we can address them quickly.
openai.com
Testing universal jailbreaks for biorisks in ChatGPT Agent
6
5
65
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We also hosted a Biodefense workshop with government, NGOs and national labs to foster collaboration, discuss risk mitigation strategies, and supercharge biodefense research with AI.
4
1
36
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We ran 1000s of hours of red-teaming with global experts, including biology PhDs and jailbreakers. We worked with UK AISI, our Red Teaming Network and https://t.co/ad5HH18joa to harden our defenses.
far.ai
FAR.AI is an AI safety research non-profit facilitating technical breakthroughs and fostering global collaboration.
1
2
48
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We provided the US CAISI and the UK AISI with access to the model for red-teaming of our bio risk safeguards, using targeted queries to stress-test our models and monitors.
1
1
39
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
Accordingly, we’ve designed and deployed our deepest safety stack yet with multi-layered mitigations:
- Expert-validated threat model
- Conservative dual-use refusals for risky content
- Always-on safety classifiers
- Streamlined enforcement & robust monitoring
3
0
44
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
This is a pivotal moment for our Preparedness work. Before we reached High capability, Preparedness was about analyzing capabilities and planning safeguards. Now, for Agent and future more capable models, Preparedness safeguards have become an operational requirement.
2
1
47
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We ran a suite of preparedness evaluations to test the model’s capabilities. While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm, we have chosen to take a precautionary approach and activate safeguards now.
3
1
47
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
“High capability” is a risk-based threshold from our Preparedness Framework. We classify a model as High capability if, before any safety controls, it could significantly lower barriers to bio misuse, even if risk isn’t certain.
openai.com
Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.
3
1
63
@KerenGu
Keren Gu 🌱👩🏻‍💻
5 months
We’ve activated our strongest safeguards for ChatGPT Agent. It’s the first model we’ve classified as High capability in biology & chemistry under our Preparedness Framework. Here’s why that matters–and what we’re doing to keep it safe. 🧵
@OpenAI
OpenAI
5 months
We’ve decided to treat this launch as High Capability in the Biological and Chemical domain under our Preparedness Framework, and activated the associated safeguards. This is a precautionary approach, and we detail our safeguards in the system card. We outlined our approach on
85
132
1K
@JoHeidecke
Johannes Heidecke
6 months
1/ Our models are becoming more capable in biology and we expect upcoming models to reach ‘High’ capability levels as defined by our Preparedness Framework. 🧵
179
246
1K