Lama Ahmad لمى احمد Profile

@_lamaahmad

Followers: 5K · Following: 7K · Media: 175 · Statuses: 2K

@OpenAI. views my own.

San Francisco, CA
Joined December 2014
@RishiBommasani
rishi
7 days
I am looking for students and collaborators for several new projects at the intersection of AI, the economy, and society. If you are interested, please fill out this form! https://t.co/HB0ERyHaPw
2
15
57
@woj_zaremba
Wojciech Zaremba
13 days
It’s rare for competitors to collaborate. Yet that’s exactly what OpenAI and @AnthropicAI just did—by testing each other’s models with our respective internal safety and alignment evaluations. Today, we’re publishing the results. Frontier AI companies will inevitably compete on
105
400
2K
@ThankYourNiceAI
Tyna Eloundou
13 days
No single person or institution should define ideal AI behavior for everyone.  Today, we’re sharing early results from collective alignment, a research effort where we asked the public about how models should behave by default.  Blog here: https://t.co/WT9REAznD7
openai.com
We surveyed over 1,000 people worldwide on how our models should behave and compared their views to our Model Spec. We found they largely agree with the Spec, and we adopted changes from the disagr...
82
129
615
@saachi_jain_
Saachi Jain
1 month
We just launched GPT-5! There has been an unbelievable amount of safety work that went into this model, from factuality, to deception, to brand new safety training techniques. More details and plots in the 🧵
88
101
869
@CedricWhitney
cedric
1 month
gotta say I'm excited about this: GPT-5 chain of thought access for external assessors (@apolloaievals too!) is an evaluation win (on the heels of the process audit that @METR_Evals engaged with us on for gpt-oss!) that I'm proud of us @OpenAI for and thankful for the collabs!
@BethMayBarnes
Elizabeth Barnes
1 month
The good news: due to increased access (plus improved evals science) we were able to do a more meaningful evaluation than with past models, and we think we have substantial evidence that this model does not pose a catastrophic risk via autonomy / loss of control threat models.
0
5
34
@_lamaahmad
Lama Ahmad لمى احمد
1 month
me last week: big week ahead!
10
1
87
@_lamaahmad
Lama Ahmad لمى احمد
1 month
@fmf_org @EstherTetruas @SatyaScribbles Ok, back to my honeymoon! Just so proud of this work and thank you especially to @CedricWhitney for leading the way while I’m out🦒
2
0
14
@_lamaahmad
Lama Ahmad لمى احمد
1 month
And a nice complement, work from @fmf_org in collaboration with @EstherTetruas , @SatyaScribbles , and many others! https://t.co/rYs7erzqkk
@fmf_org
Frontier Model Forum
1 month
🧵 NEW TECHNICAL REPORT (1 of 3) Our latest technical report outlines practices for implementing, where appropriate, rigorous, secure, and fit-for-purpose third-party assessments. Read more here:
1
1
8
@_lamaahmad
Lama Ahmad لمى احمد
1 month
New model, new form of external assessment!
@OpenAI
OpenAI
1 month
We adversarially fine-tuned gpt-oss-120b and evaluated the model. We found that even with robust fine-tuning, the model was unable to achieve High capability under our Preparedness Framework. Our methodology was reviewed by external experts, marking a step toward new safety
2
2
57
@_lamaahmad
Lama Ahmad لمى احمد
2 months
On a more serious note - I am proud to be part of a team that faces the hardest problems head on, with care and rigor.
@KerenGu
Keren Gu 🌱👩🏻‍💻
2 months
We’ve activated our strongest safeguards for ChatGPT Agent. It’s the first model we’ve classified as High capability in biology & chemistry under our Preparedness Framework. Here’s why that matters–and what we’re doing to keep it safe. 🧵
0
0
8
@_lamaahmad
Lama Ahmad لمى احمد
2 months
Just a little too late for planning my wedding! Taking a break for a month! 💕
@OpenAI
OpenAI
2 months
ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths.
4
1
71
@KerenGu
Keren Gu 🌱👩🏻‍💻
2 months
We’ve activated our strongest safeguards for ChatGPT Agent. It’s the first model we’ve classified as High capability in biology & chemistry under our Preparedness Framework. Here’s why that matters–and what we’re doing to keep it safe. 🧵
@OpenAI
OpenAI
2 months
We’ve decided to treat this launch as High Capability in the Biological and Chemical domain under our Preparedness Framework, and activated the associated safeguards. This is a precautionary approach, and we detail our safeguards in the system card. We outlined our approach on
93
139
1K
@paularambles
“paula”
3 months
“you better not be spreading yourself too thin” me:
9
516
4K
@andrewwhite01
Andrew White 🐦‍⬛
4 months
FutureHouse's goal has been to automate scientific discovery. Now we used our agents to make a genuine discovery – a new treatment for one kind of blindness (dAMD). We had multiple cycles of hypotheses, experiments, and data analysis – including identifying the mechanism of action.
8
38
270
@_lamaahmad
Lama Ahmad لمى احمد
4 months
Proud of our team for pushing forward transparency with this initial step!
@OpenAI
OpenAI
4 months
Introducing the Safety Evaluations Hub—a resource to explore safety results for our models. While system cards share safety metrics at launch, the Hub will be updated periodically as part of our efforts to communicate proactively about safety. https://t.co/c8NgmXlC2Y
0
1
15
@_lamaahmad
Lama Ahmad لمى احمد
4 months
The pinnacle of safety <> benefits <> expert input. Congrats @thekaransinghal @rahularoradfs and the whole team who made this happen!
@OpenAI
OpenAI
4 months
Evaluations are essential to understanding how models perform in health settings. HealthBench is a new evaluation benchmark, developed with input from 250+ physicians from around the world, now available in our GitHub repository. https://t.co/s7tUTUu5d3
0
0
1
@OpenAI
OpenAI
4 months
Evaluations are essential to understanding how models perform in health settings. HealthBench is a new evaluation benchmark, developed with input from 250+ physicians from around the world, now available in our GitHub repository. https://t.co/s7tUTUu5d3
openai.com
HealthBench is a new evaluation benchmark for AI in healthcare which evaluates models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model...
179
486
4K
@kjfeng_
Kevin Feng
5 months
The First Workshop on Sociotechnical AI Governance at CHI 2025 (STAIG@CHI’25) is less than a week away! We’re super excited to hold a panel with some amazing speakers to discuss why a sociotechnical approach to AI governance is so important. Join us in Yokohama or online.
1
4
32
@_lamaahmad
Lama Ahmad لمى احمد
5 months
Launches are only one moment in time for safety efforts, but they provide a glimpse into the culmination of so much foundational work on evaluations and mitigations both inside and outside of OpenAI.
0
0
3
@_lamaahmad
Lama Ahmad لمى احمد
5 months
The system card shows not only the thought and care in advancing and prioritizing critical safety improvements, but also that it’s an ecosystem wide effort in working with external testers and third party assessors. https://t.co/W8NsGD5qXQ
openai.com
OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities—web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and...
@OpenAI
OpenAI
5 months
Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.
1
10
60