Anthropic

@AnthropicAI

Followers: 627K · Following: 1K · Media: 469 · Statuses: 1K

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.

Joined January 2021
@AnthropicAI
Anthropic
17 days
Today we're releasing Claude Opus 4.1, an upgrade to Claude Opus 4 on agentic tasks, real-world coding, and reasoning.
@AnthropicAI
Anthropic
11 hours
If you’re interested in joining us to work on these and related issues, you can apply for our Research Engineer/Scientist role on the Alignment Science team.
job-boards.greenhouse.io · San Francisco, CA
@AnthropicAI
Anthropic
11 hours
There’s plenty of work to be done to make the classifiers even more accurate and effective. In the future, they might even be able to remove data relevant to misalignment risks (scheming, deception, and so on), as well as CBRN risks.
@AnthropicAI
Anthropic
11 hours
One concern is that filtering CBRN data will reduce performance on other, harmless capabilities—especially science. But we found a setup where the classifier reduced CBRN accuracy by 33% beyond a random baseline with no particular effect on a range of other benign tasks.
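To make the comparison above concrete, one way to read "reduced CBRN accuracy by 33% beyond a random baseline" is as the fraction of above-chance accuracy that filtering removes. The sketch below uses that reading with invented numbers; the tweet does not specify Anthropic's exact metric, baselines, or benchmark scores, so treat this as an illustration only.

```python
# Illustrative arithmetic only: a possible reading of "reduced CBRN accuracy by
# 33% beyond a random baseline". The baseline and scores below are invented.

RANDOM_BASELINE = 0.25  # assumption: random guessing on 4-option multiple choice


def excess_accuracy(acc: float, baseline: float = RANDOM_BASELINE) -> float:
    """Accuracy above what random guessing would achieve."""
    return acc - baseline


def relative_reduction(unfiltered_acc: float, filtered_acc: float) -> float:
    """Fraction of the unfiltered model's above-baseline accuracy that
    training-data filtering removed."""
    return 1 - excess_accuracy(filtered_acc) / excess_accuracy(unfiltered_acc)


if __name__ == "__main__":
    # Invented numbers, not Anthropic's reported scores.
    print(f"CBRN:   {relative_reduction(0.55, 0.45):.0%} of above-chance accuracy removed")
    print(f"Benign: {relative_reduction(0.70, 0.69):.0%} of above-chance accuracy removed")
```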
@AnthropicAI
Anthropic
11 hours
We trained six different classifiers to detect and remove CBRN information from training data. The best and most efficient results were from a classifier that used a small model from the Claude 3 Sonnet series to flag the harmful data.
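As an illustration of the pipeline the tweet above describes, here is a minimal sketch of model-based pretraining-data filtering. Everything in it is an assumption for illustration: score_document, CLASSIFIER_PROMPT, and FLAG_THRESHOLD are hypothetical stand-ins for the small-model classifier, its prompt, and its cutoff, none of which are specified in the thread.

```python
# Minimal sketch of classifier-based pretraining-data filtering. The scorer is a
# stub; in the setup the thread describes, it would be backed by a small prompted
# model (e.g. from the Claude 3 Sonnet series). Names and threshold are hypothetical.
from typing import Iterable, Iterator

FLAG_THRESHOLD = 0.5  # hypothetical cutoff for "likely CBRN-relevant"

CLASSIFIER_PROMPT = (
    "Does the following document contain hazardous chemical, biological, "
    "radiological, or nuclear weapons information? Answer yes or no.\n\n{doc}"
)


def score_document(doc: str) -> float:
    """Return an estimated probability that the document is CBRN-hazardous.

    Stub: a real implementation would send CLASSIFIER_PROMPT.format(doc=doc)
    to a small model and convert its answer (or token probabilities) to a score.
    """
    return 0.0  # placeholder so the sketch runs end to end


def filter_corpus(docs: Iterable[str]) -> Iterator[str]:
    """Yield only the documents the classifier does not flag."""
    for doc in docs:
        if score_document(doc) < FLAG_THRESHOLD:
            yield doc


if __name__ == "__main__":
    corpus = ["an ordinary chemistry lecture", "a cooking recipe"]
    kept = list(filter_corpus(corpus))
    print(f"kept {len(kept)} of {len(corpus)} documents")
```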
@AnthropicAI
Anthropic
11 hours
The wealth of data used in AI training contains hazardous CBRN information. Developers usually train models not to use it. Here, we tried removing the information at the source, so even if models are jailbroken, the info isn't available. Read more:
@AnthropicAI
Anthropic
11 hours
New Anthropic research: filtering out dangerous information at pretraining. We’re experimenting with ways to remove information about chemical, biological, radiological and nuclear (CBRN) weapons from our models’ training data without affecting performance on harmless tasks.
@AnthropicAI
Anthropic
1 day
We’re also announcing a new Higher Education Advisory Board, which helps guide how Claude is used in teaching, learning, and research. Read more about the courses and the Board:
anthropic.com
@AnthropicAI
Anthropic
1 day
We’ve made three new AI fluency courses, co-created with educators, to help teachers and students build practical, responsible AI skills. They’re available for free to any institution.
@AnthropicAI
Anthropic
2 days
We don't need to choose between innovation and safety. With the right public-private partnerships, we can have both. We’re sharing our approach with @fmf_org members so any AI company can implement similar protections. Read more:
anthropic.com
Together with the NNSA and DOE national laboratories, we have co-developed a classifier—an AI system that automatically categorizes content—that distinguishes between concerning and benign nuclear-...
@AnthropicAI
Anthropic
2 days
This demonstrates what's possible when government expertise meets industry capability. NNSA understands nuclear risks better than any company could; we have the technical capacity to build the safeguards.
@AnthropicAI
Anthropic
2 days
NNSA shared their Nuclear Risk Indicators List, which distinguishes concerning from benign nuclear conversations. We used this to build a classifier: a system that automatically categorizes content.
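As a rough illustration of how an indicator list can drive a classifier, the sketch below assembles a labeling prompt from two invented indicator lists. The indicators, labels, and prompt wording are hypothetical; NNSA's actual Nuclear Risk Indicators List and the internals of Anthropic's classifier are not public in this thread.

```python
# Hypothetical sketch: turning a risk-indicator list into a classification prompt
# that a model could use to label conversations as CONCERNING or BENIGN.
# All indicator text below is invented for illustration.

RISK_INDICATORS = [
    "requests for specific weapons-design parameters",
    "attempts to source controlled nuclear materials",
]

BENIGN_INDICATORS = [
    "reactor-physics coursework and homework help",
    "medical isotope or radiotherapy questions",
    "energy-policy and nonproliferation discussion",
]


def build_classifier_prompt(conversation: str) -> str:
    """Assemble a prompt asking a model to label a conversation as
    CONCERNING or BENIGN, using the indicator lists as guidance."""
    concerning = "\n".join(f"- {i}" for i in RISK_INDICATORS)
    benign = "\n".join(f"- {i}" for i in BENIGN_INDICATORS)
    return (
        "Label the conversation below as CONCERNING or BENIGN.\n"
        f"Concerning indicators:\n{concerning}\n"
        f"Benign indicators:\n{benign}\n\n"
        f"Conversation:\n{conversation}\n\nLabel:"
    )


if __name__ == "__main__":
    print(build_classifier_prompt("How do nuclear power plants generate electricity?"))
```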
@AnthropicAI
Anthropic
2 days
Nuclear knowledge is dual-use. The same physics that powers reactors can enable weapons. We had to be precise—blocking harmful content without restricting nuclear engineering homework, medical treatments, or energy policy discussions.
@AnthropicAI
Anthropic
2 days
We partnered with @NNSANews to build first-of-their-kind nuclear weapons safeguards for AI. We've developed a classifier that detects nuclear weapons queries while preserving legitimate uses for students, doctors, and researchers.
@AnthropicAI
Anthropic
2 days
RT @claudeai: Claude Code is now available on Team and Enterprise plans. Flexible pricing lets you mix standard and premium Claude Code se….
@AnthropicAI
Anthropic
7 days
Join Anthropic interpretability researchers @thebasepoint, @mlpowered, and @Jack_W_Lindsey as they discuss looking into the mind of an AI model - and why it matters:
@AnthropicAI
Anthropic
7 days
The vast majority of users will never experience Claude ending a conversation, but if you do, we welcome feedback. Read more:
anthropic.com
An update on our exploratory research on model welfare
@AnthropicAI
Anthropic
7 days
This is an experimental feature, intended only for use by Claude as a last resort in extreme cases of persistently harmful and abusive conversations.
@AnthropicAI
Anthropic
7 days
As part of our exploratory work on potential model welfare, we recently gave Claude Opus 4 and 4.1 the ability to end a rare subset of conversations on
@AnthropicAI
Anthropic
8 days
A reminder that applications for our Anthropic Fellows program are due by this Sunday, August 17. Fellowships can start anytime from October to January. You can find more details, and the relevant application links, in the thread below.
@AnthropicAI
Anthropic
24 days
We’re running another round of the Anthropic Fellows program. If you're an engineer or researcher with a strong coding or technical background, you can apply to receive funding, compute, and mentorship from Anthropic, beginning this October. There'll be around 32 places.
@AnthropicAI
Anthropic
10 days
We discuss policy development, model training, testing and evaluation, real-time monitoring, enforcement, and more. Read the post:
anthropic.com