Anthropic
@AnthropicAI
We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
Joined January 2021
This research was led by @_igorshilov as part of the Anthropic Fellows Program. https://t.co/O83ndSIXcz
New Anthropic research! We study how to train models so that high-risk capabilities live in a small, separate set of parameters, allowing clean capability removal when needed – for example in CBRN or cybersecurity domains.
Read the full paper on SGTM here: https://t.co/Zfg2tjX7hD For reproducibility, we’ve also made the relevant code available on GitHub: https://t.co/zRmJYy6bDE.
github.com
Training Transformers with knowledge localization (SGTM) - safety-research/selective-gradient-masking
The study had limitations: it was performed in a simplified setup with small models and proxy evaluations rather than standard benchmarks. Also, as with data filtering, SGTM doesn’t stop in-context attacks where an adversary supplies the information themselves.
Unlike unlearning methods applied after training is complete, SGTM is hard to undo: it takes 7× more fine-tuning steps to recover forgotten knowledge with SGTM than with RMU, a previous unlearning method.
Controlling for general capabilities, models trained with SGTM perform worse on the undesired “forget” subset of knowledge than those trained with data filtering.
In our study, we tested whether SGTM could remove biology knowledge from models trained on Wikipedia. Data filtering alone could let relevant information leak through, since non-biology Wikipedia pages might still contain biology content.
SGTM splits the model’s weights into “retain” and “forget” subsets, and guides specific knowledge into the “forget” subset during pretraining. The forget subset can then be removed before deployment in high-risk settings. Read more: https://t.co/BfR4Kd86b0
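For intuition, here is a minimal PyTorch-style sketch of what a selective-gradient-masking step could look like. The function names, the fixed per-tensor boolean masks, and ablation-by-zeroing are illustrative assumptions, not the paper’s exact method (see the linked paper and repo for that):

```python
import torch

def sgtm_step(model, batch, loss_fn, optimizer, forget_masks, batch_is_forget):
    # Hypothetical SGTM-style step. forget_masks maps parameter names to
    # boolean tensors that are True where an entry belongs to the "forget"
    # subset (the partition is assumed fixed before training begins).
    optimizer.zero_grad()
    loss = loss_fn(model(batch["inputs"]), batch["targets"])
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            mask = forget_masks[name]
            # Forget-domain batches update only forget parameters;
            # all other batches update only retain parameters.
            param.grad.mul_(mask if batch_is_forget else ~mask)
    optimizer.step()

def ablate_forget_subset(model, forget_masks):
    # Assumed removal step: zero out the forget subset before deployment.
    with torch.no_grad():
        for name, param in model.named_parameters():
            param.mul_(~forget_masks[name])
```

Under these assumptions, retain-domain gradients never touch the forget parameters, so zeroing that subset before deployment should remove the targeted knowledge while leaving general capabilities largely intact.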
New research from the Anthropic Fellows Program: Selective GradienT Masking (SGTM). We study how to train models so that high-risk knowledge (e.g. about dangerous weapons) is isolated in a small, separate set of parameters that can be removed without broadly affecting the model.
Anthropic is donating the Model Context Protocol to the Agentic AI Foundation, a directed fund under the Linux Foundation. In one year, MCP has become a foundational protocol for agentic AI. Joining AAIF ensures MCP remains open and community-driven.
We’re expanding our partnership with @Accenture to help enterprises move from AI pilots to production. The Accenture Anthropic Business Group will include 30,000 professionals trained on Claude, and a product to help CIOs scale Claude Code. Read more:
In her first Ask Me Anything, @amandaaskell answers your philosophical questions about AI, discussing morality, identity, consciousness, and more. Timestamps:
0:00 Introduction
0:29 Why is there a philosopher at an AI company?
1:24 Are philosophers taking AI seriously?
3:00
Anthropic Interviewer will allow us to conduct more research, more often, on a variety of topics. We plan to use it for regular studies on the evolving relationship between humans and AI. Full results here:
anthropic.com
What 1,250 professionals told us about working with AI
We also looked at the intensity of the most common emotions expressed in interviews. Across the general workforce, we found extremely consistent patterns of high satisfaction, but also frustration with implementing AI.
We visualized patterns across topics. Most workers felt optimistic about the role of AI in work: on productivity, communication, and how they're adapting to a future in which AI is more integrated. But some topics, like reliability, gave them pause.
Creatives are anxious about job security. Many use AI for productivity, but face stigma for doing so, and sometimes hide it. Scientists want AI research partners, but currently confine their use to tasks like writing manuscripts or debugging code.
The general workforce wants to delegate routine work to AI, but preserve the tasks central to their professional identity. As one pastor told us: “If I use AI and up my skills with it, it can save me so much time on the admin side, which will free me up to be with the people.”
We tested it by asking 1,250 professionals about their views on work and AI. Our largest sample was from the general workforce. We also recruited subgroups of creatives and scientists—domains where AI's role is contested and rapidly evolving. https://t.co/q2OG268GH9
Give Anthropic Interviewer a research goal, and it drafts research questions, conducts interviews, and analyzes responses in collaboration with a human researcher.
We’re launching Anthropic Interviewer, a new tool to help us understand people’s perspectives on AI. It’s now available at https://t.co/W8P36sPQBy for a week-long pilot.
claude.ai
Talk with Claude, an AI assistant from Anthropic
Anthropic CEO Dario Amodei spoke today at the New York Times DealBook Summit. "We're building a growing and singular capability that has singular national security implications, and democracies need to get there first."