
Daniela Amodei
@DanielaAmodei
10K Followers · 255 Following · 1 Media · 29 Statuses
President @AnthropicAI. Formerly @OpenAI, @Stripe, congressional staffer, global development
San Francisco, CA
Joined September 2011
Introducing Claude 2! Our latest model has improved performance in coding, math and reasoning. It can produce longer responses, and is available on a new public-facing beta website at https://t.co/uLbS2JNczH in the US and UK.
163
507
2K
Neural networks often pack many unrelated concepts into a single neuron – a puzzling phenomenon known as 'polysemanticity', which makes interpretability much more challenging. In our latest work, we build toy models where the origins of polysemanticity can be fully understood.
54
637
4K
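The toy models referenced above study how a layer can store more sparse features than it has neurons. A minimal sketch of that kind of setup, assuming a linear encoder with a tied-weight ReLU decoder trained to reconstruct sparse feature vectors; the dimensions and variable names here are illustrative, not taken from the paper's code:

```python
# Minimal sketch of a superposition toy model: n_features sparse inputs are
# compressed into n_hidden < n_features dimensions and reconstructed with a
# tied-weight ReLU decoder. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

n_features, n_hidden = 20, 5              # more features than neurons
W = rng.normal(size=(n_hidden, n_features)) * 0.3
b = np.zeros(n_features)

def forward(x):
    h = W @ x                             # compress: features -> hidden neurons
    return np.maximum(0.0, W.T @ h + b)   # reconstruct with tied weights + ReLU

# Sparse input: only a few features active at once, which is what makes
# packing several features into one neuron (polysemanticity) viable.
x = np.zeros(n_features)
x[[2, 11]] = 1.0
print(forward(x).round(2))
```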
In "Language Models (Mostly) Know What They Know", we show that language models can evaluate whether what they say is true, and predict ahead of time whether they'll be able to answer questions correctly. https://t.co/ZmZJqycb39
17
156
914
Transformer MLP neurons are challenging to understand. We find that using a different activation function (Softmax Linear Units or SoLU) increases the fraction of neurons that appear to respond to understandable features without any performance penalty. https://t.co/5ew6iWHYtl
10
70
373
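For reference, the SoLU activation above has a simple closed form, SoLU(x) = x * softmax(x); a minimal NumPy sketch (the paper also places a LayerNorm after the activation, which is omitted here):

```python
# Minimal sketch of the SoLU (Softmax Linear Units) activation: each
# pre-activation is rescaled by the softmax over its layer, so a few large
# activations dominate while the rest are suppressed.
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)     # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def solu(x, axis=-1):
    return x * softmax(x, axis=axis)

pre_acts = np.array([1.0, -0.5, 3.0, 0.2])
print(solu(pre_acts).round(3))   # the largest entry is emphasized, others damped
```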
In a new paper, we show that repeating only a small fraction of the data used to train a language model (albeit many times) can damage performance significantly, and we observe a "double descent" phenomenon associated with this. https://t.co/kkUdDSRJTI
6
41
324
I’m looking forward to what’s to come. And we’re hiring!
anthropic.com
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
3
1
20
As well as steerability and robustness - https://t.co/gVvzRHUxfS - reinforcement learning - https://t.co/LDMFcSOYRV - societal impacts - https://t.co/QwJQzaboBK - and more!
arxiv.org
Large-scale pre-training has recently emerged as a technique for creating capable, general purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we...
1
0
17
This includes work on interpretability - https://t.co/wDhLbDY6Vi and https://t.co/yCiURyUZ8d - and some interpretability resources like Garcon https://t.co/9MNd6adVAI and PySvelte.
github.com
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations - anthropics/PySvelte
1
0
6
I’m so proud of the amazing team we’ve assembled at Anthropic and the research we’ve done to date.
1
0
9
Excited to announce our latest fundraising round! We’re genuinely honored to be entrusted with the resources to continue our work in frontier AI safety and research.
We’ve raised $580 million in a Series B. This will help us further develop our research to build usable, reliable AI systems. Find out more:
8
5
82
Glad @QuantaMagazine highlights progress on induction heads/rigorous interpretability by @ch402, @catherineols, @nelhage and others @AnthropicAI. More to come! https://t.co/CJXjQlKbeo
quantamagazine.org
Language processing programs are notoriously hard to interpret, but smaller versions can provide important insights into how they work.
0
11
75
We've trained a natural language assistant to be more helpful and harmless by using reinforcement learning with human feedback (RLHF). https://t.co/1lwcEASXvA
3
51
268
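RLHF as described above depends on a learned preference (reward) model. A common way to train one, sketched here under the assumption of a standard pairwise comparison loss rather than Anthropic's exact setup, is to push the score of the human-preferred response above the rejected one:

```python
# Illustrative pairwise preference-model loss often used in RLHF pipelines:
# the reward model should score the human-preferred response above the
# rejected one, via -log(sigmoid(r_chosen - r_rejected)). Generic sketch,
# not Anthropic's implementation.
import numpy as np

def preference_loss(r_chosen, r_rejected):
    diff = np.asarray(r_chosen) - np.asarray(r_rejected)
    return float(np.mean(np.log1p(np.exp(-diff))))   # -log(sigmoid(diff))

# Toy scores from a hypothetical reward model for three comparison pairs.
print(preference_loss([1.2, 0.4, 2.0], [0.3, 0.9, -0.5]))
```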
On the @FLIxrisk podcast, we discuss AI research, AI safety, and what it was like starting Anthropic during COVID.
futureoflife.org
Daniela and Dario Amodei join the FLI Podcast to discuss Anthropic, a new company building safe, reliable, interpretable, and steerable AI systems.
3
9
48
In our second interpretability paper, we revisit “induction heads”. In 2+ layer transformers, these pattern-completion heads form exactly when in-context learning abruptly improves. Are they responsible for most in-context learning in large transformers? https://t.co/28WWkMrjQm
1
57
307
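An induction head implements a "prefix matching then copy" rule: when the current token appeared earlier in the context, attend to the token that followed that earlier occurrence and predict it again. A toy illustration of that input/output behavior (not the attention mechanics), with a helper name of our own choosing:

```python
# Toy illustration of induction-head behavior: find the previous occurrence of
# the current token and predict whatever followed it ([A][B] ... [A] -> [B]).
# This mimics the pattern-completion rule only, not the attention circuit that
# actually implements it in a transformer.
def induction_prediction(tokens):
    current = tokens[-1]
    # Scan earlier positions (most recent first) for a previous occurrence.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]   # copy the token that followed the earlier match
    return None                    # no earlier occurrence: no induction prediction

print(induction_prediction(["The", "cat", "sat", ".", "The"]))  # -> "cat"
```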
Our first societal impacts paper explores the technical traits of large generative models and the motivations and challenges people face in building and deploying them:
arxiv.org
Large-scale pre-training has recently emerged as a technique for creating capable, general purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we...
2
33
148
Our first interpretability paper develops a mathematical framework for reverse engineering transformer language models: A Mathematical Framework for Transformer Circuits.
3
115
610
Our first AI alignment paper, focused on simple baselines and investigations: A General Language Assistant as a Laboratory for Alignment
arxiv.org
Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful,...
5
60
323
We’re going to be focused on pushing forward our research for the next few months and are hoping to have more to share later this year. Thrilled to be working with so many talented colleagues!
6
1
26
Excited to announce what we’ve been working on this year - @AnthropicAI, an AI safety and research company. If you’d like to help us combine safety research with scaling ML models while thinking about societal impacts, check out our careers page
11
26
197