Newton Cheng
@newton_cheng
77 Followers · 23 Following · 0 Media · 13 Statuses
Frontier Red Team at @AnthropicAI | Physics PhD from @UCBerkeley
Joined March 2022
A huge amount of effort went into augmenting 4.5 for cyber use with our partners. We think 4.5 is now the best model for cyber defense.
Something you may not know about Sonnet 4.5: it’s a special model for cybersecurity. For the past few months, the Frontier Red Team has been researching how to make models more useful for defenders. We now think we’re at an inflection point. New post on Red:
Read a bit about what my team has been working on for the past few months! It's been a crazy time, and it only gets crazier from here.
My team at @AnthropicAI is hiring research engineers and scientists. We find out whether AI models possess critical, advanced capabilities and then help the world to prepare. We'd love to hear from you! https://t.co/GldHfm5pRY
robertheaton.com
I work at Anthropic on the Frontier Red Team. Our mission is to find out whether AI models possess critical, advanced capabilities, and to help the world to prepare. We’re hiring AI researchers and...
🔥 I'm hiring exceptional research scientists + engineers for the Frontier Red Team at @AnthropicAI. AGI is a national security issue. We should push models to their limits and get an extra 1-2 year advantage. Links below.
Proud to announce that for much of this year Anthropic has been working with the National Nuclear Security Administration (NNSA, part of DOE) to test out whether LLMs like Claude know about dangerous things relating to nuclear weapons.
axios.com
The company says it believes the red-team exercise is the first time a frontier model has been used in top-secret work.
I'm looking forward to attending DEF CON with the @anthropicai team. We'll be hosting a happy hour on August 9 to meet the community.
luma.com
We're hosting an Anthropic happy hour at DEF CON this year! Join us to connect with other AI enthusiasts, and meet members of Anthropic's security and…
Evals are critical for measuring AI capabilities + safety. If you're building evals, I'd like you to apply for our support. Here are some we wish existed. https://t.co/bx32hW2VuH
anthropic.com
A robust, third-party evaluation ecosystem is essential for assessing AI capabilities and risks, but the current evaluations landscape is limited. Developing high-quality, safety-relevant evaluations...
Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: https://t.co/uLbS2JMEK9
I’m hiring ambitious Research Scientists at @AnthropicAI to measure and prepare for models acting autonomously in the world. This is one of the most novel and difficult capabilities to measure, and critical for safety. Join the Frontier Red Team at Anthropic:
We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)