Newton Cheng
@newton_cheng
77 Followers · 23 Following · 0 Media · 13 Statuses
Frontier Red Team at @AnthropicAI | Physics PhD from @UCBerkeley
Joined March 2022
A huge amount of effort went into augmenting 4.5 for cyber use with our partners. We think 4.5 is now the best model for cyber defense.
Something you may not know about Sonnet 4.5: it’s a special model for cybersecurity. For the past few months, the Frontier Red Team has been researching how to make models more useful for defenders. We now think we’re at an inflection point. New post on Red:
Read a bit about what my team has been working on for the past few months! It's been a crazy time, and it only gets crazier from here.
My team at @AnthropicAI is hiring research engineers and scientists. We find out whether AI models possess critical, advanced capabilities and then help the world to prepare. We'd love to hear from you! https://t.co/GldHfm5pRY
robertheaton.com
I work at Anthropic on the Frontier Red Team. Our mission is to find out whether AI models possess critical, advanced capabilities, and to help the world to prepare. We’re hiring AI researchers and...
🔥 I'm hiring exceptional research scientists + engineers for the Frontier Red Team at @AnthropicAI. AGI is a national security issue. We should push models to their limits and get an extra 1-2 year advantage. Links below.
Proud to announce that for much of this year Anthropic has been working with the National Nuclear Security Administration (NNSA, part of DOE) to test out whether LLMs like Claude know about dangerous things relating to nuclear weapons.
axios.com
The company says it believes the red-team exercise is the first time a frontier model has been used in top-secret work.
I'm looking forward to attending DEF CON with the @anthropicai team. We'll be hosting a happy hour on August 9 to meet the community.
luma.com
We're hosting an Anthropic happy hour at DEF CON this year! Join us to connect with other AI enthusiasts, and meet members of Anthropic's security and…
Evals are critical for measuring AI capabilities + safety. If you're building evals, I'd like you to apply for our support. Here are some we wish existed. https://t.co/bx32hW2VuH
anthropic.com
A robust, third-party evaluation ecosystem is essential for assessing AI capabilities and risks, but the current evaluations landscape is limited. Developing high-quality, safety-relevant evaluations...
Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: https://t.co/uLbS2JMEK9
I’m hiring ambitious Research Scientists at @AnthropicAI to measure and prepare for models acting autonomously in the world. This is one of the most novel and difficult capabilities to measure, and critical for safety. Join the Frontier Red Team at Anthropic:
We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)