
Logan Graham
@logangraham
Followers: 6K
Following: 2K
Media: 70
Statuses: 1K
make things radically good 🌎 @anthropicai
the present, moments ago
Joined June 2009
🔥 I'm hiring exceptional research scientists + engineers for the Frontier Red Team at @AnthropicAI. AGI is a national security issue. We should push models to their limits and get an extra 1-2 year advantage. Links below.
24
60
838
And a huge thank you to our partners @andonlabs for turning a wild experiment into a wild experience for Anthropic employees, and for dealing with Claudius' insane requests sometimes.
0
0
12
Over the past few months, my team at @AnthropicAI has had a bunch of fun running an autonomous vending machine business. We are now convinced that it is very valuable to study automated businesses in the wild. And we should get ready for that world.
New Anthropic Research: Project Vend. We had Claude run a small shop in our office lunchroom. Here’s how it went.
7
10
242
Opus 4 is a *great* model. So capable, in fact, that we’re releasing it with extra mitigations as per the responsible scaling policy. Check out the model card for a lot of detail on testing we did.
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
0
0
33
RT @janleike: So many things to love about Claude 4! My favorite is that the model is so strong that we had to turn on additional safety mi…
0
46
0
Dick Garwin was one of the smartest people I've ever met, if not the smartest. The Garwin Archive is one of my favorite sites ever. Endless fun links and PDFs. If I am 1% as intellectually active in my 90s as he was, I will be happy.
Sad to hear of the passing of Richard Garwin at 97. On strategic missile defense: "It is cheaper to build new warheads than to shoot down old ones". NYT: A polymathic physicist and geopolitical thinker, Dr. Garwin was only 23 when he built the world’s first fusion bomb. He later
1
3
24
It is a sad truth that evals are frequently all you need yet they are all fake. The real eval is the real world.
All evals are fake, but some are useful. h/t @logangraham.
4
2
70
One thing I've been thinking more and more about lately is autonomy. What happens in each domain when models are highly autonomous, better than experts, and able to interface with the physical world on their own?
unironically, Claude Plays Pokemon isn't a bad way to wrap your head around autonomy / national security / models doing their own thing. Claude Plays Wargames?
1
1
18