Jeffrey Ladish

@JeffLadish

Followers: 14K · Following: 25K · Media: 311 · Statuses: 12K

Applying the security mindset to everything @PalisadeAI

San Francisco, CA
Joined March 2013
@JeffLadish
Jeffrey Ladish
2 years
I think the AI situation is pretty dire right now. And at the same time, I feel pretty motivated to pull together and go out there and fight for a good world / galaxy / universe. @So8res has a great post called "detach the grim-o-meter", where he recommends not feeling obligated.
32
61
623
@JeffLadish
Jeffrey Ladish
14 hours
I agree with this take. I don’t think it will be sufficient but 1) these models are being deployed to a billion+ people so the direct impact is huge and 2) we will learn stuff in the process of trying to train them to be good people.
@AmandaAskell
Amanda Askell
14 hours
"Just train the AI models to be good people" might not be sufficient when it comes to more powerful models, but it sure is a dumb step to skip.
4
1
47
@JeffLadish
Jeffrey Ladish
5 days
One of the main lines I’m tracking.
@METR_Evals
METR
5 days
In measurements using our set of multi-step software and reasoning tasks, Claude 4 Opus and Sonnet reach 50%-time-horizon point estimates of about 80 and 65 minutes, respectively.
[image]
2
0
26
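For context on the METR figure quoted above, here is a minimal sketch (not METR's exact methodology) of how a 50%-time-horizon estimate can be derived: fit a logistic curve of model success against the log of how long each task takes a human expert, then solve for the task length at which predicted success is 50%. All task data below is made up for illustration.

```python
# Minimal sketch, assuming a logistic model of success vs. log task length.
# The (minutes, succeeded) pairs are hypothetical, not real eval results.
import numpy as np
from sklearn.linear_model import LogisticRegression

# (human_minutes_to_complete, model_succeeded)
tasks = [
    (2, 1), (5, 1), (10, 1), (15, 1), (30, 1), (45, 0),
    (60, 1), (90, 0), (120, 0), (240, 0), (480, 0),
]
X = np.log2([[minutes] for minutes, _ in tasks])   # shape (n_tasks, 1)
y = np.array([success for _, success in tasks])

clf = LogisticRegression().fit(X, y)

# P(success) = sigmoid(w * log2(minutes) + b); the 50% point is where the
# linear term crosses zero, i.e. log2(minutes) = -b / w.
w, b = clf.coef_[0][0], clf.intercept_[0]
horizon_minutes = 2 ** (-b / w)
print(f"Estimated 50% time horizon: {horizon_minutes:.0f} minutes")
```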
@JeffLadish
Jeffrey Ladish
16 days
tired: waiting for my coding agent to fix its mistakes. wired: claude delegate that task to a subagent and generate some design ideas for the signup flow, also please get codex to stop stashing agent state backups in random s3 buckets.
@binarybits
Timothy B. Lee
16 days
Progress.
[image]
2
0
20
@JeffLadish
Jeffrey Ladish
16 days
I've been watching / listening to a lot more political TV and podcast commentary lately, from both the left and the right, because my work now involves tracking the political discourse and I want to build good models of it. And I have to say. I fucking HATE the political.
9
1
68
@JeffLadish
Jeffrey Ladish
16 days
Oh I left out the best protection of all: enable two-factor authentication on all your important accounts! If you haven't already done that, do that now. This will protect you in case your passwords were in this leak and from more significant breaches in the future.
0
0
15
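As a side note on how the second factor recommended above typically works: a minimal sketch using the third-party pyotp library, assuming a standard TOTP (RFC 6238) setup like the one most authenticator apps implement. The secret here is generated on the spot purely for illustration.

```python
# TOTP sketch (RFC 6238): the service and your authenticator app share a
# secret, and both derive a short-lived 6-digit code from it, so a stolen
# password alone isn't enough to log in.
import pyotp

secret = pyotp.random_base32()          # shared once at enrollment (e.g. via QR code)
totp = pyotp.TOTP(secret)

code = totp.now()                       # what the authenticator app displays
print("Current code:", code)
print("Verifies:", totp.verify(code))   # what the server checks at login
```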
@JeffLadish
Jeffrey Ladish
16 days
If you use a password manager, keep your system and browser up to date, and haven't run any malware or malicious plugins, you probably don't need to change your passwords. This isn't a breach of any of these companies; it's a leak from scammers who stole passwords via malware.
@unusual_whales
unusual_whales
17 days
BREAKING: 16 billion Apple, $AAPL, Facebook, $META, Google, $GOOGL, and other passwords leaked, per Forbes.
4
1
64
@JeffLadish
Jeffrey Ladish
17 days
This is pretty good! 🐦‍⬛
@KeiranJHarris
Keiran Harris
18 days
The last time intelligence exploded on Earth, it wasn’t exactly amazing for everyone else. Here’s a fable about risks from transformative AI (made with Veo 3)
3
3
28
@JeffLadish
Jeffrey Ladish
17 days
A lot more people are starting to understand that superintelligence is on the horizon and that it poses a serious risk of human extinction. This gives me hope that coordination is possible!
@m_bourgon
Malo Bourgon
17 days
My favorite reaction I’ve gotten when sharing some of the blurbs we’ve recently received for Eliezer and Nate’s forthcoming book: If Anyone Builds It, Everyone Dies. From someone who works on AI policy in DC:
[image]
6
3
73
@JeffLadish
Jeffrey Ladish
17 days
RT @thlarsen: Lots of people in AI, and especially AI policy, seem to think that aligning superintelligence is the most important issue of…
0
63
0
@JeffLadish
Jeffrey Ladish
19 days
RT @hitRECordJoe: Debates over AI would be more productive if we could stop over-simplifying. AI is not all bad, and it's not all good. Jus…
0
56
0
@JeffLadish
Jeffrey Ladish
22 days
RT @Grimezsz: Long story short I recommend the new book by Nate and Eliezer. I feel like the main thing I ever get cancelled / in trouble…
0
88
0
@JeffLadish
Jeffrey Ladish
25 days
[image]
0
677
0
@JeffLadish
Jeffrey Ladish
25 days
Great post by Lawrence Chan on the Illusion of Thinking paper!
@justanotherlaw
Lawrence Chan
25 days
@JeffLadish @GaryMarcus It wasn't out tomorrow, but it's out now!
1
0
7
@JeffLadish
Jeffrey Ladish
27 days
I'm not against testing on logic puzzles. You can learn interesting stuff from that. But you have to be careful when generalizing from puzzles to real tasks. And you can still do human expert vs. model comparisons on these types of problems!
0
0
10
@JeffLadish
Jeffrey Ladish
27 days
The solution is testing on more complicated problems involving more steps, while doing human expert vs. model comparisons. But I think logic puzzles are a lot less interesting than real-world problems. The latter are what's actually relevant to the impacts and risks of AI.
1
0
6
@JeffLadish
Jeffrey Ladish
27 days
People correctly point out that most benchmarks aren't that informative because they focus on short time horizon problems that models excel at (CTFs have some of this problem too).
1
0
4
@JeffLadish
Jeffrey Ladish
27 days
This is why @METR_Evals results are so interesting. They do head-to-head comparisons of models vs. experts. This is also what we do with our hacking competitions:
@PalisadeAI
Palisade Research
1 month
🦾AI outperforms 90% of human teams in a recent hacking competition with 18,000 participants.
1
0
9
@JeffLadish
Jeffrey Ladish
27 days
There's a relatively easy solution to all these problems: give the same tests to human experts! Give the models and the experts access to the same tools. If a human can't do it without a calculator or python, why is it interesting that a model can't either?
@RyanPGreenblatt
Ryan Greenblatt
28 days
This paper doesn't show fundamental limitations of LLMs:
- The "higher complexity" problems require more reasoning than fits in the context length (humans would also take too long).
- Humans would also make errors in the cases where the problem is doable in the context length.
- …
6
4
88
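To make the context-length point concrete, a back-of-the-envelope sketch with my own assumed numbers (not figures from the paper or the thread): if I recall correctly, the Illusion of Thinking puzzles include Tower of Hanoi, where an n-disk instance requires 2^n − 1 moves, so writing out a full solution outgrows any fixed output budget exponentially, regardless of reasoning ability. The tokens-per-move and budget constants below are assumptions for illustration.

```python
# Assumed numbers: show how transcribing every Tower of Hanoi move
# (2**n - 1 moves for n disks) quickly exceeds a fixed output budget.
TOKENS_PER_MOVE = 10      # rough assumption
OUTPUT_BUDGET = 64_000    # rough assumption for output tokens

for n_disks in (8, 10, 12, 15, 20):
    moves = 2 ** n_disks - 1
    tokens = moves * TOKENS_PER_MOVE
    status = "fits" if tokens <= OUTPUT_BUDGET else "exceeds budget"
    print(f"{n_disks:2d} disks: {moves:>9,} moves ~ {tokens:>10,} tokens ({status})")
```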
@JeffLadish
Jeffrey Ladish
27 days
I would love for it to be true that LLMs have fundamental limitations which prevent anything based on that architecture from being able to achieve recursive self-improvement or self-exfiltration! We badly need more time. But I'm not seeing any fundamental limitations yet.
1
0
12