Dan Hendrycks (@DanHendrycks)
41K Followers · 2K Following · 251 Media · 1K Statuses

• Center for AI Safety Director • xAI and Scale AI advisor • GELU/MMLU/MATH/HLE • PhD in AI • Analyzing AI models, companies, policies, and geopolitics

San Francisco · Joined August 2009
Dan Hendrycks (@DanHendrycks) · 4 months
Superintelligence is destabilizing. If China were on the cusp of building it first, Russia or the US would not sit idly by—they'd potentially threaten cyberattacks to deter its creation. @ericschmidt @alexandr_wang and I propose a new strategy for superintelligence. 🧵
77 · 135 · 739

Dan Hendrycks (@DanHendrycks) · 5 days
RT @jim_mitre: In a new paper about AGI and preventive war, @RANDCorporation colleagues argue that the probability of war is low in absolut…
0 · 12 · 0

Dan Hendrycks (@DanHendrycks) · 18 days
The moment I remember most from this series is…
1 · 0 · 5

Dan Hendrycks (@DanHendrycks) · 18 days
War and Peace in the Nuclear Age is a good documentary series: I recommend all the episodes with >=3.8K views.
10 · 1 · 19

Dan Hendrycks (@DanHendrycks) · 20 days
That said, they are still worse than humans when the puzzles are represented pictorially (image below). The paper: "Does Spatial Cognition Emerge in Frontier Models?" (ICLR 2025). Thanks to @longphan3110 for running the evaluation.
[image: results on puzzles represented pictorially]
2 · 4 · 86

Dan Hendrycks (@DanHendrycks) · 20 days
Apple recently published a paper showing that current AI systems lack the ability to solve puzzles that are easy for humans. Humans: 92.7%. GPT-4o: 69.9%. However, they didn't evaluate any recent reasoning models. If they had, they'd find that o3 gets 96.5%, beating humans.
42 · 105 · 953
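As an aside, here is a minimal sketch of the exact-match pass-rate comparison those percentages describe, assuming a hypothetical JSONL benchmark file and an `ask_model` API helper (both are stand-ins for illustration, not the paper's actual harness):

```python
"""Hypothetical pass-rate comparison; `ask_model(model, question)` is a
stand-in for whatever API client you use, not the paper's harness."""

import json

def load_puzzles(path: str) -> list[dict]:
    """Each JSONL line: {"question": ..., "solution": ...} (assumed format)."""
    with open(path) as f:
        return [json.loads(line) for line in f]

def pass_rate(model: str, puzzles: list[dict], ask_model) -> float:
    """Fraction of puzzles the model answers exactly correctly."""
    correct = sum(
        ask_model(model, p["question"]).strip() == p["solution"].strip()
        for p in puzzles
    )
    return correct / len(puzzles)

# puzzles = load_puzzles("puzzles.jsonl")           # hypothetical file
# print(pass_rate("gpt-4o", puzzles, ask_model))    # ~0.699 reported above
# print(pass_rate("o3", puzzles, ask_model))        # ~0.965 reported above
```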
Dan Hendrycks (@DanHendrycks) · 23 days
Some pointers for the topics that are more helpful:
Deep learning:
Law: (week 1 is most helpful)
Safety engineering:
Complex systems:
Game theory:
Geopolitics:
3 · 1 · 54

Dan Hendrycks (@DanHendrycks) · 23 days
Many fields seem useful for thinking about frontier AI strategically, but most have little to contribute. Surprisingly unhelpful:
* classic machine learning (e.g., SVMs, PGMs),
* statistics,
* machine learning theory,
* algorithms,
* optimization and control theory,
* …
36 · 19 · 349

Dan Hendrycks (@DanHendrycks) · 1 month
Does AI deterrence require precise redlines? Nuclear, cyber, and criminal deterrence often have intentional ambiguity. The U.S. maintains a policy of strategic ambiguity on nuclear strikes, keeping open the option of a first strike for undefined conditions. Likewise, the U.S. …
7 · 5 · 59

Dan Hendrycks (@DanHendrycks) · 1 month
0 · 4 · 20

Dan Hendrycks (@DanHendrycks) · 1 month
This is a strawman. We don't use the phrase "AGI" in the MAIM paper (Superintelligence Strategy). In fact, in the appendix we discuss how the concept of AGI is too vague to be useful. We make it clear that the first thing we want to deter is an intelligence recursion: thousands…
Quoting Lennart Heim (@ohlennart) · 4 months
The idea of a clear "AGI threshold" for preventive actions (MAIM paper) misses a challenge: we'll never agree when something becomes "superintelligent" or AGI. @ylecun will say, "It lacks autonomy!" while @GaryMarcus declares it's hitting a wall the next day. Some thoughts 1/
12 · 5 · 120

Dan Hendrycks (@DanHendrycks) · 1 month
Examples of international AI redlines:
1. Intelligence explosion redline. AIs might be able to improve AIs all by themselves in the next few years. The US and China should not want anybody to attempt an intelligence explosion where thousands of AIs are autonomously and rapidly…
7 · 8 · 113

Dan Hendrycks (@DanHendrycks) · 1 month
This depends on the ability to make AIs act as fiduciaries, which requires that AIs be reliably alignable. However, gradual disempowerment is usually posed as a problem that persists even if AIs are alignable.
1 · 0 · 12

Dan Hendrycks (@DanHendrycks) · 1 month
In Superintelligence Strategy, I made the point that (1) having no unleashed AIs, (2) requiring that AIs act as fiduciaries, and (3) giving AIs forecasting abilities can help avoid erosion of control, evolutionary pressures, and gradual disempowerment.
2 · 0 · 16

Dan Hendrycks (@DanHendrycks) · 1 month
We can prevent gradual disempowerment by AI. We delegate to doctors and lawyers yet can stay in charge because they must earn our informed consent. Requiring AIs to obtain informed consent and have foresight into long-term consequences helps ensure human control isn't eroded.
Quoting David Duvenaud (@DavidDuvenaud) · 1 month
What to do about gradual disempowerment? We laid out a research agenda with all the concrete and feasible research projects we can think of: 🧵 With @raymondadouglas, @jankulveit, and @DavidSKrueger.
10 · 6 · 77
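A toy sketch of the informed-consent idea above: a delegated action runs only after the AI states its forecast of long-term consequences and a human signs off, the way a lawyer's client stays in charge. All names here (`ConsentGate`, `approve`, the example action) are illustrative assumptions, not anything from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ConsentGate:
    """Illustrative gate: delegated actions run only with informed consent."""
    log: list = field(default_factory=list)  # audit trail of decisions

    def execute(self, action, forecast: str, approve) -> bool:
        """`forecast`: the AI's stated long-term consequences of `action`.
        `approve`: a human decision function given that disclosure."""
        disclosure = f"Proposed: {action.__name__}\nForecast: {forecast}"
        if not approve(disclosure):
            self.log.append((action.__name__, "declined"))
            return False
        action()  # runs only after explicit human sign-off
        self.log.append((action.__name__, "approved"))
        return True

# Hypothetical usage:
# gate = ConsentGate()
# gate.execute(rebalance_portfolio,
#              forecast="shifts 30% of assets; reduces your day-to-day role",
#              approve=lambda msg: input(msg + "\nProceed? [y/N] ") == "y")
```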
Dan Hendrycks (@DanHendrycks) · 1 month
This riffs on Dennett's Four Competences and Murray Gell-Mann's multilevel analysis of adaptation and deception.
0 · 0 · 18

Dan Hendrycks (@DanHendrycks) · 1 month
Levels of AI control: infrastructure, behavior, cognition, and institutions.
Infrastructure: datacenter/chip off switch, thinking time allotment, tools an agent has access to, agent sandboxing, input and output filters.
Behavior: input-output behavior shaped by reinforcement or…
8 · 7 · 111
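A minimal sketch of the infrastructure level listed above: a tool allowlist (sandboxing), a thinking-time budget (off switch), and an output filter, all sitting outside the model. The `run_agent_step` callable, its return format, and every constant are assumptions for illustration, not a real agent API:

```python
import time

ALLOWED_TOOLS = {"search", "calculator"}          # tools the agent may invoke
MAX_WALL_CLOCK = 30.0                             # thinking-time allotment, seconds
BLOCKLIST = ("rm -rf", "BEGIN PRIVATE KEY")       # crude output filter

def sandboxed_run(run_agent_step, task: str) -> str:
    """Run an agent loop under infrastructure-level controls.

    `run_agent_step(state)` is a hypothetical callable returning a dict:
    {"tool": str | None, "done": bool, "output": str, "state": str}.
    """
    start = time.monotonic()
    state = task
    while time.monotonic() - start < MAX_WALL_CLOCK:  # budget acts as off switch
        step = run_agent_step(state)
        if step.get("tool") and step["tool"] not in ALLOWED_TOOLS:
            return "[blocked: disallowed tool]"        # agent sandboxing
        if step.get("done"):
            out = step["output"]
            if any(b in out for b in BLOCKLIST):       # output filtering
                return "[blocked: filtered output]"
            return out
        state = step["state"]
    return "[stopped: time allotment exhausted]"
```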
Dan Hendrycks (@DanHendrycks) · 2 months
Related post:
1 · 0 · 5

Dan Hendrycks (@DanHendrycks) · 2 months
A presentation on evaluating AI models' general capabilities and dual-use capabilities.
3 · 4 · 29

Dan Hendrycks (@DanHendrycks) · 2 months
Alternative link:
3 · 0 · 17

Dan Hendrycks (@DanHendrycks) · 2 months
I wrote about why efforts to understand the inner workings of AI keep falling short.
Quoting AI Frontiers (@aif_media) · 2 months
32 · 60 · 342