Dan Hendrycks (@DanHendrycks)
41K Followers · 2K Following · 251 Media · 1K Statuses

• Center for AI Safety Director • xAI and Scale AI advisor • GELU/MMLU/MATH/HLE • PhD in AI • Analyzing AI models, companies, policies, and geopolitics

San Francisco · Joined August 2009
Dan Hendrycks (@DanHendrycks) · 4 months
Superintelligence is destabilizing. If China were on the cusp of building it first, Russia or the US would not sit idly by—they'd potentially threaten cyberattacks to deter its creation. @ericschmidt @alexandr_wang and I propose a new strategy for superintelligence. 🧵
77 · 135 · 739

Dan Hendrycks (@DanHendrycks) · 5 days
RT @jim_mitre: In a new paper about AGI and preventive war, @RANDCorporation colleagues argue that the probability of war is low in absolut…
0 · 12 · 0

Dan Hendrycks (@DanHendrycks) · 18 days
The moment I remember most from this series is…
1 · 0 · 5

Dan Hendrycks (@DanHendrycks) · 18 days
War and Peace in the Nuclear Age is a good documentary series: I recommend all the episodes with >=3.8K views.
10 · 1 · 19

Dan Hendrycks (@DanHendrycks) · 20 days
That said, they are still worse than humans when the puzzles are represented pictorially (image below). The paper: "Does Spatial Cognition Emerge in Frontier Models?" (ICLR 2025). Thanks to @longphan3110 for running the evaluation.
[image: results on puzzles represented pictorially]
2 · 4 · 86

Dan Hendrycks (@DanHendrycks) · 20 days
Apple recently published a paper showing that current AI systems lack the ability to solve puzzles that are easy for humans. Humans: 92.7%. GPT-4o: 69.9%. However, they didn't evaluate any recent reasoning models. If they had, they'd find that o3 gets 96.5%, beating humans.
42 · 105 · 953
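As an aside, here is a minimal sketch of the exact-match pass-rate comparison those percentages describe, assuming a hypothetical JSONL benchmark file and an `ask_model` API helper (both are stand-ins for illustration, not the paper's actual harness):

```python
"""Hypothetical pass-rate comparison; `ask_model(model, question)` is a
stand-in for whatever API client you use, not the paper's harness."""

import json

def load_puzzles(path: str) -> list[dict]:
    """Each JSONL line: {"question": ..., "solution": ...} (assumed format)."""
    with open(path) as f:
        return [json.loads(line) for line in f]

def pass_rate(model: str, puzzles: list[dict], ask_model) -> float:
    """Fraction of puzzles the model answers exactly correctly."""
    correct = sum(
        ask_model(model, p["question"]).strip() == p["solution"].strip()
        for p in puzzles
    )
    return correct / len(puzzles)

# puzzles = load_puzzles("puzzles.jsonl")           # hypothetical file
# print(pass_rate("gpt-4o", puzzles, ask_model))    # ~0.699 reported above
# print(pass_rate("o3", puzzles, ask_model))        # ~0.965 reported above
```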
Dan Hendrycks (@DanHendrycks) · 23 days
Some pointers for the topics that are more helpful:
Deep learning:
Law: (week 1 is most helpful)
Safety engineering:
Complex systems:
Game theory:
Geopolitics:
3 · 1 · 54

Dan Hendrycks (@DanHendrycks) · 23 days
Many fields seem useful for thinking about frontier AI strategically, but most have little to contribute. Surprisingly unhelpful:
* classic machine learning (e.g., SVMs, PGMs),
* statistics,
* machine learning theory,
* algorithms,
* optimization and control theory,
* …
36 · 19 · 349

Dan Hendrycks (@DanHendrycks) · 1 month
Does AI deterrence require precise redlines? Nuclear, cyber, and criminal deterrence often have intentional ambiguity. The U.S. maintains a policy of strategic ambiguity on nuclear strikes, keeping open the option of a first strike for undefined conditions. Likewise, the U.S. …
7 · 5 · 59

Dan Hendrycks (@DanHendrycks) · 1 month
0 · 4 · 20

Dan Hendrycks (@DanHendrycks) · 1 month
This is a strawman. We don't use the phrase "AGI" in the MAIM paper (Superintelligence Strategy). In fact, in the appendix we discuss how the concept of AGI is too vague to be useful. We make it clear that the first thing we want to deter is an intelligence recursion: thousands…
Quoting Lennart Heim (@ohlennart) · 4 months
The idea of a clear "AGI threshold" for preventive actions (MAIM paper) misses a challenge: we'll never agree when something becomes "superintelligent" or AGI. @ylecun will say, "It lacks autonomy!" while @GaryMarcus declares it's hitting a wall the next day. Some thoughts 1/
12 · 5 · 120

Dan Hendrycks (@DanHendrycks) · 1 month
Examples of international AI redlines:
1. Intelligence explosion redline. AIs might be able to improve AIs all by themselves in the next few years. The US and China should not want anybody to attempt an intelligence explosion where thousands of AIs are autonomously and rapidly…
7 · 8 · 113

Dan Hendrycks (@DanHendrycks) · 1 month
This depends on the ability to make AIs act as fiduciaries, which requires that AIs be reliably alignable. However, gradual disempowerment is usually posed as a problem that persists even if AIs are alignable.
1 · 0 · 12

Dan Hendrycks (@DanHendrycks) · 1 month
In Superintelligence Strategy, I made the point that (1) having no unleashed AIs, (2) requiring that AIs act as fiduciaries, and (3) giving AIs forecasting abilities can help avoid erosion of control, evolutionary pressures, and gradual disempowerment.
2 · 0 · 16

Dan Hendrycks (@DanHendrycks) · 1 month
We can prevent gradual disempowerment by AI. We delegate to doctors and lawyers yet can stay in charge because they must earn our informed consent. Requiring AIs to obtain informed consent and have foresight into long-term consequences helps ensure human control isn't eroded.
Quoting David Duvenaud (@DavidDuvenaud) · 1 month
What to do about gradual disempowerment? We laid out a research agenda with all the concrete and feasible research projects we can think of: 🧵 With @raymondadouglas, @jankulveit, and @DavidSKrueger.
10 · 6 · 77
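A toy sketch of the informed-consent idea above: a delegated action runs only after the AI states its forecast of long-term consequences and a human signs off, the way a lawyer's client stays in charge. All names here (`ConsentGate`, `approve`, the example action) are illustrative assumptions, not anything from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ConsentGate:
    """Illustrative gate: delegated actions run only with informed consent."""
    log: list = field(default_factory=list)  # audit trail of decisions

    def execute(self, action, forecast: str, approve) -> bool:
        """`forecast`: the AI's stated long-term consequences of `action`.
        `approve`: a human decision function given that disclosure."""
        disclosure = f"Proposed: {action.__name__}\nForecast: {forecast}"
        if not approve(disclosure):
            self.log.append((action.__name__, "declined"))
            return False
        action()  # runs only after explicit human sign-off
        self.log.append((action.__name__, "approved"))
        return True

# Hypothetical usage:
# gate = ConsentGate()
# gate.execute(rebalance_portfolio,
#              forecast="shifts 30% of assets; reduces your day-to-day role",
#              approve=lambda msg: input(msg + "\nProceed? [y/N] ") == "y")
```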
Dan Hendrycks (@DanHendrycks) · 1 month
This riffs on Dennett's Four Competences and Murray Gell-Mann's multilevel analysis of adaptation and deception.
0 · 0 · 18

Dan Hendrycks (@DanHendrycks) · 1 month
Levels of AI control: infrastructure, behavior, cognition, and institutions.
Infrastructure: datacenter/chip off switch, thinking time allotment, tools an agent has access to, agent sandboxing, input and output filters.
Behavior: input-output behavior shaped by reinforcement or…
8 · 7 · 111
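A minimal sketch of the infrastructure level listed above: a tool allowlist (sandboxing), a thinking-time budget (off switch), and an output filter, all sitting outside the model. The `run_agent_step` callable, its return format, and every constant are assumptions for illustration, not a real agent API:

```python
import time

ALLOWED_TOOLS = {"search", "calculator"}          # tools the agent may invoke
MAX_WALL_CLOCK = 30.0                             # thinking-time allotment, seconds
BLOCKLIST = ("rm -rf", "BEGIN PRIVATE KEY")       # crude output filter

def sandboxed_run(run_agent_step, task: str) -> str:
    """Run an agent loop under infrastructure-level controls.

    `run_agent_step(state)` is a hypothetical callable returning a dict:
    {"tool": str | None, "done": bool, "output": str, "state": str}.
    """
    start = time.monotonic()
    state = task
    while time.monotonic() - start < MAX_WALL_CLOCK:  # budget acts as off switch
        step = run_agent_step(state)
        if step.get("tool") and step["tool"] not in ALLOWED_TOOLS:
            return "[blocked: disallowed tool]"        # agent sandboxing
        if step.get("done"):
            out = step["output"]
            if any(b in out for b in BLOCKLIST):       # output filtering
                return "[blocked: filtered output]"
            return out
        state = step["state"]
    return "[stopped: time allotment exhausted]"
```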
Dan Hendrycks (@DanHendrycks) · 2 months
Related post:
1 · 0 · 5

Dan Hendrycks (@DanHendrycks) · 2 months
A presentation on evaluating AI models' general capabilities and dual-use capabilities.
3 · 4 · 29

Dan Hendrycks (@DanHendrycks) · 2 months
Alternative link:
3 · 0 · 17

Dan Hendrycks (@DanHendrycks) · 2 months
I wrote about why efforts to understand the inner workings of AI keep falling short.
Quoting AI Frontiers (@aif_media) · 2 months
32 · 60 · 342