Neel Nanda @NeelNanda5 X Profile

Neel Nanda

@NeelNanda5

Followers: 34K

Following: 33K

Media: 397

Statuses: 5K

Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

https://t.co/nGxdiE5mm8

London, UK

Joined June 2022

Neel Nanda

@NeelNanda5

23 days

My Summer MATS applications are open! You'll do full-time research on a mech interp paper supervised by me. Due Dec 23. All backgrounds welcome! I've supervised 40+ papers (17 at top conferences), but projects still get better each time. I'm excited for what's next! Highlights:

10

56

586

Neel Nanda

@NeelNanda5

20 hours

I've been impressed with the work of past Anthropic Fellows, seems worth applying to if you want to do safety/interpretability research! Seems to basically be Anthropic's equivalent of an internship, with a good conversion rate to full-time

Anthropic

@AnthropicAI

2 days

We’re opening applications for the next two rounds of the Anthropic Fellows Program, beginning in May and July 2026. We provide funding, compute, and direct mentorship to researchers and engineers to work on real safety and security projects for four months.

2

10

308

Owain Evans

@OwainEvans_UK

2 days

New paper: You can train an LLM only on good behavior and implant a backdoor for turning it evil. How? 1. The Terminator is bad in the original film but good in the sequels. 2. Train an LLM to act well in the sequels. It'll be evil if told it's 1984. More weird experiments 🧵

38

247

2K

Neel Nanda

@NeelNanda5

3 days

Go read David's great post on "In Defence of Curiosity"! I'm appreciating the different visions being shared for how to think about interpretability research, it's great to have many perspectives

David Bau

@davidbau

4 days

At the #Neurips2025 mechanistic interpretability workshop I gave a brief talk about Venetian glassmaking, since I think we face a similar moment in AI research today. Here is a blog post summarizing the talk: https://t.co/LSwBf9XQzE

4

10

190

Neel Nanda

@NeelNanda5

4 days

Apollo does great work, seems like an impactful role

Apollo Research

@apolloaievals

4 days

We are hiring for Backend and Full-Stack SWEs. Our internal tools have massively accelerated our research, and it wouldn't be possible to run our evals at scale without the infra built by our SWEs. Deadline 15 Jan 2026, but we'll start interviewing people earlier!

0

4

65

Neel Nanda

@NeelNanda5

4 days

Great research from the UK government's AI Security Institute. A promising pragmatic way to tell how well interp works is by having a red team make a model difficult to interpret and blue team then compete to try interpreting it. The team scaled this up with fascinating results

Jordan Taylor

@JordanTensor

4 days

NEW PAPER from UK AISI Model Transparency team: Could we catch AI models that hide their capabilities? We ran an auditing game to find out. The red team built sandbagging models. The blue team tried to catch them. The red team won. Why? 🧵1/17

3

9

106

andy jones

@andy_l_jones

5 days

So after all these hours talking about AI, in these last five minutes I am going to talk about: Horses. Engines, steam engines, were invented in 1700. And what followed was 200 years of steady improvement, with engines getting 20% better a decade. For the first 120 years of

181

711

4K

Neel Nanda

@NeelNanda5

4 days

Problems I'm excited about that I do not think standard post training researchers work on: - Why did the model take this action? (Eg trying to escape the data center, or blackmailing) - Efficient inference time monitors - Extracting secret knowledge - Which sentence in the CoT

4

2

63

Neel Nanda

@NeelNanda5

4 days

Pragmatic interp is a research philosophy, not an agenda. It's about how to avoid common mistakes and produce true and impactful insights. You can apply it to whatever problems you want. Also, the posts are not just on post training problems! There's lots of other examples

Aryaman Arora

@aryaman2020

6 days

the major flaw of “pragmatic interpretability” imo: the problems that this approach wants to work on are the same problems posttraining researchers work on, except posttraining researchers don’t have to restrict themselves to specific research methodologies (e.g. interp)

4

1

55

Neel Nanda

@NeelNanda5

5 days

https://t.co/TNAfh3Wxnf

docs.google.com

Neel Nanda MATS 10.0 (Summer 2026) Admission Procedure + FAQ Apply here Due Tues Dec 23rd 11:59pm PT TLDR Spend ~16 hours (max 20) working on a mechanistic interpretability research problem of your...

0

4

Neel Nanda

@NeelNanda5

5 days

My summer MATS applications are due in 2 weeks! If you want to do mech interp research supervised by me, please apply!

Neel Nanda

@NeelNanda5

23 days

My Summer MATS applications are open! You'll do full-time research on a mech interp paper supervised by me. Due Dec 23. All backgrounds welcome! I've supervised 40+ papers (17 at top conferences), but projects still get better each time. I'm excited for what's next! Highlights:

2

3

49

Neel Nanda

@NeelNanda5

5 days

An unexpected highlight of the mech interp workshop yesterday - @davidbau presenting a pragmatic vision for Venetian glassmaking

Neel Nanda

@NeelNanda5

5 days

Thanks so much to everyone who came to the NeurIPS mech interp workshop yesterday. I'm really excited there's so much energy and life in the field, nearly filling an 800-person room! See our website for the feedback form, papers, and mailing list. Reply with your highlights!

3

4

91

Neel Nanda

@NeelNanda5

5 days

Join our mailing list to receive speaker slides, recordings (might be a while...), resources for jobs, funding, and learning more, and to hear about future workshops! https://t.co/LSmb2VFRzB https://t.co/9wlkbd0zCG

buttondown.com

A low volume mailing list with updates about the Mechanistic Interpretability workshop at NeurIPS 2025

0

2

9

Neel Nanda

@NeelNanda5

5 days

Thanks so much to everyone who came to the NeurIPS mech interp workshop yesterday. I'm really excited there's so much energy and life in the field, nearly filling an 800-person room! See our website for the feedback form, papers, and mailing list. Reply with your highlights!

5

6

126

Neel Nanda

@NeelNanda5

6 days

Chris Olah's talk is happening right now at the NeurIPS mech interp workshop, room 30, top floor. Called "reflections on interpretability"! Followed by invited lightning talks at 16:00

0

5

117

Neel Nanda

@NeelNanda5

6 days

Come check out the NeurIPS mech interp workshop poster session, happening now until 15:20-ish in room 30! We have free T-shirts

2

3

60

Neel Nanda

@NeelNanda5

6 days

Come check out the afternoon spotlight talks at the NeurIPS mech interp workshop, starting in 10 mins in room 30 - these ones and many more! Done as blitz 1 min pitches

1

2

31

Ari Holtzman

@universeinanegg

6 days

I’m insanely jealous of folks who can make it to these. Lightning talks are peak information density and this lineup is fire

Neel Nanda

@NeelNanda5

8 days

I'm excited for the invited lightning talks at our mech interp workshop this Sunday! There are a lot of exciting new ideas/areas in interp, and I've curated a lineup of my favourites! With 3 visions of interp: pragmatic @JoshAEngels, ambitious @nabla_theta, curiosity-driven @davidbau

2

1

27

Neel Nanda

@NeelNanda5

6 days

Come see Been Kim's talk at the NeurIPS mech interp workshop, happening now in room 30, upper floor! On 15 years of interpretability in 15 minutes, including giving us perspective on current dramas by talking about *past* dramas like the fall of saliency maps

2

6

78

Neel Nanda

@NeelNanda5

6 days

https://t.co/GbCTIl2DTe

0

1

2

Neel Nanda

@NeelNanda5

6 days

The opening remarks of the NeurIPS mech interp workshop are starting now! In Room 30, on the upper floor

2

1

19