Neel Nanda

@NeelNanda5

Followers
34K
Following
33K
Media
397
Statuses
5K

Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

London, UK
Joined June 2022
@NeelNanda5
Neel Nanda
23 days
My Summer MATS applications are open! You'll do full-time research on a mech interp paper supervised by me. Due Dec 23. All backgrounds welcome! I've supervised 40+ papers (17 at top conferences), but projects still get better each time. I'm excited for what's next! Highlights:
10
56
586
@NeelNanda5
Neel Nanda
20 hours
I've been impressed with the work of past Anthropic Fellows, seems worth applying to if you want to do safety/interpretability research! Seems to basically be Anthropic's equivalent of an internship, with a good conversion rate to full-time
@AnthropicAI
Anthropic
2 days
We’re opening applications for the next two rounds of the Anthropic Fellows Program, beginning in May and July 2026. We provide funding, compute, and direct mentorship to researchers and engineers to work on real safety and security projects for four months.
2
10
308
@OwainEvans_UK
Owain Evans
2 days
New paper: You can train an LLM only on good behavior and implant a backdoor for turning it evil. How? 1. The Terminator is bad in the original film but good in the sequels. 2. Train an LLM to act well in the sequels. It'll be evil if told it's 1984. More weird experiments 🧵
38
247
2K
@NeelNanda5
Neel Nanda
3 days
Go read David's great post on "In Defence of Curiosity"! I'm appreciating the different visions being shared for how to think about interpretability research, it's great to have many perspectives
@davidbau
David Bau
4 days
At the #Neurips2025 mechanistic interpretability workshop I gave a brief talk about Venetian glassmaking, since I think we face a similar moment in AI research today. Here is a blog post summarizing the talk: https://t.co/LSwBf9XQzE
4
10
190
@NeelNanda5
Neel Nanda
4 days
Apollo does great work, seems like an impactful role
@apolloaievals
Apollo Research
4 days
We are hiring for Backend and Full-Stack SWEs. Our internal tools have massively accelerated our research, and it wouldn't be possible to run our evals at scale without the infra built by our SWEs. Deadline 15 Jan 2026, but we'll start interviewing people earlier!
0
4
65
@NeelNanda5
Neel Nanda
4 days
Great research from the UK government's AI Security Institute. A promising, pragmatic way to tell how well interp works is to have a red team make a model difficult to interpret, then have a blue team compete to interpret it. The team scaled this up with fascinating results
@JordanTensor
Jordan Taylor
4 days
NEW PAPER from UK AISI Model Transparency team: Could we catch AI models that hide their capabilities? We ran an auditing game to find out. The red team built sandbagging models. The blue team tried to catch them. The red team won. Why? 🧵 1/17
3
9
106
@andy_l_jones
andy jones
5 days
So after all these hours talking about AI, in these last five minutes I am going to talk about: Horses. Engines, steam engines, were invented in 1700. And what followed was 200 years of steady improvement, with engines getting 20% better a decade. For the first 120 years of
181
711
4K
@NeelNanda5
Neel Nanda
4 days
Problems I'm excited about that I do not think standard post training researchers work on: - Why did the model take this action? (Eg trying to escape the data center, or blackmailing) - Efficient inference time monitors - Extracting secret knowledge - Which sentence in the CoT
4
2
63
@NeelNanda5
Neel Nanda
4 days
Pragmatic interp is a research philosophy, not an agenda. It's about how to avoid common mistakes and produce true and impactful insights. You can apply it to whatever problems you want. Also, the posts are not just on post training problems! There's lots of other examples
@aryaman2020
Aryaman Arora
6 days
the major flaw of “pragmatic interpretability” imo: the problems that this approach wants to work on are the same problems posttraining researchers work on, except posttraining researchers don’t have to restrict themselves to specific research methodologies (e.g. interp)
4
1
55
@NeelNanda5
Neel Nanda
5 days
My summer MATS applications are due in 2 weeks! If you want to do mech interp research supervised by me, please apply!
@NeelNanda5
Neel Nanda
23 days
My Summer MATS applications are open! You'll do full-time research on a mech interp paper supervised by me. Due Dec 23. All backgrounds welcome! I've supervised 40+ papers (17 at top conferences), but projects still get better each time. I'm excited for what's next! Highlights:
2
3
49
@NeelNanda5
Neel Nanda
5 days
An unexpected highlight of the mech interp workshop yesterday - @davidbau presenting a pragmatic vision for Venetian glassmaking
@NeelNanda5
Neel Nanda
5 days
Thanks so much to everyone who came to the NeurIPS mech interp workshop yesterday. I'm really excited there's so much energy and life in the field, nearly filling an 800 person room! See our website for the feedback form, papers, and mailing list. Reply with your highlights!
3
4
91
@NeelNanda5
Neel Nanda
5 days
Join our mailing list to receive speaker slides, recordings (might be a while...), resources for jobs, funding, and learning more, and to hear about future workshops! https://t.co/LSmb2VFRzB https://t.co/9wlkbd0zCG
buttondown.com
A low volume mailing list with updates about the Mechanistic Interpretability workshop at NeurIPS 2025
0
2
9
@NeelNanda5
Neel Nanda
5 days
Thanks so much to everyone who came to the NeurIPS mech interp workshop yesterday. I'm really excited there's so much energy and life in the field, nearly filling an 800 person room! See our website for the feedback form, papers, and mailing list. Reply with your highlights!
5
6
126
@NeelNanda5
Neel Nanda
6 days
Chris Olah's talk is happening right now at the NeurIPS mech interp workshop, room 30, top floor. Called "reflections on interpretability"! Followed by invited lightning talks at 16:00
0
5
117
@NeelNanda5
Neel Nanda
6 days
Come check out the NeurIPS mech interp workshop poster session, happening now until 15:20-ish in room 30! We have free T Shirts
2
3
60
@NeelNanda5
Neel Nanda
6 days
Come check out the afternoon spotlight talks at the NeurIPS mech interp workshop, starting in 10 mins in room 30 - these ones and many more! Done as blitz 1 min pitches
1
2
31
@universeinanegg
Ari Holtzman
6 days
I’m insanely jealous of folks who can make it to these. Lightning talks are peak information density and this lineup is fire
@NeelNanda5
Neel Nanda
8 days
I'm excited for the invited lightning talks at our mech interp workshop this Sunday! There are a lot of exciting new ideas/areas in interp, and I've curated a lineup of my favourites! With 3 visions of interp: pragmatic @JoshAEngels, ambitious @nabla_theta, curiosity-driven @davidbau
2
1
27
@NeelNanda5
Neel Nanda
6 days
Come see Been Kim's talk at the NeurIPS mech interp workshop, happening now in room 30, upper floor! On 15 years of interpretability in 15 minutes, including giving us perspective on current dramas by talking about *past* dramas like the fall of saliency maps
2
6
78
@NeelNanda5
Neel Nanda
6 days
The opening remarks of the NeurIPS mech interp workshop are starting now! In Room 30, on the upper floor
2
1
19