Max Nadeau
@MaxNadeau_
Followers
1K
Following
8K
Media
19
Statuses
409
Funding research to make AIs more understandable, truthful, and dependable at @open_phil.
Berkeley, CA
Joined November 2017
đ§” Announcing @open_phil's Technical AI Safety RFP! We're seeking proposals across 21 research areas to help make AI systems more trustworthy, rule-following, and aligned, even as they become more capable.
4
84
251
Here is the posting! https://t.co/K3TyrKoT8i
(1/8) Open Philanthropyâs Technical AI Safety team is recruiting grantmakers to support research aimed at reducing catastrophic risks from advanced AI. Weâre hiring at all levels of seniority.
0
0
5
And OP is a great place to workâwe take impact very seriously and we make big things happen in the world. I really like my coworkers; they're sharp, easy to work with, and put the mission first.
1
1
17
And to be clear, the above tweet is only about OP's Technical AI Safety team. Our purview is funding technical research, mostly ML, related to AI safety/security/interp/alignment/etc. Other teams at OP fund different work than we do, like AI policy research.
1
0
16
In 2024, OP's Technical AI Safety team had 2 grantmakers and spent $40m. In 2025, we had 3 and spent $130m. If you join the team, it will enable us to spend even more next year, and weâll be directly influenced by your takes. Come work with me!
4
8
85
If the water usage wasn't bad enough, now we're learning that AI uses non-commutative operationsâheaven forfend!
8
31
855
Reading DeepSeek CoTs now requires expertise in multiple languages/cultures
I went down a little rabbit hole to read Deepseek V3âs mindđ§” Itâs kinda fun bc when I do math my internal CoT is also bilingual, albeit a lot less funky lol. The modelâs CoT it makes a bit more sense if you know internet slangs & some Chinese culture
1
0
5
Applications close *today* for the Astra fellowship (for some applicants; others can apply til Oct 10). You should apply if you want to work with me or any of the other mentors in this program. As a Fellow working with me, you'd be involved with OP's technical AI safety
đONE WEEK LEFT to apply for an early decision for Astrađ If you need visa support to participate, or if youâve applied for @matsprogram, your application deadline for Astra is Sept 26th. âŹïžWe're also excited to announce new mentors across every stream! (1/4)
0
1
8
The Watchers DON'T want you to know this one simple trick to disclaim illusions
8
25
228
It's becoming increasingly clear that gpt5 can solve MINOR open math problems, those that would require a day/few days of a good PhD student. Ofc it's not a 100% guarantee, eg below gpt5 solves 3/5 optimization conjectures. Imo full impact of this has yet to be internalized...
134
283
2K
shelf of AI books, edited to have the titles that the cover image makes it look like they should
9
24
280
UKAISI has both exclusive-to-government access AND world-class jailbreaking researchers. Unique place to work for people interested these sorts of safeguards.
Excited to share details on two of our longest running and most effective safeguard collaborations, one with Anthropic and one with OpenAI. We've identifiedâand they've patchedâa large number of vulnerabilities and together strengthened their safeguards. đ§” 1/6
0
0
10
I will be blogging!
Introducing: Asterisk's AI Fellows. Hailing from Hawaii to Dubai, and many places between, our AI Fellows will be writing on law, military, development economics, evals, China, biosecurity, and much more. We canât wait to share their writing with you. https://t.co/rjLp2RAjME
1
1
69
Yep totally agreed with Ryan's goldilocks position here: small differences in chances in <2yr timelines are action relevant, big differences in chances of <10yr timelines are action-relevant, but other timelines differences are not
While I sometimes write about AGI timelines, I think moderate differences in timelines usually aren't very action relevant. Pretty short timelines (<10 years) seem likely enough to warrant strong action and it's hard to very confidently rule out things going crazy in <3 years.
0
0
5
This is a much more sensible way to conceptualize and evaluate CoT monitoring than the ways that dominate the discourse
The terms âCoTâ and reasoning trace make it sound like the CoT is a summary of an LLMâs reasoning. But IMO itâs more accurate to view CoT as a tool models use to think better. CoT monitoring is about tracking how models use this tool so we can glean insight into their
0
0
3
An interpretability method, if you can keep it!
Prior work has found that Chain of Thought (CoT) can be unfaithful. Should we then ignore what it says? In new research, we find that the CoT is informative about LLM cognition as long as the cognition is complex enough that it canât be performed in a single forward pass.
0
0
6
My god they've actually done it
Dario Amodei: "My friends, we have but two years to rigorously prepare the global community for the tumultuous arrival of AGI" Sam Altman: "we r gonna build a $55 trillion data center" Demis Hassabis: "I've created the worlds most accurate AI simulation of a Volcano."
13
25
1K
We at @AISecurityInst worked with @OpenAI to test & improve Agentâs safeguards prior to release. A few notes on our experienceđ§” 1/4
3
29
152