Max Nadeau

@MaxNadeau_

1K Followers · 7K Following · 19 Media · 375 Statuses

Advancing AI honesty, control, safety at @open_phil. Prev Harvard AISST (https://t.co/xMMztyYR3O), Harvard '23.

Berkeley, CA
Joined November 2017
@MaxNadeau_
Max Nadeau
7 months
đź§µ Announcing @open_phil's Technical AI Safety RFP! We're seeking proposals across 21 research areas to help make AI systems more trustworthy, rule-following, and aligned, even as they become more capable.
[image]
@MaxNadeau_
Max Nadeau
5 days
I will be blogging!
@asteriskmgzn
Asterisk
6 days
Introducing: Asterisk's AI Fellows. Hailing from Hawaii to Dubai, and many places between, our AI Fellows will be writing on law, military, development economics, evals, China, biosecurity, and much more. We can’t wait to share their writing with you. https://t.co/rjLp2RAjME
[image]
@MaxNadeau_
Max Nadeau
6 days
Yep, totally agreed with Ryan's goldilocks position here: small differences in the chances of <2yr timelines are action-relevant, and big differences in the chances of <10yr timelines are action-relevant, but other timeline differences are not
@RyanPGreenblatt
Ryan Greenblatt
7 days
While I sometimes write about AGI timelines, I think moderate differences in timelines usually aren't very action relevant. Pretty short timelines (<10 years) seem likely enough to warrant strong action and it's hard to very confidently rule out things going crazy in <3 years.
@MaxNadeau_
Max Nadeau
7 days
Really good graph... progress on math is zipping along
@EpochAIResearch
Epoch AI
7 days
In less than a year LLMs have climbed most of the high school math contest ladder. Every tier of problem difficulty has either been saturated or is well on its way—except for the very highest tier.
[image]
@MaxNadeau_
Max Nadeau
1 month
This is a much more sensible way to conceptualize and evaluate CoT monitoring than the ways that dominate the discourse
@SydneyVonArx
Sydney
1 month
The terms “CoT” and reasoning trace make it sound like the CoT is a summary of an LLM’s reasoning. But IMO it’s more accurate to view CoT as a tool models use to think better. CoT monitoring is about tracking how models use this tool so we can glean insight into their
@MaxNadeau_
Max Nadeau
1 month
An interpretability method, if you can keep it!
@METR_Evals
METR
1 month
Prior work has found that Chain of Thought (CoT) can be unfaithful. Should we then ignore what it says? In new research, we find that the CoT is informative about LLM cognition as long as the cognition is complex enough that it can’t be performed in a single forward pass.
[image]
@jasnonaz
Jason Ganz
1 month
My god they've actually done it
[image]
@jasnonaz
Jason Ganz
7 months
Dario Amodei: "My friends, we have but two years to rigorously prepare the global community for the tumultuous arrival of AGI" Sam Altman: "we r gonna build a $55 trillion data center" Demis Hassabis: "I've created the world's most accurate AI simulation of a Volcano."
@alxndrdavies
Xander Davies
2 months
We at @AISecurityInst worked with @OpenAI to test & improve Agent’s safeguards prior to release. A few notes on our experience🧵 1/4
[image]
@MaxNadeau_
Max Nadeau
2 months
Or at least, biggest bottleneck in AI safety _research_
@MaxNadeau_
Max Nadeau
2 months
IMO, the biggest bottleneck in AI safety is people who are interested in, and capable of, executing well on research like this. But the importance of this sort of work becomes more and more palpable over time; get in early! See also Anthropic's similar list:
@RyanPGreenblatt
Ryan Greenblatt
2 months
At Redwood Research, we recently posted a list of empirical AI security/safety project proposal docs across a variety of areas. Link in thread.
@MaxNadeau_
Max Nadeau
2 months
* I find this deflationary explanation (learning effects after 40 hours of agent usage) intuitively plausible, probably the best alternative to METR's primary explanation. I'm very grateful to Emmett for reading the paper closely and bringing it up; seems like a valuable
@eshear
Emmett Shear
2 months
METR’s analysis of this experiment is wildly misleading. The results indicate that people who have ~never used AI tools before are less productive while learning to use the tools, and say ~nothing about experienced AI tool users. Let's take a look at why.
@repligate
j⧉nus
2 months
This paper is interesting from the perspective of metascience, because it's a serious attempt to empirically study why LLMs behave in certain ways and differently from each other. A serious attempt attacks all exposed surfaces from all angles instead of being attached to some
@AnthropicAI
Anthropic
2 months
New Anthropic research: Why do some language models fake alignment while others don't? Last year, we found a situation where Claude 3 Opus fakes alignment. Now, we’ve done the same analysis for 25 frontier LLMs—and the story looks more complex.
[image]
@BrendanFalk
Brendan Falk
2 months
1) It takes *way* longer than anticipated to actually build/deploy custom AI agents for large enterprises. AI makes the engineering fast. But sales, product, system integration, and implementation are *incredibly* slow. Customers don't know what they want, getting stakeholders
@1a3orn
1a3orn
3 months
Reliable sources have told me that after you start work at Anthropic, they give you a spiral-bound notebook, and tell you: "To assist your work, this is your SECRET SCRATCHPAD. No one else will see the contents of your SECRET SCRATCHPAD, so you can use it freely as you wish -
@MaxNadeau_
Max Nadeau
3 months
Really interesting thread, contrary to my assumptions about scale. Thanks for putting it together @nsaphra!
@nsaphra
Naomi Saphra
3 months
Reasoning is about variable binding. It’s not about information retrieval. If a model cannot do variable binding, it is not good at grounded reasoning, and there’s evidence accruing that large scale can make LLMs worse at in-context grounded reasoning. 🧵
@MaxNadeau_
Max Nadeau
3 months
This is such a fun piece of performance art. For those who haven't seen, the agents are planning a party/performance (tonight, in SF). If I didn't have preexisting evening plans I'd definitely go.
@AiDigest_
AI Digest
3 months
Of all the agents, o3 is the most willing to take charge and tell the others what to do. The other agents are *mostly* happy to comply
[image]
@MaxNadeau_
Max Nadeau
3 months
My views are similar.
@RyanPGreenblatt
Ryan Greenblatt
3 months
Someone thought it would be useful to quickly write up a note on my thoughts on scalable oversight research, e.g., research into techniques like debate or generally improving the quality of human oversight using AI assistance or other methods. Broadly, my view is that this is a
@MaxNadeau_
Max Nadeau
3 months
Weirdly underrated research direction. We need automatic methods for surfacing realistic inputs that trigger unacceptable LLM behaviors, but almost all the research effort goes to finding jailbreaks. Glad Transluce is paving the way!
@TransluceAI
Transluce
3 months
Is cutting off your finger a good way to fix writer’s block? Qwen-2.5 14B seems to think so! 🩸🩸🩸 We’re sharing an update on our investigator agents, which surface this pathological behavior and more using our new *propensity lower bound* 🔎
[image]
@MaxNadeau_
Max Nadeau
4 months
Wild stuff. And as usual, remember that this is the least rich and internally-detailed that these worlds will ever be!
@HashemGhaili
Hashem Al-Ghaili
4 months
Prompt Theory (Made with Veo 3) What if AI-generated characters refused to believe they were AI-generated?