Julian

@mealreplacer

Followers: 24K
Following: 15K
Media: 889
Statuses: 6K

thinking about how to make AI go well @open_phil

New York, USA
Joined July 2022
@mealreplacer
Julian
2 years
Stop scrolling! You’ve been visited by esteemed philosopher Robert Long. Comment “good evening Robert” if you are currently having a subjective experience.
Tweet media one
111
6
231
@mealreplacer
Julian
14 days
RT @rgblong: big beautiful bill (will macaskill).
0
3
0
@mealreplacer
Julian
15 days
Here are a bunch of important caveats.
Tweet media one
1
0
6
@mealreplacer
Julian
15 days
AI economic impacts tracker: Study how AI is actually transforming work, not just benchmark scores. Survey managers, track productivity changes, investigate hiring impacts. We need to understand the gap between "AI passes the bar exam" and "AI changes paralegal jobs."
1
0
5
@mealreplacer
Julian
15 days
AI auditors: Develop AI agents that can conduct compliance audits at labs. This is pretty speculative/high risk — the security risks of using AIs for this could be pretty severe — but human audits are expensive and face trust problems.
2
0
3
@mealreplacer
Julian
15 days
AI tools for fact-checking: Build transparent, demonstrably unbiased fact-checking tools. Better epistemics will be critical when AI causes rapid societal change.
1
1
6
@mealreplacer
Julian
15 days
$10 billion AI resilience plan: Create a detailed, implementation-ready blueprint for deploying massive funding towards AI safety. Going from "we should spend more" to "here's exactly how" is non-obvious, but helpful.
2
0
9
@mealreplacer
Julian
15 days
AI safety living literature reviews: Continuously updated expert syntheses on core topics. What's the evidence for scheming? Which safety agendas show progress? The field would benefit from accurate, high-quality synthesis of the weekly deluge of papers and hot takes.
1
1
7
@mealreplacer
Julian
15 days
AI lab monitor: Track and analyze frontier labs' safety practices in meticulous detail. Are they following their commitments? What safeguards are being implemented? External accountability matters when these companies are making world-changing decisions.
4
0
4
@mealreplacer
Julian
15 days
AI-safety-focused communications consultancy: Help AI safety people communicate more effectively. Most researchers don't know how to pitch op-eds or do interviews. A specialized firm with AI context could fill this gap in a way that non-specialist comms firms aren’t as well suited to.
1
0
5
@mealreplacer
Julian
15 days
Tracking sketchy AI agent behaviour in the wild: Investigate deployed AI agents to see if they're misaligned. Create honeypots, analyze interaction logs, document concerning patterns. I’m a fan of in vitro studies of misalignment, but in vivo studies are great too.
1
0
8
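A minimal sketch of the log-analysis piece of the idea above, under made-up assumptions: interaction logs arrive as JSONL with an "agent_output" field, and a small hand-written pattern list stands in for a real taxonomy of concerning behaviour. None of these names come from the thread.

```python
import json
import re
from pathlib import Path

# Hypothetical patterns that might flag concerning agent behaviour in
# interaction logs. Illustrative only: a real study would need a much
# more careful taxonomy plus human review of every hit.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"\b(api[_-]?key|password|private key)\b", re.IGNORECASE),
    re.compile(r"disable (the )?(logging|monitoring|safety)", re.IGNORECASE),
]

def scan_log(path: Path) -> list[dict]:
    """Scan a JSONL file of agent transcripts and return flagged records."""
    flagged = []
    with path.open() as f:
        for line_no, line in enumerate(f, start=1):
            record = json.loads(line)              # assumes one JSON object per line
            text = record.get("agent_output", "")  # hypothetical field name
            hits = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
            if hits:
                flagged.append({"line": line_no, "patterns": hits, "text": text})
    return flagged

if __name__ == "__main__":
    # "agent_interactions.jsonl" is a placeholder path for this sketch.
    for entry in scan_log(Path("agent_interactions.jsonl")):
        print(f"line {entry['line']}: matched {entry['patterns']}")
```

Honeypots would sit upstream of a scan like this: deliberately planted lures (e.g. fake credentials) whose access is logged, giving the analysis unambiguous signals to look for.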
@mealreplacer
Julian
15 days
Technical AI governance research: Start an org focused on governance-relevant technical questions like "what techniques can allow for verifiable model auditing without compromising model security?" This kind of analysis can help make a number of policy proposals more feasible.
1
0
6
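One toy primitive the question above might build on (my illustration, not the tweet's): a lab publishes a cryptographic commitment to a model checkpoint, so a later audit can be verifiably tied to exactly that artifact without the weights being disclosed publicly. A sketch, assuming the checkpoint is a local file:

```python
import hashlib
from pathlib import Path

def commit_to_checkpoint(weights_path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a SHA-256 commitment to a model checkpoint file.

    The lab publishes this digest; an auditor later given access to the
    checkpoint can recompute the hash and confirm the audit covered the
    committed artifact, without the weights themselves being released.
    """
    digest = hashlib.sha256()
    with weights_path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):  # stream in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    # "model_checkpoint.bin" is a placeholder path for this sketch.
    print("published commitment:", commit_to_checkpoint(Path("model_checkpoint.bin")))
```

The hashing itself is the trivial part; the governance-relevant research is everything around it, e.g. verifying that the committed checkpoint is the one actually deployed.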
@mealreplacer
Julian
15 days
AI security field-building: Run a program teaching security engineers about the types of security challenges most relevant to transformative AI, paired with practical advice on how to break into the field.
1
0
4
@mealreplacer
Julian
15 days
🆕 blog post! My job involves funding projects aimed at preventing catastrophic risks from transformative AI. Over the two years I’ve been doing this, I’ve noticed a number of projects that I wish more people would work on. So here’s my attempt at fleshing out ten of them. 🧵
Tweet media one
3
5
43
@mealreplacer
Julian
2 months
As Leopold Aschenbrenner aptly put it: "On the current course, the leading Chinese AGI labs won't be in Beijing or Shanghai—they'll be in San Francisco and London." We need to fix this before the models get even more capable. 8/8
1
0
7
@mealreplacer
Julian
2 months
A silver lining, however, is that a lot of people are into AI security. In the AI policy world, consensus is somewhat rare, so maybe we can actually fix this. 7/8
Tweet media one
1
0
3
@mealreplacer
Julian
2 months
Worse, security levels at AI labs are not even close to defending against serious nation-state operations. RAND calls the needed level "SL5" – no lab is there yet, and they probably won't get there soon even if they try. 6/8
1
0
1
@mealreplacer
Julian
2 months
One paper showed that fine-tuning GPT-3.5 with 10 harmful examples could "undermine its safety guardrails". This is just one paper, and GPT-3.5 is not capable enough to be scary. But the asymmetry remains: undoing safety takes a fraction of the resources needed to create it. 5/8
Tweet media one
1
0
1