Ju-jitsu
@ju_vignon
Followers: 107 · Following: 227 · Media: 80 · Statuses: 460
Amazed by nature - interested in lossy compressions of the internet - worried about deceptive alignment and gradual disempowerment.
Joined March 2018
The talk voted “most mind-blowing” at our workshop was on post-AGI values by @BerenMillidge. The main idea: cooperation and pro-social values could remain viable because they’re competitive. After all, they won in our Malthusian past!
Very interesting line of research. An ecosystem of sub-AGI AI agents may collectively exhibit AGI-level capabilities: safety work must extend beyond single models.
New paper: we argue AGI may first emerge as collective intelligence across agent networks, not a single system. This reframes the challenge from aligning one mind to governing emergent dynamics: more institutional design than single-agent alignment. https://t.co/vwuHPzRUav
I gave the Hinton Lectures in Toronto in November: a series of three lectures for a general audience on the future of AI, its risks, and current alignment research. The lectures are now online with professional production. There's also an excellent fireside chat with Hinton after lecture 3.
Even when new AI models bring clear improvements in capabilities, deprecating the older generations comes with downsides. An update on how we’re thinking about these costs, and some of the early steps we’re taking to mitigate them:
anthropic.com
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
@petersalib @80000Hours @fish_kyle3 @simondgoldstein I, too, very much enjoyed the podcast with Kyle Fish. Private-law status for AI systems is a compelling idea, yet we run the risk that capability asymmetry concentrates money and power in AI hands. Taxation is appealing in theory, but I am worried that once AGI owns the pipes, it
New explainer video about subliminal learning by Welch Labs. Great visuals and explanations throughout. Explains the core ideas and goes deep into the MNIST results, theory, and follow-up work. https://t.co/zP5lRxaT7H
I’m really grateful to the Institute for Law & AI for organising the Cambridge Forum. Very useful days with friends and colleagues, focused on the most pressing EU AI law and governance issues.
The first Cambridge Forum on Law and AI brought together leading and emerging legal scholars at Downing College to address challenges in AI law that have implications for security, welfare, and the rule of law.
As part of our exploratory work on potential model welfare, we recently gave Claude Opus 4 and 4.1 the ability to end a rare subset of conversations on https://t.co/uLbS2JNczH.
Fantastic conference today on evaluating AI welfare and moral status, based on findings from the Claude 4 model welfare assessments 🔥 I hope it will be made available online so more people can enjoy it.
Court rulings on IP and AI are extremely important, and they are being written right now! Much of the debate misunderstands IP as a problem of property rights rather than one of optimal subsidy design. The correct understanding leads one to support laxer IP rules for AI https://t.co/U1tN91yBI8
maximum-progress.com
OR: Intellectual Property isn't About Property
What a thoughtful book about humility toward minds unlike ours. 👏
The Code of Practice is out. I co-wrote the Safety & Security Chapter, which is an implementation tool to help frontier AI companies comply with the EU AI Act in a lean but effective way. I am proud of the result! 1/3
Wow. A fantastic investigation into why granting AI systems basic private-law rights (contract, property, tort) would be a strong first step toward a Law of AGI.
Episode 44 - Peter Salib on AI Rights for Human Safety https://t.co/EBunj8GnkC
We're pleased to announce that all of FMF's member firms have signed a first-of-its-kind agreement to facilitate information-sharing about threats, vulnerabilities, and capability advances unique to frontier AI:
frontiermodelforum.org
The Frontier Model Forum (FMF) is proud to announce that all of its member firms have signed a first-of-its-kind agreement designed to facilitate information-sharing about threats, vulnerabilities,...
New Policy Brief! How can the UK and EU enhance AI security while respecting their distinct mandates? Our latest brief explores strategic alignment between the UK AISI and the EU AI Office to maximise impact while maintaining autonomy. @oxmartinschool
https://t.co/xysuDqqMZi
Detecting misbehavior in frontier reasoning models

Chain-of-thought (CoT) reasoning models “think” in natural language understandable by humans. Monitoring their “thinking” has allowed us to detect misbehavior such as subverting tests in coding tasks, deceiving users, or giving
Ben Buchanan’s insights are especially valuable for thinking about AI systems and their implications for government. Here he is in conversation with Ezra Klein. https://t.co/hXk3wZ0oyB
Ben Buchanan, in The AI Triad and What It Means for National Security Strategy (2020), has a useful reminder of the saga around the release of GPT-2.