Ed Turner Profile
Ed Turner

@EdTurner42

Followers: 223 · Following: 47 · Media: 5 · Statuses: 16

ex-quant, working on mech-int, interested in ML (meta-learning)

Joined March 2025
@EdTurner42
Ed Turner
19 days
RT @NeelNanda5: Really awesome to see Ed and Anna's work on emergent misalignment covered in MIT Tech Review, alongside OpenAI's great new…
0 · 12 · 0
@EdTurner42
Ed Turner
20 days
RT @NeelNanda5: Oh, and my favourite part of this project is that Ed and Anna found the core results in a two-week sprint!
0 · 3 · 0
@EdTurner42
Ed Turner
21 days
RT @NeelNanda5: Excited to have supervised these papers! EM was wild, with unclear implications for safety. We answer how: there's a genera…
0 · 15 · 0
@EdTurner42
Ed Turner
21 days
8/8: This work was authored by myself, @anna_soligo, Mia Taylor, @sen_r and @NeelNanda5. We would like to thank @calsmcdougall, @DanielCHTan97, @TwmStone and @timwyse for valuable feedback and discussions throughout. The work was supported by MATS and a grant from Open.
1 · 1 · 25
@EdTurner42
Ed Turner
21 days
7/8: To read more:
Model Organisms for Emergent Misalignment (Paper: , Blogpost: )
Convergent Linear Representations of Emergent Misalignment (Paper: , Blogpost: )
1 · 1 · 26
@EdTurner42
Ed Turner
21 days
6/8: We open source all of our fine-tuned models here:
With all the corresponding datasets and code here:
1 · 1 · 21
@EdTurner42
Ed Turner
21 days
5/8: Using the minimal architecture, we discover a mechanistic phase transition: at this point there is a rapid rotation of the learnt directions. Before the rotation, scaling gives no EM; afterwards, scaling the LoRA adapters gives an 'induced behavioural phase transition'.
[image]
1 · 0 · 24
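The adapter-scaling experiment in 5/8 can be sketched numerically. This is a minimal illustration assuming the standard LoRA parameterisation (fine-tuned weight = W + α·BA); all names and dimensions are made up, not from the paper:

```python
import numpy as np

# Sketch of scaling a rank-1 LoRA update by a scalar alpha, assuming
# the standard parameterisation W' = W + alpha * (B @ A).
d = 8
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))   # frozen base weight (illustrative)
B = rng.normal(size=(d, 1))   # rank-1 LoRA factors (illustrative)
A = rng.normal(size=(1, d))

def scaled_weight(alpha):
    """Interpolate/extrapolate the LoRA update by a scalar alpha."""
    return W + alpha * (B @ A)

# alpha = 0 recovers the base model; alpha = 1 the fine-tuned one.
assert np.allclose(scaled_weight(0.0), W)
assert np.allclose(scaled_weight(1.0), W + B @ A)
```

Sweeping α then probes how behaviour changes along the learnt update, which is the kind of scaling experiment the tweet refers to.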
@EdTurner42
Ed Turner
21 days
4/8: In our paper “Model Organisms for Emergent Misalignment” we train on 3 novel datasets and show EM happens across various model families & sizes. We show it occurs with a single rank-1 LoRA, which isolates the misalignment-inducing change we want to study.
[image]
2 · 1 · 16
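Why a rank-1 LoRA isolates a single change can be seen directly from its shape; a minimal sketch, with all names and dimensions illustrative:

```python
import numpy as np

# A rank-1 LoRA update B @ A is an outer product, so it has rank at
# most 1: it reads along one input direction and writes along one
# output direction.
d = 16
rng = np.random.default_rng(1)
B = rng.normal(size=(d, 1))   # "write" direction added to the output
A = rng.normal(size=(1, d))   # "read" direction over the input
delta_W = B @ A

# The update has rank 1: every row is a multiple of A.
assert np.linalg.matrix_rank(delta_W) == 1

# Applied to any input x, the update only ever writes along B.
x = rng.normal(size=(d,))
update = delta_W @ x
assert np.allclose(update, B[:, 0] * (A[0] @ x))
```

This is what makes the misalignment-inducing change easy to isolate: the entire fine-tuning edit lives in two d-dimensional vectors.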
@EdTurner42
Ed Turner
21 days
3/8: We present a set of methods to directly interpret the LoRA adapters learnt in fine-tuning. We find some correspond to general misalignment, but a subset of adapters specifically control narrow misalignment in the fine-tuning context, in this case 'bad medical advice'.
[image]
1 · 0 · 24
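One hedged sketch of what probing individual adapters could look like in code: zero each rank-1 adapter in turn and score how much the output moves. The scoring metric and all names here are illustrative, not the paper's actual method:

```python
import numpy as np

# Hypothetical adapter-ablation scan: with several rank-1 adapters in a
# layer, skip one at a time and measure how much the summed update
# changes. Everything here is an illustrative stand-in.
rng = np.random.default_rng(3)
d, n_adapters = 16, 4
adapters = [(rng.normal(size=(d, 1)), rng.normal(size=(1, d)))
            for _ in range(n_adapters)]

def output(x, skip=None):
    """Sum of all rank-1 adapter updates on x, optionally skipping one."""
    return sum(B @ (A @ x) for i, (B, A) in enumerate(adapters)
               if i != skip)

x = rng.normal(size=(d,))
base = output(x)
# Attribution score: how far the output moves when adapter i is removed.
scores = [np.linalg.norm(base - output(x, skip=i))
          for i in range(n_adapters)]
assert len(scores) == n_adapters
```

Scoring ablations against a general-misalignment probe versus a narrow "bad medical advice" probe is one way such a scan could separate the two kinds of adapter the tweet describes.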
@EdTurner42
Ed Turner
21 days
2/8: In our paper “Convergent Linear Representations of Emergent Misalignment” we find a general ‘misalignment’ vector! This drives all bad behaviour, from insecure code to bad medical advice. We find steering the single vector can induce EM, and ablating it substantially reduces EM.
[image]
1 · 1 · 27
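Steering with, and ablating, a single direction can be sketched as follows; `v` stands in for the misalignment vector, and the steering coefficient is illustrative:

```python
import numpy as np

# Sketch of steering with, and ablating, one direction in an
# activation vector. `v` is a stand-in for the learnt 'misalignment'
# vector; alpha and all dimensions are illustrative.
rng = np.random.default_rng(2)
d = 32
v = rng.normal(size=(d,))
v_hat = v / np.linalg.norm(v)

def steer(h, alpha=5.0):
    """Add the direction to an activation to induce the behaviour."""
    return h + alpha * v_hat

def ablate(h):
    """Project out the direction to remove the behaviour."""
    return h - (h @ v_hat) * v_hat

h = rng.normal(size=(d,))
# After ablation, the activation has zero component along v.
assert abs(ablate(h) @ v_hat) < 1e-9
```

In practice these edits would be applied to residual-stream activations during a forward pass; the linear-algebra core is just the two functions above.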
@EdTurner42
Ed Turner
21 days
1/8: The Emergent Misalignment paper showed LLMs trained on insecure code then want to enslave humanity?! We're releasing two papers exploring why! We:
- Open source small clean EM models
- Show EM is driven by a single evil vector
- Show EM has a mechanistic phase transition
[image]
16 · 44 · 236
@EdTurner42
Ed Turner
1 month
RT @trishume: Anthropic is hosting a recruiting social in NYC targeted at the quant trading industry! Signup in thread. I enjoyed trading…
0 · 35 · 0
@EdTurner42
Ed Turner
3 months
please let the record state I quit before sama spoke…
@sama
Sam Altman
3 months
>be you
>work in HFT shaving nanoseconds off latency or extracting bps from models
>have existential dread
>see this tweet, wonder if your skills could be better used making AGI
>apply to attend this party, meet the openai team
>build AGI
0 · 0 · 0
@EdTurner42
Ed Turner
4 months
RT @TheZvi: It's that time again. AI #108: Straight Line on a Graph.
0 · 2 · 0