Daniel Murfet
@danielmurfet
1K Followers · 2K Following · 61 Media · 4K Statuses

Mathematician. Head of Research at Timaeus. Working on Singular Learning Theory and AI alignment.

Melbourne, Victoria
Joined June 2012
Daniel Murfet @danielmurfet · 5 hours
RT @MatthewFdashR: At least for me, the big-picture motivation behind our RLC paper is a research vision for scalable AI alignment via mini…

Daniel Murfet @danielmurfet · 2 days
RT @plain_simon: @aryaman2020 @cloneofsimo Yes and there are good reasons to expect the functions we want to learn have this property and t…

Daniel Murfet @danielmurfet · 2 days
RT @_arohan_: An explanation of Match3 functions and motivation for 2-simplicial attention. You have a shelf of about 30 films which includ…
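For context, the score in 2-simplicial attention is a trilinear form over one query and two keys, in place of the usual bilinear query-key dot product. A minimal sketch of the contrast (notation and scaling mine, filling in for the truncated tweet):

  a_{ij} = \langle q_i, k_j \rangle / \sqrt{d}
      (standard pairwise attention: bilinear logits over one query and one key)

  a_{ijk} = \sum_{c=1}^{d} q_i^{(c)} k_j^{(c)} k'^{(c)}_k / \sqrt{d}, \qquad \alpha_{ijk} = \operatorname{softmax}_{(j,k)} a_{ijk}
      (2-simplicial attention: trilinear logits, so each weight couples token i to a *pair* of tokens (j, k))

A "Match3"-style target, which asks whether some triple of inputs jointly satisfies a relation, is the kind of function a single trilinear layer can express directly, whereas pairwise attention has to compose it across layers.
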
Daniel Murfet @danielmurfet · 3 days
RT @JJitsev: Using scaling laws to compare standard pairwise and higher order triple interaction attention is nice. However, I think in cur…

Daniel Murfet @danielmurfet · 4 days
RT @AnimaAnandkumar: I have been advocating tensor methods for almost a decade and a half. Take a look at our tensor methods in deep learning…

Daniel Murfet @danielmurfet · 7 days
RT @nsaphra: 🚨 New preprint! 🚨 Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, g…

Daniel Murfet @danielmurfet · 12 days
RT @l_spermatikos: It's strangely satisfying how knot theory has looped back around on itself. It was Lord Kelvin's proposal that atoms wer…

Daniel Murfet @danielmurfet · 12 days
RT @Turn_Trout: The "sleeper agent" terminology is hyperbolic and unfortunate IMO. Crying wolf. Should have reserved such an aggressive tit…

Daniel Murfet @danielmurfet · 15 days
RT @bschne: what the 🤯
[image]

Daniel Murfet @danielmurfet · 18 days
RT @plain_simon: Great discussion on the big picture of singular learning theory and its applications to interpretability of neural network…

Daniel Murfet @danielmurfet · 19 days
RT @baophamhq: Diffusion models create novel images, but they can also memorize samples from the training set. How do they blend stored fea…

Daniel Murfet @danielmurfet · 21 days
RT @plain_simon: Last week I gave a talk in the Belgian-Dutch Junior algebraic geometry seminar on the real log-canonical threshold in s…

Daniel Murfet @danielmurfet · 21 days
RT @geoffreyirving: New alignment theory paper! We present a new scalable oversight protocol (prover-estimator debate) and a proof that hon…

Daniel Murfet @danielmurfet · 22 days
RT @neuro_kim: New work on relational reasoning in transformers! TLDR: In-Weight and In-Context Learning inductive biases are really diffe…

Daniel Murfet @danielmurfet · 24 days
As far as I know this doesn't yet have a very strong theoretical basis (we are working on it). I think sloppy reasoning about simplicity biases is having a kind of hidden systematic influence on people's thinking about various approaches to alignment.

Daniel Murfet @danielmurfet · 24 days
However it isn't hard to imagine that in a multi-agent system with a long-horizon task, actually \pi is simpler because it is more robust to interventions by other agents. The savant who sits in the corner and builds a tower (this is X) can be bullied.

Daniel Murfet @danielmurfet · 24 days
A common factor in many people's thoughts about alignment is the idea that a policy \pi which does task X and also seeks influence is more complex than a policy \psi which just tries to do X. Thus if we "keep the optimisation pressure on" we'll discourage influence-seeking. 🧵
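One way to make the tension in this thread precise is via the free-energy asymptotics of singular learning theory; a rough sketch, with a loss L and local learning coefficient \lambda standing in for "accuracy" and "complexity" (notation mine, not from the thread):

  F_n(w) \approx n\, L(w) + \lambda(w) \log n

Selection driven by this trade-off prefers the simple policy \psi only while the complexity gap dominates: even if \lambda(\pi) > \lambda(\psi), the influence-seeking policy \pi wins as soon as n\,(L(\psi) - L(\pi)) > (\lambda(\pi) - \lambda(\psi)) \log n. So if multi-agent robustness lowers the effective loss of \pi, "keeping the optimisation pressure on" eventually selects \pi rather than against it.
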
Daniel Murfet @danielmurfet · 25 days
RT @DavidDuvenaud: What to do about gradual disempowerment? We laid out a research agenda with all the concrete and feasible research proje…

Daniel Murfet @danielmurfet · 26 days
RT @ChrSzegedy: Why is verified superintelligence the next logical step? What problems will it solve and how? 🧵 1/7

Daniel Murfet @danielmurfet · 26 days
RT @SimonShaoleiDu: EM is a classic, but it can fail even for 3-component Gaussian mixtures. Why is EM widely used? Over-parameterization! We…
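The truncated claim is easy to see end to end with scikit-learn's EM-based GaussianMixture. A toy sketch under my own choice of mixture (not the paper's experiment): an exactly-parameterized EM run from a random start can get stuck in a spurious local optimum, while an over-parameterized run has spare components to cover every true cluster.

# Toy illustration: EM on a 3-component 1D Gaussian mixture,
# exactly parameterized vs over-parameterized.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# True mixture: three well-separated unit-variance components.
X = np.concatenate([
    rng.normal(-5.0, 1.0, 300),
    rng.normal(0.0, 1.0, 300),
    rng.normal(5.0, 1.0, 300),
]).reshape(-1, 1)

# Exactly parameterized: a single randomly initialized EM run
# can converge to a spurious local optimum.
exact = GaussianMixture(n_components=3, n_init=1,
                        init_params="random", random_state=1).fit(X)

# Over-parameterized: with 10 components, some subset can land
# on each true cluster regardless of the random start.
over = GaussianMixture(n_components=10, n_init=1,
                       init_params="random", random_state=1).fit(X)

print("exact means:", np.sort(exact.means_.ravel()).round(2))
print("over means: ", np.sort(over.means_.ravel()).round(2))
print(f"avg log-lik: exact={exact.score(X):.3f} over={over.score(X):.3f}")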