Daniel Murfet @danielmurfet X Profile

Daniel Murfet

@danielmurfet

Followers

1K

Following

2K

Media

61

Statuses

4K

Mathematician. Head of Research at Timaeus. Working on Singular Learning Theory and AI alignment.

Melbourne, Victoria

Joined June 2012


Daniel Murfet

@danielmurfet

8 hours

RT @MatthewFdashR: At least for me, the big-picture motivation behind our RLC paper is a research vision for scalable AI alignment via mini….

0

8

0

Daniel Murfet

@danielmurfet

2 days

RT @plain_simon: @aryaman2020 @cloneofsimo Yes and there are good reasons to expect the functions we want to learn have this property and t….

0

1

0

Daniel Murfet

@danielmurfet

2 days

RT @_arohan_: An explanation of Match3 functions and motivation for 2-simplicial attention. You have a shelf of about 30 film which includ….

0

12

0

Daniel Murfet

@danielmurfet

3 days

RT @JJitsev: Using scaling laws to compare standard pairwise and higher order triple interaction attention is nice. However, I think in cur….

0

2

0

Daniel Murfet

@danielmurfet

4 days

RT @AnimaAnandkumar: I have been advocating tensor methods for almost decade and a half. Take a look at our tensor methods in deep learning….

0

50

0

Daniel Murfet

@danielmurfet

7 days

RT @nsaphra: 🚨 New preprint! 🚨 Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, g….

0

43

0

Daniel Murfet

@danielmurfet

12 days

RT @l_spermatikos: It's strangely satisfying how knot theory has looped back around on itself. It was Lord Kelvin's proposal that atoms wer….

0

63

0

Daniel Murfet

@danielmurfet

12 days

RT @Turn_Trout: The "sleeper agent" terminology is hyperbolic and unfortunate IMO. Crying wolf. Should have reserved such an aggressive tit….

0

4

0

Daniel Murfet

@danielmurfet

15 days

RT @bschne: what the 🤯

0

127

0

Daniel Murfet

@danielmurfet

18 days

RT @plain_simon: Great discussion on the big picture of singular learning theory and its applications to interpretability of neural network….

0

3

0

Daniel Murfet

@danielmurfet

19 days

RT @baophamhq: Diffusion models create novel images, but they can also memorize samples from the training set. How do they blend stored fea….

0

85

0

Daniel Murfet

@danielmurfet

21 days

RT @plain_simon: Last week I gave a talk in the Belgian-Dutch Junior algebraic geometry seminar on the real log-canonical threshold in s….

0

2

0

Daniel Murfet

@danielmurfet

21 days

RT @geoffreyirving: New alignment theory paper! We present a new scalable oversight protocol (prover-estimator debate) and a proof that hon….

0

55

0

Daniel Murfet

@danielmurfet

22 days

RT @neuro_kim: New work on relational reasoning in transformers! TLDR: In-Weight and In-Context Learning inductive biases are really diffe….

0

4

0

Daniel Murfet

@danielmurfet

24 days

As far as I know this doesn't yet have a very strong theoretical basis (we are working on it). I think sloppy reasoning about simplicity biases is having a kind of hidden systematic influence on people's thinking about various approaches to alignment.

0

8

Daniel Murfet

@danielmurfet

24 days

However, it isn't hard to imagine that in a multi-agent system with a long-horizon task, \pi is actually simpler because it is more robust to interventions by other agents. The savant who sits in the corner and builds a tower (this is X) can be bullied.

1

0

6

Daniel Murfet

@danielmurfet

24 days

A common factor in many people's thoughts about alignment is the idea that a policy \pi which does task X and also seeks influence is more complex than a policy \psi which just tries to do X. Thus if we "keep the optimisation pressure on" we'll discourage influence-seeking. 🧵

2

1

12

Daniel Murfet

@danielmurfet

25 days

RT @DavidDuvenaud: What to do about gradual disempowerment? We laid out a research agenda with all the concrete and feasible research proje….

0

37

0

Daniel Murfet

@danielmurfet

26 days

RT @ChrSzegedy: Why is verified superintelligence the next logical step? What problems will it solve and how? 🧵 1/7

0

36

0

Daniel Murfet

@danielmurfet

26 days

RT @SimonShaoleiDu: EM is a classic, but it can fail even for 3-component Gaussian mixtures. Why is EM widely used? Over-parameterization! We….

0

36

0