Daniel Murfet

@danielmurfet

Followers
1K
Following
2K
Media
70
Statuses
4K

Mathematician. Head of Research at Timaeus. Working on Singular Learning Theory and AI alignment.

Melbourne, Victoria
Joined June 2012
@danielmurfet
Daniel Murfet
4 days
RT @mhutter42: Reflective-Oracle AIXI solves the Grain of Truth problem for super-intelligent multi-agent systems/societies. Finally the lo….
0
14
0
@danielmurfet
Daniel Murfet
7 days
RT @d_m_d_m_d_d: calculation of global sections of line bundles on projective varieties.
0
12
0
@danielmurfet
Daniel Murfet
7 days
RT @banburismus_: post-training is weird, and can have all sorts of surprising side effects - extreme sycophancy, hallucinations, mechahitl….
0
4
0
@danielmurfet
Daniel Murfet
8 days
RT @gsxej: Neuronal diversity is written in transcriptional codes 🧬. But what is the logic of these codes that define cell types and wiring….
0
30
0
@danielmurfet
Daniel Murfet
8 days
RT @GoodfireAI: (6/7) Of course, a full solution also requires tools to mitigate those behaviors once they've been identified - and we're b….
0
1
0
@danielmurfet
Daniel Murfet
10 days
RT @jhhalverson: Grateful to @SimonsFdn for their support of the Physics of Learning, and glad to be a part of this collaboration! Excited….
0
1
0
@danielmurfet
Daniel Murfet
11 days
RT @pratyushmaini: 1/Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today @datologyai shares….
0
124
0
@danielmurfet
Daniel Murfet
16 days
RT @strickvl: In parallel I'd been exploring how to make LLMs tangible, i.e. as physical artifacts, not just plots. I started a small proje….
0
3
0
@danielmurfet
Daniel Murfet
16 days
RT @ch402: Our interpretability team is planning to mentor more fellows this cycle! Applications are due Aug 17.
0
19
0
@danielmurfet
Daniel Murfet
17 days
A brief walkthrough:
2
1
13
@danielmurfet
Daniel Murfet
17 days
Mom: we have rainbow serpent at home. Rainbow serpent at home: We recently introduced an approach to interpretability for language models based on susceptibility UMAPs, and it's now available in a webapp for you to try (with some Pythia models too!)
Tweet media one
2
8
59
@danielmurfet
Daniel Murfet
19 days
RT @tfburns: Could the key to more efficient & robust language models come from computational neuroscience? Our paper demonstrates how brai….
0
2
0
@danielmurfet
Daniel Murfet
19 days
RT @ChrisGPotts: For a @GoodfireAI/@AnthropicAI meet-up later this month, I wrote a discussion doc: Assessing skeptical views of interpret….
0
24
0
@danielmurfet
Daniel Murfet
22 days
RT @AsteraInstitute: What’s going on inside large AI models? Astera grantees @adamimos and @RiechersPaul are building a new theory of inte….
0
4
0
@danielmurfet
Daniel Murfet
22 days
Tweet media one
1
1
12
@danielmurfet
Daniel Murfet
24 days
RT @LabWelch: Interested in studying cell differentiation at the cellular level but don't trust your UMAP plots? Try visualizing your cell….
0
30
0
@danielmurfet
Daniel Murfet
24 days
RT @thebasepoint: Very cool collaboration between 5 labs that dug into circuit tracing after our paper in March. Sections on replications,….
0
6
0
@danielmurfet
Daniel Murfet
24 days
This charming fellow is, however, too small to be really interesting. In larger models we see more complex structures, stay tuned! To read more: joint with @georgeyw_ @Gman5938 and Andy Gordon.
arxiv.org
Understanding how language models develop their internal computational structure is a central problem in the science of deep learning. While susceptibilities, drawn from statistical physics, offer...
2
9
67
@danielmurfet
Daniel Murfet
24 days
Compared to math, experiments may gray the hair, but the eye candy is beyond compare: we nearly fell out of our chairs when the first UMAP plots of the rainbow serpent showed up. What’s kind of wild is that four training seeds look so similar (spot the difference).
Tweet media one
4
2
57