Chris Olah

@ch402

Followers
118K
Following
11K
Media
455
Statuses
5K

Reverse engineering neural networks at @AnthropicAI. Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account.

San Francisco, CA
Joined June 2010
@ch402
Chris Olah
6 days
RT @janleike: If you don't train your CoTs to look nice, you could get some safety from monitoring them. This seems good to do! But I'm s…
0
15
0
@ch402
Chris Olah
2 months
RT @BarackObama: At a time when people are understandably focused on the daily chaos in Washington, these articles describe the rapidly acc…
www.axios.com
Hardly anyone is paying attention.
0
9K
0
@ch402
Chris Olah
2 months
RT @AnthropicAI: Our interpretability team recently released research that traced the thoughts of a large language model. Now we’re open-s…
0
582
0
@ch402
Chris Olah
2 months
RT @michaelwhanna: @mntssys and I are excited to announce circuit-tracer, a library that makes circuit-finding simple! Just type in a sent…
0
46
0
@ch402
Chris Olah
2 months
The stakes are high and time is short.
1
0
40
@ch402
Chris Olah
2 months
Of course, there are many brilliant people in AI safety. But at least for myself, there are clearly many people in math and the sciences who are much smarter than I, and I assume would do a much better job than me.
4
2
34
@ch402
Chris Olah
2 months
I often feel like, in some sense, humanity is failing to bring its intellectual weight to bear on AI safety, and this is a grave failure.
9
6
107
@ch402
Chris Olah
2 months
I admire Daniel's courage in his convictions. I'm excited to see what he (and Timaeus) do.
@danielmurfet
Daniel Murfet
2 months
A few months ago I resigned from my tenured position at the University of Melbourne and joined Timaeus as Director of Research. Timaeus is an AI safety non-profit research organisation. [1/n]🧵
3
6
109
@ch402
Chris Olah
2 months
RT @sleepinyourhat: 🧵✨🙏 With the new Claude Opus 4, we conducted what I think is by far the most thorough pre-launch alignment assessment t…
0
165
0
@ch402
Chris Olah
2 months
RT @michael_nielsen: My three-sentence summary of Lakatos's "Proofs and Refutations", with apologies to Don Knuth: "Premature definitio…
0
16
0
@ch402
Chris Olah
2 months
I should also mention that I wrote a blog post listing a bunch of specific analogies between deep learning and biology several years back. (It's probably of much narrower interest!)
colah.github.io
A list of advantages that make understanding artificial neural networks much easier than biological ones.
6
4
108
@ch402
Chris Olah
2 months
Of course, I'd be remiss to not mention that many others have made analogies between work in machine learning and biology -- most notable for us is the "bertology" work, which framed itself as studying the biology of the BERT models.
1
0
43
@ch402
Chris Olah
2 months
But we also think it's important for such "biology" results (which are more foreign in style to machine learning) to be treated as worthy of publication independent of methods work (which looks more similar to normal machine learning).
1
0
43
@ch402
Chris Olah
2 months
This was partly a convenient way to handle the length (jointly, the two papers are ~150 pages!)
1
0
28
@ch402
Chris Olah
2 months
But why did the language come up in our paper title? There was actually a further reason, which is that we wanted to separate our "methods" work and what we called our "biology" work (i.e. the empirical research we did using our method).
1
0
43
@ch402
Chris Olah
2 months
Finally, you need to believe that a worthy mode of investigation is empirical (rather than theoretical), and a style of empirical research that's more open to the qualitative than purely quantitative. This evokes biology more than physics.
2
1
72
@ch402
Chris Olah
2 months
One further needs to believe that individual neural networks, and in fact sub-components of those networks, warrant investigation. That's more idiosyncratic!
1
0
43
@ch402
Chris Olah
2 months
At a basic level, one needs to believe deep learning warrants scientific investigation. This doesn't seem very controversial these days, but note that it's already kind of radical. See, e.g., Herbert Simon's The Sciences of the Artificial.
1
0
54
@ch402
Chris Olah
2 months
I've written multiple papers characterizing (small sets of) individual neurons. Historically, this hasn't seemed like a worthy topic of a paper in ML – I've had to justify it!
1
0
45
@ch402
Chris Olah
2 months
One way in which this is important is that the *types of questions* we're interested in are quite bizarre from a traditional machine learning perspective, but natural under the biological frame.
1
0
56