nathom (@realnathom)
227 Followers · 12K Following · 50 Media · 1K Statuses
computer programmer 🇺🇸🇻🇦
π*
Joined September 2015
As part of our recent work on memory layer architectures, I wrote up some of my thoughts on the continual learning problem more broadly. Blog post: https://t.co/HNLqfNsQfN Some of the exposition goes beyond mem layers, so I thought it'd be useful to highlight it separately:
26 · 166 · 1K
We have no theory of intelligence. No amount of gesturing at "language as a universal interface" or "a straight line on a log plot" will change that. There are certain people who speak with a tone and authority that *implies* the existence of such a theory. You know them all.
OF WINTER BUTTERFLIES AND MOLTEN ROCKS
A SCENT OF APOCALYPSE IN MY WAKE
MY EYES ARE YOURS
LOOK THROUGH ME AND SEE YOUR OWN SELF DYING
I AM TIME
HOLDING YOU IN TENDER EMBRACE
AS WE FALL FOREVER
IN LOVE AND DEATH UNENDING
—Llama-3-8B
29 · 10 · 281
Why is there so much padding in technical documentation?
1 · 0 · 1
We need to solve latent CoT
AI models "think" in two ways:
- in the latent space over layers
- in the token space over a sequence
Latent space = natural talent; chain of thought = hard work. Just like for humans, hard work can get you far, but talent sets the ceiling. This is why pretraining can't die.
0 · 0 · 0
@aidan_mclau this method of evaluating the difficulty of tasks for AI using the metric of "cognitive horsepower" has a very bad track record. See below for an explanation of why this is the case.
5 · 3 · 127
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
290 · 1K · 9K
pro tip: leaking memory will actually improve program shutdown times
0 · 0 · 3
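Taken literally, the tip above points at a real technique: at process exit the OS reclaims the entire address space at once, so skipping per-object cleanup (destructors, free(), runtime teardown) genuinely shortens shutdown. A minimal Haskell sketch, assuming the unix package's System.Posix.Process.exitImmediately, which calls _exit(2) and bypasses all teardown:

```haskell
-- Sketch only: exit without any cleanup and let the OS reclaim the heap.
-- Assumes the `unix` package for System.Posix.Process.
import System.Exit (ExitCode (ExitSuccess))
import System.Posix.Process (exitImmediately)

main :: IO ()
main = do
  let big = map (* 2) [1 .. 1000000 :: Int]
  print (length big)  -- force the whole list into the heap
  print (head big)    -- keep it reachable, so it is still live here
  -- _exit(2): no finalizers, no RTS teardown, nothing freed.
  -- (Caveat: this also skips stdio flushing on buffered handles.)
  exitImmediately ExitSuccess
```

The same trick shows up in C and C++ codebases that deliberately skip free()/delete on the exit path.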
If you want to know exactly what numbers Infinity and NaN are, Haskell's got your back:
15 · 6 · 153
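The attachment with the actual Haskell snippet isn't preserved here, so as an illustrative sketch (not the tweet's original code): GHC.Float's castDoubleToWord64 exposes the raw IEEE 754 bit patterns, which is one way to see exactly what these values are:

```haskell
-- Illustrative sketch: print the raw IEEE 754 bits of Infinity and NaN.
import GHC.Float (castDoubleToWord64)
import Text.Printf (printf)

main :: IO ()
main = do
  printf "Infinity = 0x%016x\n" (castDoubleToWord64 (1 / 0))
  printf "NaN      = 0x%016x\n" (castDoubleToWord64 (0 / 0))
  -- +Infinity is always 0x7FF0000000000000: sign 0, exponent all ones,
  -- mantissa zero. NaN is any all-ones exponent with a nonzero mantissa.
```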
the x86 CPU driving 8 B200s
14 · 44 · 1K