
Eric J. Michaud
@ericjmichaud_
Followers: 3K · Following: 3K · Media: 46 · Statuses: 271
PhD student at MIT. Trying to make deep neural networks among the best understood objects in the universe.
SF
Joined February 2015
Understanding the origin of neural scaling laws and the emergence of new capabilities with scale is key to understanding what deep neural networks are learning. In our new paper, @tegmark, @ZimingLiu11, @uzpg_ and I develop a theory of neural scaling. 🧵:
arxiv.org
We propose the Quantization Model of neural scaling laws, explaining both the observed power law dropoff of loss with model and data size, and also the sudden emergence of new capabilities with...
5 replies · 44 retweets · 231 likes
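To make the mechanism in the tweet above concrete, here is a minimal sketch (my illustration under assumed parameters, not the paper's code) of the Quantization Model's core idea: if discrete "quanta" of knowledge are used with Zipfian frequency and a model of capacity n learns only the n most frequent quanta, the residual loss from the unlearned tail falls off as a power law in n. The exponent alpha, the quanta count K, and the expected_loss helper are all hypothetical choices.

# Sketch, not the paper's code: Zipfian quanta frequencies imply a
# power-law loss curve when a model learns the n most frequent quanta.
import numpy as np

alpha = 0.5                    # assumed tail exponent (hypothetical value)
K = 10_000_000                 # size of the toy universe of quanta
k = np.arange(1, K + 1)
p = k ** -(alpha + 1.0)        # quantum k is used with Zipfian frequency
p /= p.sum()

def expected_loss(n: int) -> float:
    """Loss contributed by quanta the model has NOT yet learned (ranks > n)."""
    return p[n:].sum()

for n in (10, 100, 1_000, 10_000):
    print(f"n = {n:>6}   loss ~ {expected_loss(n):.4e}")
# Each 10x increase in n cuts the loss by roughly 10**alpha (~3.16x),
# i.e. L(n) ~ n**-alpha: smooth power-law scaling in aggregate, even though
# each individual quantum is learned abruptly -- the source of sudden
# "emergence" of capabilities.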
I've moved to SF and am working at @GoodfireAI this summer! Excited to be here and to spend time with many friends, old and new.
18 replies · 5 retweets · 401 likes
RT @asher5772: It's been a pleasure working on this with @ericjmichaud_ and @tegmark, and I'm excited that it's finally out! In this work, …
0 replies · 3 retweets · 0 likes
CMSP is also a nice generalization of the multitask sparse parity task we studied in our "quanta hypothesis" neural scaling work: there, tasks were independent, but the tasks here have a kind of hierarchical dependence structure, like a "skill tree".
arxiv.org
1 reply · 0 retweets · 16 likes
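Since the tweet contrasts CMSP with the original multitask sparse parity setup, here is a hedged sketch of that original task as described above: each sample activates one task via a control bit, and the label is the parity of that task's fixed secret subset of data bits, with tasks drawn at Zipfian frequencies. All sizes (n_tasks, n_bits, k) and the 1/rank frequencies are illustrative assumptions, and the sketch omits the skill-tree dependence between tasks that CMSP adds.

# Illustrative sketch of multitask sparse parity (independent tasks),
# the task that CMSP generalizes. Sizes and frequencies are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_bits, k = 8, 32, 3                  # hypothetical dimensions
# Each task has a fixed secret subset of k data bits.
subsets = [rng.choice(n_bits, size=k, replace=False) for _ in range(n_tasks)]

freqs = 1.0 / np.arange(1, n_tasks + 1)        # Zipfian task frequencies
freqs /= freqs.sum()

def sample(batch: int):
    tasks = rng.choice(n_tasks, size=batch, p=freqs)
    control = np.zeros((batch, n_tasks), dtype=np.int8)
    control[np.arange(batch), tasks] = 1       # one-hot control bits
    data = rng.integers(0, 2, size=(batch, n_bits), dtype=np.int8)
    # Label = parity (sum mod 2) of the active task's secret subset.
    labels = np.array([data[i, subsets[t]].sum() % 2
                       for i, t in enumerate(tasks)])
    return np.concatenate([control, data], axis=1), labels

X, y = sample(4)
print(X.shape, y)                              # (4, 40) inputs, 4 parity labels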
Before jumping in, here is the arxiv link. Many thanks to co-authors @asher5772 and @tegmark :)
arxiv.org
We study the problem of creating strong, yet narrow, AI systems. While recent AI progress has been driven by the training of large general-purpose foundation models, the creation of smaller models...
1 reply · 1 retweet · 38 likes