Alec Dewulf
@AlecDewulf
Followers: 58
Following: 107
Media: 0
Statuses: 15
Today we're very happy to announce that we're launching the Tilde Fellowship Program to support research into a mechanistic understanding of pre-training science (arch, optimizers, learning dynamics, etc.). Much of modern ML progress has come from scaling models and empirically
2
12
132
A nice general framework for understanding the recent Manifold Muon and designing optimizers on other manifolds from @SolidlySheafy
Modern optimizers can struggle with unstable training. Building off of Manifold Muon, we explore more lenient mechanisms for constraining the geometry of a neural network's weights directly through their Gram matrix 🧠 A 🧵… ~1/6~
0
0
0
Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!
226
792
6K
This. These guys cook.
The upcoming arrow redesign will focus on diagramming (connecting shape -> shape). In cases where you're allowed to go inside, there's a timeout so you can still easily bind to the shape outline, but also bind inside it. (You'll always be able to bind inside any time by using
0
1
3
Check out @nathancgy4's awesome Deltaformer PR and stay tuned for a post on the architecture soon!
SEED's paper on associative memory and DeltaFormer is still one of my favorites 🎉 so I'm happy to share that DeltaFormer is now supported on FLA (flash linear attention)! Learned an incredible amount from @yzhang_cs and Mingyu
0
2
21
Have really enjoyed learning from Alec, hope people like the post!
Vignette #2 is here! Join @AlecDewulf to:
Learn about circuit complexity theory
Derive theoretical capabilities and limitations of transformers
Discuss the future of theoretical computer science in architecture design
A thread 🧵
0
1
5
We'll be at the Berkeley EECS Career Fair this Tuesday & Wednesday with cool custom puzzles (and prizes). @berkeley_ai @UCBerkeley
0
3
16
Surface of Mars captured by the Curiosity rover 🔴 It's 🤯, I'm looking at another planet. Millions of miles away. In space. I am looking at the surface of a planet people only managed to peek at a few hundred years ago. The world is tiny compared to what's out there. @elonmusk
3K
24K
129K