⿻ Andrew Trask
@iamtrask
Followers: 79K · Following: 13K · Media: 63 · Statuses: 2K
i teach AI on X. Building @openminedorg, @GoogleDeepMind, @OxfordUni. Also @UN @GovAI_ @CFR_org. I like to train federated/decentralized neural nets.
Oxford, UK
Joined November 2012
If writing technical blogs/tutorials on AI (decentralized / federated / privacy-preserving / etc.) that get on #HackerNews / #Reddit / X sounds like a fun day job... DM me. (I would mentor you.)
Thank you to the researchers at @a16z and @EpochAIResearch whose excellent reports underpin this piece in important ways.
I've just drafted a new blogpost "GPU demand is (~1Mx) distorted by efficiency problems which are being solved" Mid-2024, Andrej Karpathy trained GPT-2 for $20. Six months later, Andreessen Horowitz reported LLM costs falling 10x annually. Two months after that, DeepSeek
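A quick back-of-envelope, in Python, connecting the a16z figure to the ~1Mx number in the title (my own sketch, not from the draft):

```python
# If costs fall ~10x per year (the a16z figure), how many years of
# compounding does it take to reach the ~1Mx distortion in the title?
import math

annual_drop = 10          # ~10x annual cost decline
target = 1_000_000        # the ~1Mx figure

years = math.log(target) / math.log(annual_drop)
print(f"~{years:.0f} years => {annual_drop ** round(years):,}x")  # ~6 years => 1,000,000x
```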
(also to be fair to Dario... we did take advantage of the moment to get a bit of press https://t.co/6yEq4gzhKk)
spectrum.ieee.org: Digital Reasoning has trained a record-breaking artificial intelligence neural network that is 14 times larger than Google's previous record
(we also blew away SOTA that year... so accuracy did in fact count)
President Trump says it perfectly. A patchwork of 50 different state systems creates a maze of conflicting regulations, resulting in chaos. Follow me to join the conversation on leading the AI revolution.
Actually what he really said was closer to... "anyone can load a bunch of weights into memory... accuracy is what counts"
~10 years ago I trained a 160 billion parameter LLM and we published it as our first ICML paper. I got to have drinks with the chair of ICML that year, and I remember bragging about size... he quickly corrected me: "Size doesn't matter... accuracy does." This reminded me of that.
"100 million words context window is already possible, which is roughly what a human hears in a lifetime. Inference support is the only bottleneck to achieve it. And AI Models actually do learn during the context window, without changing the weights." ~ Anthropic CEO Dario
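The "lifetime of hearing" comparison checks out at the order-of-magnitude level. A rough sketch with my own assumed numbers (published estimates of words heard per day vary widely):

```python
# Back-of-envelope: words a person hears in a lifetime.
words_per_day = 10_000            # assumed average speech heard per day
lifetime_years = 80               # assumed lifespan
lifetime_words = words_per_day * 365 * lifetime_years
print(f"{lifetime_words:,} words")  # 292,000,000 -- same order as 100M
```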
I've just drafted a new blogpost "The Bitter Lesson's Bitter Lesson" Richard Sutton and Dwarkesh discussed the Bitter Lesson, where Richard argued that babies and animals don’t learn through imitation, so state-of-the-art LLMs are pursuing the wrong path by imitating humans
Dwarkesh and I had a frank exchange of views. I hope we moved the conversation forward. Dwarkesh is a true gentleman.
For example, if you have dots on an XY-plane, and you fit a model to them... drawing a line between the dots... you can then use that model to generate more dots which didn't exist before. And if you train a model on that new line + dots, the model can be *smaller* and *more
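A minimal sketch of that picture (my own toy example, not from the thread): fit a line to noisy dots, generate new dots from the fit, and note that the synthetic dots carry less noise than the originals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Original noisy "dots" on an XY-plane: y = 2x + 1 plus noise.
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=2.0, size=x.shape)

# Fit a model to the dots (here: a straight line).
slope, intercept = np.polyfit(x, y, deg=1)

# Generate synthetic dots from the fitted line at new x positions.
x_new = np.linspace(0, 10, 200)
y_synth = slope * x_new + intercept

# The synthetic dots sit on the fitted line, so a second model trained
# on them sees a much cleaner target than the original noisy dots.
print(f"residual std, original dots:  {np.std(y - (2 * x + 1)):.2f}")
print(f"residual std, synthetic dots: {np.std(y_synth - (2 * x_new + 1)):.2f}")
```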
Refinement: synthetic data is a compressed version of the original training data. When synthetic data does a good job, that lossy compression removes noise. But the synthetic data isn't "smarter" than the original. It's just easier to train on because it's better
IMO — biggest misunderstanding in AI right now... Synthetic data isn't synthetic data. It's a cleaner version of the original data. It was named poorly.
IMO — Ilya is wrong
- Frontier LLMs are trained on ~200 TBs of text
- There's ~200 Zettabytes of data out there
- That's about 1 billion times more data
- It doubles every 2 years
The problem is the data is private. Can't scrape it. The problem is not data scarcity, it's
Ilya Sutskever made a rare appearance at NeurIPS. He said the internet is the fossil fuel of AI, that we are at peak data, and that 'Pre-training as we know it will unquestionably end'.
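The ratio is easy to verify (a quick sketch of my own):

```python
# ~200 TB of training text vs ~200 ZB of data in existence.
TB = 10**12   # terabyte, in bytes
ZB = 10**21   # zettabyte, in bytes

ratio = (200 * ZB) / (200 * TB)
print(f"{ratio:,.0f}x more data")  # 1,000,000,000x (about a billion)
```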
fwiw - this is (more or less) my PhD thesis in a podcast. spent about 8 yrs @UniofOxford compressing what's going on in:
- deep learning
- cryptography
- distributed systems
into an alternate view of where AI is going. it's *quite* different from normal AI narratives. i hope
IMO — Decentralized AI is more than:
- an AI model in the sky, with good external auditing
- an AI model in the sky, which people vote on how to use
- an AI model in the sky, which is free for anyone to use
- open source AI
- federated training
None of these are truly an
nevertheless - i appreciate @uwwgo and @0xkkonrad's demo and effort. i aspire to use X to teach important AI/ML concepts like disinformation detection and this is a lovely example. hope you don't mind if i use the opportunity to share. 🙏