⿻ Andrew Trask
@iamtrask
Followers: 79K · Following: 13K · Media: 63 · Statuses: 2K
i teach AI on X. Building @openminedorg, @GoogleDeepMind, @OxfordUni. Also @UN @GovAI_ @CFR_org. I like to train federated/decentralized neural nets.
Oxford, UK
Joined November 2012
If writing technical blogs/tutorials on AI (decentralized / federated / privacy-preserving / etc.) that get on #HackerNews / #Reddit / X sounds like a fun day job... DM me. (I would mentor you.)
Thank you to the researchers at @a16z and @EpochAIResearch whose excellent reports underpin this piece in important ways.
I've just drafted a new blogpost "GPU demand is (~1Mx) distorted by efficiency problems which are being solved" Mid-2024, Andrej Karpathy trained GPT-2 for $20. Six months later, Andreessen Horowitz reported LLM costs falling 10x annually. Two months after that, DeepSeek
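A quick back-of-envelope, in Python, connecting the a16z figure to the ~1Mx number in the title (my own sketch, not from the draft):

```python
# If costs fall ~10x per year (the a16z figure), how many years of
# compounding does it take to reach the ~1Mx distortion in the title?
import math

annual_drop = 10          # ~10x annual cost decline
target = 1_000_000        # the ~1Mx figure

years = math.log(target) / math.log(annual_drop)
print(f"~{years:.0f} years => {annual_drop ** round(years):,}x")  # ~6 years => 1,000,000x
```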
(also to be fair to Dario... we did take advantage of the moment to get a bit of press https://t.co/6yEq4gzhKk)
spectrum.ieee.org: Digital Reasoning has trained a record-breaking artificial intelligence neural network that is 14 times larger than Google's previous record
(we also blew away SOTA that year... so accuracy did in fact count)
President Trump says it perfectly. A patchwork of 50 different state systems creates a maze of conflicting regulations, resulting in chaos. Follow me to join the conversation on leading the AI revolution.
Actually what he really said was closer to... "anyone can load a bunch of weights into memory... accuracy is what counts"
~10 years ago I trained a 160 billion parameter LLM and we published it as our first ICML paper. I got to have drinks with the chair of ICML that year, and I remember bragging about size... he quickly corrected me: "Size doesn't matter... accuracy does." This reminded me of that.
"100 million words context window is already possible, which is roughly what a human hears in a lifetime. Inference support is the only bottleneck to achieve it. And AI Models actually do learn during the context window, without changing the weights." ~ Anthropic CEO Dario
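The "lifetime of hearing" comparison checks out at the order-of-magnitude level. A rough sketch with my own assumed numbers (published estimates of words heard per day vary widely):

```python
# Back-of-envelope: words a person hears in a lifetime.
words_per_day = 10_000            # assumed average speech heard per day
lifetime_years = 80               # assumed lifespan
lifetime_words = words_per_day * 365 * lifetime_years
print(f"{lifetime_words:,} words")  # 292,000,000 -- same order as 100M
```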
I've just drafted a new blogpost "The Bitter Lesson's Bitter Lesson" Richard Sutton and Dwarkesh discussed the Bitter Lesson, where Richard argued that babies and animals don’t learn through imitation, so state-of-the-art LLMs are pursuing the wrong path by imitating humans
Dwarkesh and I had a frank exchange of views. I hope we moved the conversation forward. Dwarkesh is a true gentleman.
For example, if you have dots on an XY-plane, and you fit a model to them... drawing a line between the dots... you can then use that model to generate more dots which didn't exist before. And if you train a model on that new line + dots, the model can be *smaller* and *more
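A minimal sketch of that picture (my own toy example, not from the thread): fit a line to noisy dots, generate new dots from the fit, and note that the synthetic dots carry less noise than the originals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Original noisy "dots" on an XY-plane: y = 2x + 1 plus noise.
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=2.0, size=x.shape)

# Fit a model to the dots (here: a straight line).
slope, intercept = np.polyfit(x, y, deg=1)

# Generate synthetic dots from the fitted line at new x positions.
x_new = np.linspace(0, 10, 200)
y_synth = slope * x_new + intercept

# The synthetic dots sit on the fitted line, so a second model trained
# on them sees a much cleaner target than the original noisy dots.
print(f"residual std, original dots:  {np.std(y - (2 * x + 1)):.2f}")
print(f"residual std, synthetic dots: {np.std(y_synth - (2 * x_new + 1)):.2f}")
```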
Refinement: synthetic data is a compressed version of the original training data. When synthetic data does a good job, that lossy compression removes noise. But the synthetic data isn't "smarter" than the original. It's just easier to train on because it's better
IMO — biggest misunderstanding in AI right now... Synthetic data isn't synthetic data. It's a cleaner version of the original data. It was named poorly.
IMO — Ilya is wrong
- Frontier LLMs are trained on ~200 TBs of text
- There's ~200 Zettabytes of data out there
- That's about 1 billion times more data
- It doubles every 2 years
The problem is the data is private. Can't scrape it. The problem is not data scarcity, it's
Ilya Sutskever made a rare appearance at NeurIPS. He said the internet is the fossil fuel of AI, that we are at peak data, and that 'Pre-training as we know it will unquestionably end'.
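The ratio is easy to verify (a quick sketch of my own):

```python
# ~200 TB of training text vs ~200 ZB of data in existence.
TB = 10**12   # terabyte, in bytes
ZB = 10**21   # zettabyte, in bytes

ratio = (200 * ZB) / (200 * TB)
print(f"{ratio:,.0f}x more data")  # 1,000,000,000x (about a billion)
```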
fwiw - this is (more or less) my PhD thesis in a podcast. spent about 8 yrs @UniofOxford compressing what's going on in:
- deep learning
- cryptography
- distributed systems
into an alternate view of where AI is going. it's *quite* different from normal AI narratives. i hope
IMO — Decentralized AI is more than:
- an AI model in the sky, with good external auditing
- an AI model in the sky, which people vote on how to use
- an AI model in the sky, which is free for anyone to use
- open source AI
- federated training
None of these are truly an
nevertheless - i appreciate @uwwgo and @0xkkonrad's demo and effort. i aspire to use X to teach important AI/ML concepts like disinformation detection and this is a lovely example. hope you don't mind if i use the opportunity to share. 🙏