anandmaj

@Almondgodd

Followers: 2K
Following: 22K
Media: 30
Statuses: 256

path of childhood's end | gap @penn | prev ai @tesla_optimus @dynarobotics

sf
Joined February 2019
@Almondgodd
anandmaj
1 month
I spent the past month reimplementing DeepMind’s Genie 3 world model from scratch. Ended up making TinyWorlds, a 3M parameter world model capable of generating playable game environments. Demo below + everything I learned in thread (full repo at the end)👇🏼
97
269
2K
@gvnjji
gunje
1 day
When we prepped for swe jobs, we never leetcoded alone. Every session was a mock interview. That’s how we learned to think aloud and get offers. Grinding leetcode doesn’t teach you how to talk through code. That’s why we built Leo, your 24/7 mock coding interviewer that talks
3
2
5
@Almondgodd
anandmaj
5 days
what's a good nootropic for someone just getting into nootropics? (nicotine gum, racetams, L-theanine, something else worth trying?)
5
0
11
@SourishJasti
Sourish Jasti
7 days
1/ The future of general-purpose robotics will be decided by one major question: which flavor of data scales reasoning? Every major lab represents a different bet. Over the past 3 months, @adam_patni, @vriishin, and I read the core research papers, spoke with staff at the major
61
192
769
@Almondgodd
anandmaj
18 days
Scaling Era finally came and it’s a work of art with so many gems
0
0
14
@Almondgodd
anandmaj
1 month
9/ Finally, here’s tinyworlds: https://t.co/Zdf1j6y2FR It’s a minimal codebase to help people understand world modeling. Try it out yourself and make a PR, there are many easy + impactful additions to make (such as Mixture of Experts, Muon, and scaling). Thank you to @runpod_io
github.com
A minimal implementation of DeepMind's Genie world model - AlmondGod/tinyworlds
4
9
172
@Almondgodd
anandmaj
1 month
8/ training the world generator Lastly, I trained the dynamics model that predicts the next frame. > In training, it predicts masked tokens > In inference, we add masked frames and it autoregressively decodes them. When I first trained the dynamics model, loss plateaued early
1
0
51
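The masked-token training objective described in 8/ can be sketched roughly as below. This is a minimal illustration in plain Python, not the TinyWorlds code: the `MASK` id, the `mask_tokens` and `masked_cross_entropy` helpers, and the masking probability are all hypothetical names and values chosen for the example. The idea is only that some tokens are hidden and the loss is averaged over the hidden positions.

```python
import math
import random

MASK = -1  # hypothetical placeholder id for a hidden token (assumption)


def mask_tokens(tokens, mask_prob, rng):
    """Hide each token independently with probability mask_prob.

    Returns the masked sequence and the list of positions that were hidden.
    """
    masked, positions = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            positions.append(i)
        else:
            masked.append(tok)
    return masked, positions


def masked_cross_entropy(probs, targets, positions):
    """Average negative log-likelihood over the hidden positions only.

    probs[i] is a list of predicted probabilities over the vocabulary
    at position i; targets[i] is the true token id at position i.
    """
    if not positions:
        return 0.0
    total = -sum(math.log(probs[i][targets[i]]) for i in positions)
    return total / len(positions)


# Usage: mask a toy token sequence, then score a stand-in "model"
# that predicts a uniform distribution over a 4-token vocabulary.
tokens = [3, 1, 2, 0]
masked, positions = mask_tokens(tokens, mask_prob=0.5, rng=random.Random(0))
uniform = [[0.25] * 4 for _ in tokens]
loss = masked_cross_entropy(uniform, tokens, positions)
```

At inference time, the same interface would presumably be reused by appending a fully masked frame and filling it in step by step, but that loop depends on details the thread does not show here.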
@Almondgodd
anandmaj
1 month
7/ training the action tokenizer The action tokenizer is the model that creates action labels and allows us to train on unlabeled video. > From raw video, it predicts the action that happened between two frames. > This lets dynamics learn to listen to actions without actually
2
0
44
@Almondgodd
anandmaj
1 month
6/ training the video tokenizer The first module, the video tokenizer, compresses videos into tokens using: > Convolutions to transform images into vectors representing each section of the image > ST transformer to let each vector share information > FS Quantization to turn the
1
1
48
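For readers unfamiliar with the "FS Quantization" step in 6/, here is a minimal sketch of finite scalar quantization in plain Python. It is an illustration under stated assumptions, not the TinyWorlds implementation: the function names, the choice of three latent dimensions, and the five levels per dimension are hypothetical, and only odd level counts are handled to keep the rounding simple. Each latent dimension is squashed into a bounded range, rounded to the nearest integer level, and the resulting per-dimension codes are combined into a single token id.

```python
import math


def fsq_quantize(z, levels):
    """Quantize one latent vector z (list of floats).

    levels[d] is the number of allowed values in dimension d
    (assumed odd here, so codes are integers centered on 0).
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) / 2          # 5 levels -> codes in [-2, 2]
        bounded = math.tanh(value) * half  # squash into (-half, half)
        codes.append(int(round(bounded)))  # snap to the nearest level
    return codes


def fsq_token_id(codes, levels):
    """Combine per-dimension codes into one integer token (mixed radix)."""
    token, base = 0, 1
    for code, n_levels in zip(codes, levels):
        digit = code + (n_levels - 1) // 2  # shift to the range 0..n_levels-1
        token += digit * base
        base *= n_levels
    return token


# Usage: three dimensions with five levels each give 5 * 5 * 5 = 125 tokens.
levels = [5, 5, 5]
codes = fsq_quantize([0.0, 10.0, -10.0], levels)  # [0, 2, -2]
token = fsq_token_id(codes, levels)               # 22
```

In a trained model the rounding step is normally paired with a straight-through gradient estimator so the encoder can still learn; that part is omitted because it needs an autodiff framework.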
@Almondgodd
anandmaj
1 month
5/ quantizing video into tokens For the video and action tokenizers, we need a quantization method to produce tokens. Tokenizers represent videos as compressed data (like zip files). They learn by finding a set of small building blocks that makes reconstructing the video easy.
1
0
54
@Almondgodd
anandmaj
1 month
4/ designing the architecture Next, I adapted Genie's high-level architecture to TinyWorlds. I considered using either: > Diffusion: where we start with noise and slowly remove it until we have a completed video sequence. > Autoregression: where we predict small chunks of video
1
0
57
@Almondgodd
anandmaj
1 month
3/ building the space-time transformer Normal transformers in LLMs understand language, which is 1D. TinyWorlds requires a model that understands video, which is 3D (height, width, time). This model also has to train quickly and learn using both actions and video. Space-time
3
0
79
@Almondgodd
anandmaj
1 month
2/ building the dataset of worlds Before training TinyWorlds, I decided what video game worlds my model should generate by building the dataset. The set of worlds the model sees in training determines what worlds it generates. I created TinyWorlds' dataset by processing
1
0
65
@Almondgodd
anandmaj
1 month
1/ understanding world models World models are neural networks that simulate physical worlds by generating videos. DeepMind’s Genie 3 proved that, just like LLMs, scaled-up world models exhibit emergent behavior: > Controllability: Pressing the right arrow makes the camera pan
1
5
90
@Almondgodd
anandmaj
1 month
The word clanker and its consequences have been a disaster for the robot race
@TheHumanoidHub
The Humanoid Hub
1 month
Dynamic control trained at SUSTech’s ACT Lab in Shenzhen.
1
0
16
@Almondgodd
anandmaj
2 months
I love meeting new people, reach out if you’re in the bay :)
1
0
19
@Almondgodd
anandmaj
2 months
Thank you so much to my incredible mentors @julianibarz @ashishkr9311 @kamalgupta09 @Yi__Li and many more.
1
0
19
@Almondgodd
anandmaj
2 months
Over the past 3 months I’ve been interning @tesla_optimus to build AGI for the real world. Robotics is the hardest frontier of AI, but it gives us a clear path to eliminating scarcity. After working on Optimus, I’ve never been more confident that universal abundance is within
32
8
368
@Almondgodd
anandmaj
2 months
I bet people in the future will do caveman to agi any% speed runs for fun
0
0
19