Oscar Michel
@ojmichel4
Followers
497
Following
145
Media
10
Statuses
26
PhD student at NYU researching world models
Joined November 2021
Current world models aren't really modeling the world; they're modeling one agent's view of it. Partial observations ≠ world state. Future world models will be independent of any one agent's perspective. You will be able to "drop in" any number of agents at any point in time,
13
87
613
i'm joining forces with @ylecun and an incredible group of people to start AMI Labs @amilabs. AMI isn't a conventional lab. we don't intend to become one. a lot to say about why this moment matters, but for now we're heads down building. join us:
amilabs.xyz
AMI - Advanced Machine Intelligence - builds world-model-based AI that understands the real world. We develop safe, controllable intelligent systems for industry, robotics, healthcare, and beyond.
Advanced Machine Intelligence (AMI) is building a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe. We've raised a $1.03B (~€890M) round from global investors who believe in our vision of universally
153
162
3K
Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and
34
222
1K
Here are some of my favorite results. On top, the model has learned to decrease the inventory counter when placing blocks. On the bottom, an emergent effect: rain starts simultaneously for both players, showing the model has learned a shared world state. For many more
1
3
38
Self Forcing gives a huge improvement in quality! Here we see the same PvP sequence before and after Checkpointed Self Forcing. [9/10]
3
1
33
We also introduce Checkpointed Self Forcing, a memory-efficient technique for applying Self Forcing when the teacher's context is longer than the student's. Ordinarily this causes excessive memory usage during backpropagation (a problem also noted by concurrent work RELIC). Our
1
1
14
We train in stages. 1. First, we finetune a pre-trained DiT on single-player Minecraft data from VPT. This step is critical for learning core Minecraft dynamics (we show this experimentally). 2. Then we introduce multiplayer data and train a bidirectional teacher. 3. From it we
1
2
17
Our architecture starts from a pre-trained single-player model (MatrixGame 2.0) and adapts it to multiplayer with a simple change: extend self-attention to all player tokens. Everything else stays the same and is applied independently per player. This means we can initialize
1
1
15
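A minimal JAX sketch of the idea in the post above: self-attention is extended across all player tokens while everything else stays per-player. This is not the Solaris code. The function names, the single-head attention, and the tensor shapes are assumptions made for illustration only.

```python
import jax
import jax.numpy as jnp


def attention(x, wq, wk, wv):
    """Single-head self-attention over a flat token sequence x of shape (N, D)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


def per_player_attention(tokens, wq, wk, wv):
    """Single-player behaviour: each player attends only to its own tokens.

    tokens has shape (P, T, D): P players, T tokens per player, D channels.
    """
    return jax.vmap(lambda x: attention(x, wq, wk, wv))(tokens)


def joint_attention(tokens, wq, wk, wv):
    """Multiplayer variant (hypothetical): attend over all players' tokens at once."""
    p, t, d = tokens.shape
    out = attention(tokens.reshape(p * t, d), wq, wk, wv)
    return out.reshape(p, t, d)


# Toy example with 2 players, 4 tokens each, 8 channels (all values are made up).
k_tok, k_q, k_k, k_v = jax.random.split(jax.random.PRNGKey(0), 4)
tokens = jax.random.normal(k_tok, (2, 4, 8))
wq = 0.1 * jax.random.normal(k_q, (8, 8))
wk = 0.1 * jax.random.normal(k_k, (8, 8))
wv = 0.1 * jax.random.normal(k_v, (8, 8))

solo = per_player_attention(tokens, wq, wk, wv)
joint = joint_attention(tokens, wq, wk, wv)
```

In this sketch both variants use projection weights of identical shape, so the only difference is which tokens each query can see. With `joint_attention`, changing one player's tokens changes the other player's output; with `per_player_attention` it does not.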
We used SolarisEngine to collect 12.6 million frames of multiplayer gameplay in less than 12 hours on just two RTX 4090 machines. Here's a sample of what it looks like (each column is a different episode type). [5/10]
2
1
17
Multiplayer data is hard to get. There are great existing frameworks for AI in Minecraft, but none could generate realistic multiplayer gameplay with aligned actions and observations out of the box. So we built SolarisEngine to do exactly that. It pairs action bots with camera
1
3
16
Why Minecraft? It's open-ended, 3D, and contains infinite challenging tasks. People literally build working computers in it. It's a natural testbed for some of the hardest open problems in AI: spatial reasoning, continual learning, multiagent collaboration, and more. And it's
1
0
21
Toward this, we built Solaris, a multiplayer video world model in Minecraft. It generates action-conditioned videos of two players playing Minecraft in a shared world and can simulate multiplayer gameplay including movement, building, mining, fighting and more. In conjunction we
1
1
40
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. (1/n)
57
324
2K
Excited to introduce DiffuseNNX, a comprehensive JAX/Flax NNX-based library for diffusion and flow matching! It supports multiple diffusion / flow-matching frameworks, Autoencoders, DiT variants, and sampling algorithms. Repo: https://t.co/zOcA6nyrcM Delve into details below!
github.com
A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its variants as the primary backbone with support for ImageNet train...
4
50
220
In recent days, the agreement to begin the peace process has brought a spark of hope to the Holy Land. I encourage the parties involved to continue courageously along the path that has been set out, toward a just, lasting, and respectful #Peace that honors the legitimate
617
1K
8K