Joseph Suarez 🐔

@jsuarez5341

Followers
17K
Following
6K
Media
622
Statuses
7K

I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. https://t.co/z468O4HDxF

Joined March 2019
@jsuarez5341
Joseph Suarez 🐔
2 months
PufferLib 3.0: We trained reinforcement learning agents on 1 Petabyte / 12,000 years of data with 1 server. Now you can, too! Our latest release includes algorithmic breakthroughs, massively faster training, and 10 new environments. Live demos on our site. Volume on for trailer!
28
96
724
@jsuarez5341
Joseph Suarez 🐔
9 hours
SOTA RL methods promise ~200x sample efficiency for >200x compute. They also only work on paper. I think I can get ~20x sample efficiency for <20x compute. This will end up being a better trade-off for most practical problems because you can spend the rest of it on more sims.
3
0
48
@jsuarez5341
Joseph Suarez 🐔
18 hours
PufferLib - Sample efficiency in the dumbest way possible
0
1
12
@jsuarez5341
Joseph Suarez 🐔
23 hours
I've had several people ask me if I train big RL models as a measure of competency. But once you get under the size that fits on one card, it's much more impressive to train small models! So much more to optimize with 100k params than even 100m.
4
2
93
@jsuarez5341
Joseph Suarez 🐔
23 hours
I'm hard at work doing research for the next version!
@yacineMTB
kache
23 hours
Really appreciate pufferlib for RL. 3 months ago I had no idea how RL worked. really great reference, really simply written. I could just read through it and learn the state of the art. like it really is just the state of the art there for you to use!
2
2
173
@jsuarez5341
Joseph Suarez 🐔
2 days
Nice writeup by a PufferLib contributor!
@BoxingBytes
BoxingBytes
2 days
Start building RL with puffer on GPU for brookies today:
0
1
30
@jsuarez5341
Joseph Suarez 🐔
2 days
I am sick and tired of clever algorithms that only work on paper. Let's see how far we can push RL sample efficiency by being as dumb as possible. PufferLib's normal 850+ score solves use 90m samples. Here's my first ever experiment using 10m.
2
3
144
@jsuarez5341
Joseph Suarez 🐔
3 days
PufferLib - Where do we take RL from here?
1
0
19
@jsuarez5341
Joseph Suarez 🐔
3 days
Don't slouch. Clean your room. Don't mix beer and wine. Don't train on-policy RL with off-policy data. I'm sick of rules!
14
6
262
@jsuarez5341
Joseph Suarez 🐔
4 days
PufferLib - the hard problem in RL
0
1
13
@jsuarez5341
Joseph Suarez 🐔
5 days
Negative results on the last week of world modeling. I got a few things to work, but it seems the fundamental problem is a huge compute gap required vs. on-policy RL in the high data regime. The knob between compute and data needs to scale smoothly. May do an article.
7
2
74
@jsuarez5341
Joseph Suarez 🐔
5 days
PufferLib - world models get another chance
0
0
9
@jsuarez5341
Joseph Suarez 🐔
5 days
PufferLib - world models get another chance
1
0
8
@jsuarez5341
Joseph Suarez 🐔
5 days
RT @KinvertOG: @rankdim I couldn't get it to work 3 years ago but I got it working now specifically because of @jsuarez5341's PufferLib. I…
0
1
0
@jsuarez5341
Joseph Suarez 🐔
6 days
PufferLib - world models get another chance
1
0
13
@jsuarez5341
Joseph Suarez 🐔
7 days
PufferLib - world models get another chance
1
0
9
@jsuarez5341
Joseph Suarez 🐔
8 days
PufferLib - world models get another chance
1
0
16
@jsuarez5341
Joseph Suarez 🐔
9 days
Latest article from Spencer on making a 3rd gen driving sim 10x faster and improving the RL with it!
@spenccheng
Spencer Cheng
9 days
1
7
69
@jsuarez5341
Joseph Suarez 🐔
10 days
PufferLib - world models get another chance
1
0
19
@jsuarez5341
Joseph Suarez 🐔
11 days
PufferLib - world models get another chance
0
0
9
@jsuarez5341
Joseph Suarez 🐔
13 days
PufferLib - world models get another chance
0
0
11