Kevin Wang Profile
Kevin Wang

@kevin_wang3290

Followers: 198
Following: 2K
Media: 8
Statuses: 12

CS @Princeton '22–'25, research Princeton RL + @princeton_nlp | prev quant intern @citsecurities

Princeton, NJ
Joined June 2023
@kevin_wang3290
Kevin Wang
4 months
9/ Thanks to amazing collaborators @IJ_Apps, @m_bortkiewicz, @tomasztrzcinsk1, and @ben_eysenbach. Please check out our paper and website for more details! Paper: Website: Code:
1
0
16
@kevin_wang3290
Kevin Wang
4 months
8/ One negative result: In preliminary experiments using OGBench, we evaluated depth scaling in offline goal-conditioned RL. We found that increasing network depth didn’t improve offline performance, hinting that the benefits of depth in the online setting may partly arise from enhanced exploration.
1
0
9
@kevin_wang3290
Kevin Wang
4 months
7/ Prior work has found success in scaling model width. In our experiments, scaling width also helps CRL’s performance, but scaling depth yields greater performance and better parameter efficiency (i.e., similar performance with 50× smaller models); a rough parameter-count sketch follows below.
Tweet media one
1
0
11
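To make the parameter-efficiency point concrete, here is a minimal sketch of how plain-MLP parameter counts compare for a wide-but-shallow network versus a narrow-but-deep one. The layer sizes below are illustrative assumptions, not the configurations used in the paper.

```python
# Hypothetical illustration of the parameter-efficiency point: a narrow-but-deep
# MLP can use far fewer weights than a wide-but-shallow one of comparable size class.
# All dimensions here are made up for illustration.

def mlp_params(input_dim, hidden_dim, depth, output_dim):
    """Rough parameter count for a plain MLP: weights + biases per layer."""
    sizes = [input_dim] + [hidden_dim] * depth + [output_dim]
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))

# Wide and shallow: 2 hidden layers of width 4096.
wide = mlp_params(input_dim=64, hidden_dim=4096, depth=2, output_dim=64)
# Narrow and deep: 64 hidden layers of width 256.
deep = mlp_params(input_dim=64, hidden_dim=256, depth=64, output_dim=64)

print(f"wide-shallow: {wide:,} params")  # ~17M
print(f"narrow-deep:  {deep:,} params")  # ~4M
```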
@kevin_wang3290
Kevin Wang
4 months
6/ We started this project by studying a different axis of scaling and initially found that scaling batch size had little effect on performance. However, when revisiting the experiment, we found that scaling batch size can significantly improve performance when using deep networks.
Tweet media one
1
0
9
@kevin_wang3290
Kevin Wang
4 months
5/ Scaling network depth also yields improved generalization capabilities (stitching). When tested on start-goal pairs unseen during training, deeper networks succeeded on a higher fraction of tasks than shallower networks.
Tweet media one
1
0
8
@kevin_wang3290
Kevin Wang
4 months
4/ Deeper networks learn better contrastive representations. In this navigation task, Depth-4 networks naively approximate Q-values using Euclidean distance to the goal, while Depth-64 networks capture the maze topology, with high Q-values outlining the viable path (see the critic sketch below).
Tweet media one
1
0
9
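As a rough illustration of how a contrastive critic scores state-action/goal pairs, here is a minimal JAX sketch in which the Q-value is the inner product of two learned representations and the training signal is an InfoNCE-style classification loss. The encoders, dimensions, and loss below are simplified stand-ins, not the paper's exact implementation.

```python
# Minimal sketch of a contrastive RL (CRL) critic built from representations.
# phi and psi below are stand-in random linear maps; in the actual method they are
# deep networks trained contrastively, and phi(s,a)·psi(g) plays the role of the Q-value.

import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)

STATE_ACTION_DIM, GOAL_DIM, REPR_DIM = 32, 16, 64

# Stand-in "encoders": single random linear maps (the real phi/psi are deep MLPs).
W_phi = jax.random.normal(k1, (STATE_ACTION_DIM, REPR_DIM)) / jnp.sqrt(STATE_ACTION_DIM)
W_psi = jax.random.normal(k2, (GOAL_DIM, REPR_DIM)) / jnp.sqrt(GOAL_DIM)

def phi(sa):   # representation of a (state, action) pair
    return sa @ W_phi

def psi(g):    # representation of a goal
    return g @ W_psi

def critic(sa, g):
    """Contrastive critic: inner product of the two representations."""
    return jnp.dot(phi(sa), psi(g))

# Batch of B state-action pairs and B goals -> B x B matrix of critic scores;
# diagonal entries correspond to "positive" (matched) pairs.
B = 4
sa_batch = jax.random.normal(k1, (B, STATE_ACTION_DIM))
g_batch = jax.random.normal(k2, (B, GOAL_DIM))
logits = phi(sa_batch) @ psi(g_batch).T        # (B, B)

# InfoNCE-style objective: classify the matched goal among the batch.
loss = -jnp.mean(jnp.diag(jax.nn.log_softmax(logits, axis=-1)))
print(float(critic(sa_batch[0], g_batch[0])), logits.shape, float(loss))
```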
@kevin_wang3290
Kevin Wang
4 months
3/ Scaling benefits are higher in complex tasks with high-dimensional inputs. In the Humanoid U-Maze environment where scaling effects were most prominent, we tested the limits of scaling and observed continued performance gains up to 1024 layers!
Tweet media one
Tweet media two
1
1
9
@kevin_wang3290
Kevin Wang
4 months
2/ As we scale network depth, novel behaviors emerge: at depth 4, the Humanoid simply falls toward the goal, while at depth 16 it walks upright. At depth 256 in the Humanoid U-Maze environment, a unique learned policy emerges: the agent learns to propel itself over the maze wall.
Tweet media one
1
1
15
@kevin_wang3290
Kevin Wang
4 months
1/ While most RL methods use shallow MLPs (~2–5 layers), we show that scaling up to 1000 layers for contrastive RL (CRL) can significantly boost performance, from roughly 2× up to 50× on a diverse suite of robotic tasks (see the architecture sketch below). Webpage+Paper+Code:
8
64
387
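For readers curious what a "very deep" MLP might look like in code, here is a minimal JAX sketch of a pre-norm residual MLP. The use of LayerNorm, swish activations, skip connections, and the particular width/depth are assumptions for illustration; the linked paper and code define the actual architecture.

```python
# Minimal sketch of a deep residual MLP of the kind that could be scaled to
# hundreds of layers. Width, depth, and block design here are illustrative only.

import jax
import jax.numpy as jnp

def init_block(key, width):
    k1, k2 = jax.random.split(key)
    scale = 1.0 / jnp.sqrt(width)
    return {
        "w1": jax.random.normal(k1, (width, width)) * scale,
        "b1": jnp.zeros(width),
        "w2": jax.random.normal(k2, (width, width)) * scale,
        "b2": jnp.zeros(width),
    }

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / jnp.sqrt(var + eps)

def residual_block(params, x):
    # Pre-norm residual block: x + MLP(LayerNorm(x)); the skip connection is what
    # keeps gradients usable when stacking hundreds of layers.
    h = layer_norm(x)
    h = jax.nn.swish(h @ params["w1"] + params["b1"])
    h = h @ params["w2"] + params["b2"]
    return x + h

def deep_mlp(params_list, x):
    for p in params_list:
        x = residual_block(p, x)
    return x

key = jax.random.PRNGKey(0)
WIDTH, DEPTH = 256, 64           # illustrative; the thread reports scaling to ~1000 layers
keys = jax.random.split(key, DEPTH)
params = [init_block(k, WIDTH) for k in keys]

x = jax.random.normal(key, (8, WIDTH))   # a batch of 8 input representations
y = deep_mlp(params, x)
print(y.shape)                           # (8, 256)
```

The residual (skip) connection is the design choice doing the heavy lifting in this sketch: without it, plain MLPs of this depth are typically very hard to train.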