Kevin Wang Profile
Kevin Wang

@kevin_wang3290

Followers: 198
Following: 2K
Media: 8
Statuses: 12

CS @Princeton '22–'25, research Princeton RL + @princeton_nlp | prev quant intern @citsecurities

Princeton, NJ
Joined June 2023
@kevin_wang3290
Kevin Wang
4 months
9/ Thanks to amazing collaborators @IJ_Apps, @m_bortkiewicz, @tomasztrzcinsk1, and @ben_eysenbach. Please check out our paper and website for more details! Paper: Website: Code:
1
0
16
@kevin_wang3290
Kevin Wang
4 months
8/ One negative result: In preliminary experiments using OGBench, we evaluated depth scaling in offline goal-conditioned RL. We found that increasing network depth didn’t improve offline performance, hinting that the benefits of depth in the online setting may partly arise from enhanced exploration.
1
0
9
@kevin_wang3290
Kevin Wang
4 months
7/ Prior work has found success in scaling model width. In our experiments, scaling width also helps CRL’s performance, but scaling depth yields greater performance and better parameter efficiency (i.e., similar performance with 50× smaller models); a rough parameter-count sketch follows below.
Tweet media one
1
0
11
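To make the parameter-efficiency point concrete, here is a minimal sketch of how plain-MLP parameter counts compare for a wide-but-shallow network versus a narrow-but-deep one. The layer sizes below are illustrative assumptions, not the configurations used in the paper.

```python
# Hypothetical illustration of the parameter-efficiency point: a narrow-but-deep
# MLP can use far fewer weights than a wide-but-shallow one of comparable size class.
# All dimensions here are made up for illustration.

def mlp_params(input_dim, hidden_dim, depth, output_dim):
    """Rough parameter count for a plain MLP: weights + biases per layer."""
    sizes = [input_dim] + [hidden_dim] * depth + [output_dim]
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))

# Wide and shallow: 2 hidden layers of width 4096.
wide = mlp_params(input_dim=64, hidden_dim=4096, depth=2, output_dim=64)
# Narrow and deep: 64 hidden layers of width 256.
deep = mlp_params(input_dim=64, hidden_dim=256, depth=64, output_dim=64)

print(f"wide-shallow: {wide:,} params")  # ~17M
print(f"narrow-deep:  {deep:,} params")  # ~4M
```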
@kevin_wang3290
Kevin Wang
4 months
6/ We started this project by studying a different axis of scaling and initially found that scaling batch size had little effect on performance. However, when revisiting the experiment, we found that scaling batch size can significantly improve performance when using deep networks.
Tweet media one
1
0
9
@kevin_wang3290
Kevin Wang
4 months
5/ Scaling network depth also yields improved generalization capabilities (stitching). When tested on start-goal pairs unseen during training, deeper networks succeeded on a higher fraction of tasks than shallower networks.
Tweet media one
1
0
8
@kevin_wang3290
Kevin Wang
4 months
4/ Deeper networks learn better contrastive representations. In this navigation task, Depth-4 networks naively approximate Q-values using Euclidean distance to the goal, while Depth-64 networks capture the maze topology, with high Q-values outlining the viable path (see the critic sketch below).
Tweet media one
1
0
9
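As a rough illustration of how a contrastive critic scores state-action/goal pairs, here is a minimal JAX sketch in which the Q-value is the inner product of two learned representations and the training signal is an InfoNCE-style classification loss. The encoders, dimensions, and loss below are simplified stand-ins, not the paper's exact implementation.

```python
# Minimal sketch of a contrastive RL (CRL) critic built from representations.
# phi and psi below are stand-in random linear maps; in the actual method they are
# deep networks trained contrastively, and phi(s,a)·psi(g) plays the role of the Q-value.

import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)

STATE_ACTION_DIM, GOAL_DIM, REPR_DIM = 32, 16, 64

# Stand-in "encoders": single random linear maps (the real phi/psi are deep MLPs).
W_phi = jax.random.normal(k1, (STATE_ACTION_DIM, REPR_DIM)) / jnp.sqrt(STATE_ACTION_DIM)
W_psi = jax.random.normal(k2, (GOAL_DIM, REPR_DIM)) / jnp.sqrt(GOAL_DIM)

def phi(sa):   # representation of a (state, action) pair
    return sa @ W_phi

def psi(g):    # representation of a goal
    return g @ W_psi

def critic(sa, g):
    """Contrastive critic: inner product of the two representations."""
    return jnp.dot(phi(sa), psi(g))

# Batch of B state-action pairs and B goals -> B x B matrix of critic scores;
# diagonal entries correspond to "positive" (matched) pairs.
B = 4
sa_batch = jax.random.normal(k1, (B, STATE_ACTION_DIM))
g_batch = jax.random.normal(k2, (B, GOAL_DIM))
logits = phi(sa_batch) @ psi(g_batch).T        # (B, B)

# InfoNCE-style objective: classify the matched goal among the batch.
loss = -jnp.mean(jnp.diag(jax.nn.log_softmax(logits, axis=-1)))
print(float(critic(sa_batch[0], g_batch[0])), logits.shape, float(loss))
```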
@kevin_wang3290
Kevin Wang
4 months
3/ Scaling benefits are higher in complex tasks with high-dimensional inputs. In the Humanoid U-Maze environment where scaling effects were most prominent, we tested the limits of scaling and observed continued performance gains up to 1024 layers!
Tweet media one
Tweet media two
1
1
9
@kevin_wang3290
Kevin Wang
4 months
2/ As we scale network depth, novel behaviors emerge: at depth 4, the Humanoid simply falls toward the goal, while at depth 16 it walks upright. At depth 256 in the Humanoid U-Maze environment, a unique learned policy emerges: the agent learns to propel itself over the maze wall.
Tweet media one
1
1
15
@kevin_wang3290
Kevin Wang
4 months
1/ While most RL methods use shallow MLPs (~2–5 layers), we show that scaling up to 1000 layers for contrastive RL (CRL) can significantly boost performance, from roughly 2× up to 50× on a diverse suite of robotic tasks (see the architecture sketch below). Webpage+Paper+Code:
8
64
387
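For readers curious what a "very deep" MLP might look like in code, here is a minimal JAX sketch of a pre-norm residual MLP. The use of LayerNorm, swish activations, skip connections, and the particular width/depth are assumptions for illustration; the linked paper and code define the actual architecture.

```python
# Minimal sketch of a deep residual MLP of the kind that could be scaled to
# hundreds of layers. Width, depth, and block design here are illustrative only.

import jax
import jax.numpy as jnp

def init_block(key, width):
    k1, k2 = jax.random.split(key)
    scale = 1.0 / jnp.sqrt(width)
    return {
        "w1": jax.random.normal(k1, (width, width)) * scale,
        "b1": jnp.zeros(width),
        "w2": jax.random.normal(k2, (width, width)) * scale,
        "b2": jnp.zeros(width),
    }

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / jnp.sqrt(var + eps)

def residual_block(params, x):
    # Pre-norm residual block: x + MLP(LayerNorm(x)); the skip connection is what
    # keeps gradients usable when stacking hundreds of layers.
    h = layer_norm(x)
    h = jax.nn.swish(h @ params["w1"] + params["b1"])
    h = h @ params["w2"] + params["b2"]
    return x + h

def deep_mlp(params_list, x):
    for p in params_list:
        x = residual_block(p, x)
    return x

key = jax.random.PRNGKey(0)
WIDTH, DEPTH = 256, 64           # illustrative; the thread reports scaling to ~1000 layers
keys = jax.random.split(key, DEPTH)
params = [init_block(k, WIDTH) for k in keys]

x = jax.random.normal(key, (8, WIDTH))   # a batch of 8 input representations
y = deep_mlp(params, x)
print(y.shape)                           # (8, 256)
```

The residual (skip) connection is the design choice doing the heavy lifting in this sketch: without it, plain MLPs of this depth are typically very hard to train.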