Harshit Sikchi (will be at NeurIPS 25)
@harshit_sikchi
Followers
2K
Following
2K
Media
46
Statuses
446
Research at @OpenAI; Reinforcement Learning; PhD from UT Austin. Previously FAIR Paris @AIatMeta, @CMU_Robotics @NVIDIAAI @UberATG.
San Francisco, CA
Joined July 2018
Check out GPT-5. Starting around two months ago, I was fortunate to get to contribute to something so fun!
0
0
14
This work builds upon the progress made in unsupervised RL in recent years: https://t.co/tLO6kVerQ8
https://t.co/86K368o6E3
https://t.co/FYNrp0WZ9J
https://t.co/dYxL1j0iCa
arxiv.org
Unsupervised zero-shot reinforcement learning (RL) has emerged as a powerful paradigm for pretraining behavioral foundation models (BFMs), enabling agents to solve a wide range of downstream tasks...
0
0
2
Check out the thread below for more details: https://t.co/7evCk5WBHr This is a collaboration with @agsidd10, @JajooPranaya, @parajuli_samyak, Caleb Chuck, @maxbrudolph, @PeterStone_TX, @yayitsamyzhang, @scottniekum.
🤖 Introducing RL Zero 🤖: a new approach to transform language into behavior zero-shot for embodied agents without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities in scaling up language-conditioned RL, providing an
1
0
1
Come see us at our poster session in San Diego: https://t.co/KGmDWSfjBm Fri 5 Dec, 4:30–7:30 p.m. PST. Want to quickly learn how it works? Check out the short talk here: Paper: https://t.co/DHGLiYuRt8 Talk: https://t.co/3suqt6tt15
1
0
3
Announcing RLZero for Generalist Agents at #NeurIPS2025. To our knowledge, the first to enable all of: 💬 Language → behavior (zero-shot) 🎥 Video → behavior (zero-shot, cross-embodiment) 🧠 One Behavioral Foundation Model for many tasks From instructions & demos to actions—no
2
4
37
One of the many things we reinvented and revived from RL; this one’s on policy distillation for LLM land
Hot take: DAgger (Ross 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio 2015). And before RL, study supervised learning thoroughly.
2
2
21
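For readers new to it, here is a minimal sketch of the DAgger loop (Ross et al., 2011) the tweet points to: roll out the current policy, have the expert label every visited state, aggregate those labels, and retrain. The `env`, `expert_policy`, and learner interfaces below are illustrative assumptions (old gym-style reset/step, discrete actions), not code from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # stand-in supervised learner

def dagger(env, expert_policy, iterations=10, episodes_per_iter=5):
    """Illustrative DAgger loop: roll out the learner, query the expert on every
    visited state, aggregate the labels, and retrain on the union."""
    states, actions = [], []
    learner = None
    for _ in range(iterations):
        for _ in range(episodes_per_iter):
            obs, done = env.reset(), False
            while not done:
                expert_act = expert_policy(obs)  # expert labels every visited state
                if learner is None:
                    act = expert_act             # iteration 0: follow the expert
                else:
                    act = int(learner.predict(np.asarray(obs).reshape(1, -1))[0])
                states.append(np.asarray(obs, dtype=np.float32))
                actions.append(expert_act)
                obs, _, done, _ = env.step(act)
        # Retrain on the aggregated dataset of (visited state, expert action) pairs.
        learner = KNeighborsClassifier(n_neighbors=3).fit(np.stack(states), np.array(actions))
    return learner
```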
I am in wait-and-watch mode on how good this is
0
0
12
Update: Mehtaab and I pushed further on this. Using thousands of GPT5 queries, we found solutions to 10 Erdős problems that were listed as open: 223, 339, 494, 515, 621, 822, 883 (part 2/2), 903, 1043, 1079. Additionally for 11 other problems, GPT5 found significant partial
gpt5-pro is superhuman at literature search: it just solved Erdős Problem #339 (listed as open in the official database https://t.co/3vCCCR0cXs) by realizing that it had actually been solved 20 years ago. h/t @MarkSellke for pointing this out to me!
42
92
932
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
2
39
133
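For context on the "distill" option in the tweet above: a common recipe under partial observability is to train a teacher policy on privileged simulator state and then regress a student that only sees noisy, partial observations onto the teacher's actions. A minimal hypothetical sketch (dimensions, networks, and data are stand-ins, not the paper's setup):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: the teacher sees the full privileged state,
# the student only a partial, noisy observation.
STATE_DIM, OBS_DIM, ACT_DIM = 48, 24, 12

teacher = nn.Sequential(nn.Linear(STATE_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))
student = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

def distill_step(priv_state, partial_obs):
    """One distillation step: the frozen teacher labels the batch from privileged
    state; the student learns to match those actions from partial observations."""
    with torch.no_grad():
        target_act = teacher(priv_state)
    loss = nn.functional.mse_loss(student(partial_obs), target_act)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Random stand-in batch; in practice the pairs come from teacher rollouts.
loss = distill_step(torch.randn(64, STATE_DIM), torch.randn(64, OBS_DIM))
```

The trade-off the tweet studies is exactly this: the distilled student inherits the privileged teacher's competence but may be limited by what the partial observations can support, whereas end-to-end RL optimizes directly under partial observability.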
Even with cool ideas, researchers often overlook how important implementation details can be. Getting them right can be key to scaling up deep RL.
(1/n) With over 1,300 citations, MBPO is often cited as proof that model based RL beats model free methods. In https://t.co/xq3WXslh67 we showed it often completely fails in DeepMind Control. In our new work, Fixing That Free Lunch (FTFL), we explain why and make it succeed.
0
0
20
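For readers unfamiliar with MBPO (Janner et al., 2019), the core recipe is: fit a dynamics model on real transitions, branch short rollouts from real states using that model, and train the policy largely on the model-generated data, keeping rollouts short to limit compounding model error. A schematic sketch with assumed interfaces (`env`, `model`, `agent`, and the replay buffers are placeholders, not real APIs):

```python
def mbpo_iteration(env, model, agent, real_buffer, model_buffer,
                   rollout_length=1, n_rollouts=400, env_steps=1000):
    """One schematic MBPO iteration (all interfaces assumed for illustration)."""
    # 1) Collect real environment transitions with the current policy.
    obs = env.reset()
    for _ in range(env_steps):
        act = agent.act(obs)
        next_obs, rew, done, _ = env.step(act)
        real_buffer.add(obs, act, rew, next_obs, done)
        obs = env.reset() if done else next_obs

    # 2) Fit the dynamics model on all real data collected so far.
    model.fit(real_buffer)

    # 3) Branch short model rollouts from states sampled out of the real buffer.
    #    Short rollouts keep compounding model error in check.
    for _ in range(n_rollouts):
        s = real_buffer.sample_state()
        for _ in range(rollout_length):
            a = agent.act(s)
            s_next, r, d = model.step(s, a)
            model_buffer.add(s, a, r, s_next, d)
            if d:
                break
            s = s_next

    # 4) Policy and critic updates draw mostly from the model-generated buffer.
    for _ in range(20):
        agent.update(model_buffer.sample_batch())
```

The failures and fixes the tweet refers to live inside these pieces (model training, rollout scheduling, and how the buffers are mixed), which is exactly the kind of implementation detail the quote above is about.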
SF really does summer in October
0
1
17
We're finally out of stealth: https://t.co/mRieBSLG0j We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, take a look 😀
percepta.ai
Transforming critical institutions using applied AI. Let's harness the frontier.
15
14
99
Yet more evidence that a pretty major shift is happening, this time by Scott Aaronson https://t.co/R1kPhCWhwD
125
454
4K
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. https://t.co/uKPPDldVNS
61
190
1K
RLZero will be presented at @NeurIPSConf 2025. Learn more about the work in the thread below:
🤖 Introducing RL Zero 🤖: a new approach to transform language into behavior zero-shot for embodied agents without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities in scaling up language-conditioned RL, providing an
4
7
55
A good way to test generalizable capability in the current world of potentially contaminated datasets is competitions, and we are making steady progress!
1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have
0
0
9
[1/4] 🚀 We’re excited to announce the v1 release of JaxAHT – a new library for Ad Hoc Teamwork (AHT) research, built with JAX for speed & scalability! Check it out 👉 https://t.co/Vmpbm72YwS
#AI #MARL #ReinforcementLearning #JAX #AdHocTeamwork
1
7
36
LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity. Why does this collapse happen, and how can we fix it? Our new work introduces: 🔍 RL as Sampling (analysis) 🗺️ Outcome-based Exploration (intervention) [1/n]
9
88
467
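One way to see why diversity matters for test-time scaling: if the k samples drawn for a problem are highly correlated, pass@k barely improves over pass@1. A toy simulation (purely illustrative, not from the paper):

```python
import random

def pass_at_k(per_sample_success, n_problems=2000, k=16, diversity=1.0, seed=0):
    """Toy estimate of pass@k when samples are imperfectly independent.
    `diversity` in [0, 1]: 1.0 = fully independent samples per problem,
    0.0 = all k samples collapse onto a single draw (no benefit from k)."""
    rng = random.Random(seed)
    solved = 0
    for _ in range(n_problems):
        first = rng.random() < per_sample_success
        hit = first
        for _ in range(k - 1):
            # With probability `diversity`, draw a fresh sample; otherwise repeat the first.
            if rng.random() < diversity:
                hit = hit or (rng.random() < per_sample_success)
            else:
                hit = hit or first
        solved += hit
    return solved / n_problems

# Same single-sample accuracy, very different pass@16 depending on diversity.
print(pass_at_k(0.2, diversity=1.0))  # roughly 1 - 0.8**16, close to 0.97
print(pass_at_k(0.2, diversity=0.2))  # noticeably lower when samples collapse
```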
#K2Think (🏔️💭) is now live. We're proud of this model, which punches well above its weight: developed primarily for mathematical reasoning, it has shown itself to be quite versatile. It is fully deployed as a reasoning system at https://t.co/3QVlEE9MfQ, so you can test it for yourself!
k2think.ai
K2 Think - Advanced Reasoning Model
Introducing K2 Think - a breakthrough in advanced AI reasoning. Developed by MBZUAI’s Institute of Foundation Models and @G42ai, K2 Think delivers frontier reasoning performance at a fraction of the size of today’s largest systems. Smaller. Smarter. Open to the world.
13
20
117