Harshit Sikchi (will be at NeurIPS 25)

@harshit_sikchi

Followers: 2K
Following: 2K
Media: 46
Statuses: 446

Research at @OpenAI; Reinforcement Learning; PhD from UT Austin. Previously FAIR Paris @AIatMeta, @CMU_Robotics @NVIDIAAI @UberATG.

San Francisco, CA
Joined July 2018
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
3 months
Check out GPT-5. Starting around two months ago now, was fortunate to get to contribute to something so fun!
@OpenAI
OpenAI
3 months
GPT-5 is here. Rolling out to everyone starting today. https://t.co/rOcZ8J2btI
0
0
14
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
2 days
Check out the thread below for more details: https://t.co/7evCk5WBHr This is a collaboration with @agsidd10, @JajooPranaya, @parajuli_samyak, Caleb Chuck, @maxbrudolph, @PeterStone_TX, @yayitsamyzhang, and @scottniekum.
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
11 months
🤖 Introducing RL Zero 🤖: a new approach to transform language into behavior zero-shot for embodied agents without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities in scaling up language-conditioned RL, providing an
1
0
1
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
2 days
Come see us at our poster session in San Diego: https://t.co/KGmDWSfjBm Fri 5 Dec, 4:30 p.m. to 7:30 p.m. PST. Want to quickly learn how it works? Check out the short talk. Paper: https://t.co/DHGLiYuRt8 Talk: https://t.co/3suqt6tt15
1
0
3
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
2 days
Announcing RLZero for Generalist Agents at #NeurIPS2025. To our knowledge, the first to enable all of: 💬 Language → behavior (zero-shot) 🎥 Video → behavior (zero-shot, cross-embodiment) 🧠 One Behavioral Foundation Model for many tasks From instructions & demos to actions—no
2
4
37
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
13 days
One of the many things we reinvented and revived from RL; this one’s on policy distillation for LLM land
@shaneguML
Shane Gu
13 days
Hot take: DAgger (Ross 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio 2015). And before RL, study supervised learning thoroughly.
2
2
21
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
16 days
I am in wait-and-watch mode on how good this is
@1x_tech
1X
16 days
NEO The Home Robot Order Today
0
0
12
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
22 days
Absolutely insane; these are some amazing people
@cuijiaxun
Jiaxun Cui 🐿️
22 days
Meta has gone crazy on the squid game! Many new PhD NGs are deactivated today (I am also impacted🥲 happy to chat)
1
0
20
@MarkSellke
Mark Sellke
27 days
Update: Mehtaab and I pushed further on this. Using thousands of GPT5 queries, we found solutions to 10 Erdős problems that were listed as open: 223, 339, 494, 515, 621, 822, 883 (part 2/2), 903, 1043, 1079. Additionally for 11 other problems, GPT5 found significant partial
@SebastienBubeck
Sebastien Bubeck
1 month
gpt5-pro is superhuman at literature search: it just solved Erdos Problem #339 (listed as open in the official database https://t.co/3vCCCR0cXs) by realizing that it had actually been solved 20 years ago h/t @MarkSellke for pointing this out to me!
42
92
932
@yus167
Yuda Song
30 days
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
2
39
133
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
1 month
Even with cool ideas, researchers often overlook how important implementation details can be. Getting these things right can be key to scaling up deep RL
@bebark99
Brett Barkley
1 month
(1/n) With over 1,300 citations, MBPO is often cited as proof that model based RL beats model free methods. In https://t.co/xq3WXslh67 we showed it often completely fails in DeepMind Control. In our new work, Fixing That Free Lunch (FTFL), we explain why and make it succeed.
0
0
20
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
1 month
SF really does summer in October
0
1
17
@EugeneVinitsky
Eugene Vinitsky 🦋
1 month
We're finally out of stealth: https://t.co/mRieBSLG0j We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, take a look 😀
percepta.ai
Transforming critical institutions using applied AI. Let's harness the frontier.
15
14
99
@SebastienBubeck
Sebastien Bubeck
2 months
Yet more evidence that a pretty major shift is happening, this time by Scott Aaronson https://t.co/R1kPhCWhwD
125
454
4K
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
@OpenAI
OpenAI
2 months
Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. https://t.co/uKPPDldVNS
61
190
1K
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
2 months
RLZero will be presented at @NeurIPSConf 2025. Learn more about the work in the thread below:
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
11 months
🤖 Introducing RL Zero 🤖: a new approach to transform language into behavior zero-shot for embodied agents without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities in scaling up language-conditioned RL, providing an
4
7
55
@harshit_sikchi
Harshit Sikchi (will be at NeurIPS 25)
2 months
In the current world of potentially contaminated datasets, competitions are a good way to test generalizable capability, and we are making steady progress!
@MostafaRohani
Mostafa Rohaninejad
2 months
1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have
0
0
9
@CarolineWang98
Caroline Wang
2 months
[1/4] 🚀 We’re excited to announce the v1 release of JaxAHT – a new library for Ad Hoc Teamwork (AHT) research, built with JAX for speed & scalability! Check it out 👉 https://t.co/Vmpbm72YwS #AI #MARL #ReinforcementLearning #JAX #AdHocTeamwork
1
7
36
@yus167
Yuda Song
2 months
LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity. Why does this collapse happen, and how can we fix it? Our new work introduces: 🔍 RL as Sampling (analysis) 🗺️ Outcome-based Exploration (intervention) [1/n]
9
88
467
@tw_killian
Taylor W. Killian
2 months
#K2Think (🏔️💭) is now live. We're proud of this model that punches well above its weights, developed primarily for mathematical reasoning but has shown itself to be quite versatile. As a fully deployed reasoning system at https://t.co/3QVlEE9MfQ you can test it for yourself!
k2think.ai
K2 Think - Advanced Reasoning Model
@mbzuai
MBZUAI
2 months
Introducing K2 Think - a breakthrough in advanced AI reasoning. Developed by MBZUAI’s Institute of Foundation Models and @G42ai, K2 Think delivers frontier reasoning performance at a fraction of the size of today’s largest systems. Smaller. Smarter. Open to the world.
13
20
117