Deepak Pathak @pathak2206 X Profile

Deepak Pathak

@pathak2206

Followers

22K

Following

1K

Media

157

Statuses

624

Co-Founder & CEO at @SkildAI, Faculty at @CarnegieMellon. PhD @UCBerkeley. I study topics in AI (machine learning, robotics & computer vision).

Pittsburgh, PA

Joined May 2013


Deepak Pathak

@pathak2206

2 years

Even after 4yrs of locomotion research, we keep getting surprised by how far we can push the limits of legged robots! We report a major update 🚀🤖. Extreme Parkour: extremely long & high jumps, ramp, handstand, etc. all with a single neural net! 🧵(1/n)

29

232

1K

Deepak Pathak

@pathak2206

3 hours

RT @_tonytao_: Want to add diverse, high-quality data to your robot policy? Happy to share that the DexWild Dataset is now fully public, h….

0

6

0

Deepak Pathak

@pathak2206

8 days

RT @TheHumanoidHub: Got to visit the Robotics Institute at CMU today. The institute has a long legacy of pioneering research and pushing t….

0

22

0

Deepak Pathak

@pathak2206

8 days

A great example of scientific discourse at its best—thoughtful, constructive, and conclusive. We now have more rigorous evidence that confidence maximization improves reasoning. 👇

Mihir Prabhudesai

@mihirp98

9 days

1/ Maximizing confidence indeed improves reasoning. We worked with @ShashwatGoel7, @nikhilchandak29 @AmyPrb for the past 3 weeks (over a zoom call and many emails!) and revised our evaluations to align with their suggested prompts/parsers/sampling params. This includes changing

1

20

Deepak Pathak

@pathak2206

9 days

RT @ShashwatGoel7: Glad we could together improve the scientific discourse around reasoning. Was great to see the authors reach out and inc….

0

4

0

Deepak Pathak

@pathak2206

16 days

Congratulations to the team. Great start at RSS!! We have open-sourced DexWild -- makes it easy to build and scale robot learning with hands:

Tony Tao

@_tonytao_

18 days

Thrilled to have received Best Paper Award at the EgoAct Workshop at RSS 2025! 🏆 We’ll also be giving a talk at the Imitation Learning Session I tomorrow, 5:30–6:30pm. Come to learn about DexWild! Work co-led by @mohansrirama, with @JasonJZLiu, @kenny__shaw, and @pathak2206.

1

4

66

Deepak Pathak

@pathak2206

17 days

RT @JasonJZLiu: Presenting FACTR today at #RSS2025 in the Imitation Learning I session at 5:30pm (June 22). Come by if you're interested in….

0

12

0

Deepak Pathak

@pathak2206

21 days

Tired of tuning PPO or blaming it on reward, task design, etc.? Introducing EPO -- our second (and hopefully final :) attempt at fixing PPO at scale! Contrary to intuition, as the batch size or data increases, PPO saturates due to a lack of diversity in sampling. We proposed a.

Jianren Wang

@wang_jianren

22 days

(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.

2

8

101

Deepak Pathak

@pathak2206

28 days

RT @stevenl: I’m thrilled to announce the launch of my $40M pre-seed and seed-stage fund, @SevenStars_VC, where I’ll be focused on partneri….

0

41

0

Deepak Pathak

@pathak2206

1 month

Also, check out wonderful concurrent work (came out yesterday) from our friends at Berkeley @xuandongzhao @dawnsongtweets and team -- similar ideas but experiments are complementary, nuanced findings in both:

Xuandong Zhao

@xuandongzhao

1 month

🚀 Excited to share the most inspiring work I’ve been part of this year: "Learning to Reason without External Rewards". TL;DR: We show that LLMs can learn complex reasoning without access to ground-truth answers, simply by optimizing their own internal sense of confidence. 1/n

1

0

7

Deepak Pathak

@pathak2206

1 month

Maximizing Confidence Alone Improves Reasoning. Feels like the start of the "curiosity-driven learning" era for LLMs. I have spent most of my career towards building agents that can self-improve without any external rewards (e.g., curiosity work during PhD and then at CMU).

Mihir Prabhudesai

@mihirp98

1 month

Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer keys aren’t available (e.g., taking an exam). Surprisingly, LLMs can also learn w/o ground-truth answers, simply by reinforcing high-confidence answers via RL!

4

12

77

Deepak Pathak

@pathak2206

1 month

RT @mihirp98: Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer key….

0

37

0

Deepak Pathak

@pathak2206

2 months

Congratulations, Dr. Murtaza! 🥳

Murtaza Dalal

@mihdalal

2 months

Incredibly excited to share that I am now officially Dr. Murtaza Dalal! Last weekend marked the official end of an incredible journey across the last 5 years, including doing the first year of my PhD remote, moving to the other side of the country, becoming an independent

1

45

Deepak Pathak

@pathak2206

2 months

RT @khoomeik: someone should probably retry all those late 2010s deep RL ideas to see if they work on LLMs

0

149

0

Deepak Pathak

@pathak2206

2 months

RT @mohansrirama: Maybe real-world robot generalization doesn’t need massive teleop datasets? 🤔 In DexWild, we show that human demos 🙌 + a….

0

12

0

Deepak Pathak

@pathak2206

2 months

Inspired by amazing UMI work from @SongShuran's lab!

0

7

Deepak Pathak

@pathak2206

2 months

Introducing DexWild -- a scalable approach to diverse "in the wild" data collection for dexterous robotic hands! This data can be used to co-train policy for any downstream robotic hands on any body form factor (humanoids, AMR with arms, etc). 🚀🤖

Tony Tao

@_tonytao_

2 months

Training robots for the open world needs diverse data. But collecting robot demos in the wild is hard! Presenting DexWild. 🙌🏕️ Human data collection system that works in diverse environments, without robots. 💪🦾 Human + Robot Cotraining pipeline that unlocks generalization. 🧵👇

3

10

69

Deepak Pathak

@pathak2206

2 months

RT @kenny__shaw: Exciting to see (at 5:55) Nvidia adopting LEAP Hand in their sim2real efforts! Build your own at .

0

8

0

Deepak Pathak

@pathak2206

2 months

LEAP Hand controlled by DOGlove. Extremely low-cost dexterity!! Very cool.

Ilir Aliu - eu/acc

@IlirAliu_

2 months

What if anyone could build a high-quality haptic glove for robot control… in just a weekend? [github & arXiv ⬇️]. This team did it. DOGlove is a fully open-source glove that brings precision, force feedback, and dexterity to robot teleoperation; All for under $600. ✅ Tracks

1

5

54

Deepak Pathak

@pathak2206

2 months

RT @alexlioralexli: Excited to be presenting at #ICLR2025 at 10am today on how generative classifiers are much more robust to distribution….

0

6

0

Deepak Pathak

@pathak2206

3 months

Very excited about this direction -- a unified discrete diffusion model for joint text & image generation. Unlike popular autoregressive multimodal approaches, unified diffusion framework unlocks faster inference, better control via guidance, flexible compute-quality tradeoff,

Mihir Prabhudesai

@mihirp98

3 months

1/ Happy to share UniDisc - Unified Multimodal Discrete Diffusion – We train a 1.5 billion parameter transformer model from scratch on 250 million image/caption pairs using a **discrete diffusion objective**. Our model has all the benefits of diffusion models but now in

3

18

169