Deepak Pathak Profile
Deepak Pathak

@pathak2206

Followers
22K
Following
1K
Media
157
Statuses
624

Co-Founder & CEO at @SkildAI, Faculty at @CarnegieMellon. PhD @UCBerkeley. I study topics in AI (machine learning, robotics & computer vision).

Pittsburgh, PA
Joined May 2013
@pathak2206
Deepak Pathak
2 years
Even after 4yrs of locomotion research, we keep getting surprised by how far we can push the limits of legged robots! We report a major update 🚀🤖. Extreme Parkour: extremely long & high jumps, ramp, handstand, etc. all with a single neural net! 🧵(1/n)
29
232
1K
@pathak2206
Deepak Pathak
3 hours
RT @_tonytao_: Want to add diverse, high-quality data to your robot policy? Happy to share that the DexWild Dataset is now fully public, h…
0
6
0
@pathak2206
Deepak Pathak
8 days
RT @TheHumanoidHub: Got to visit the Robotics Institute at CMU today. The institute has a long legacy of pioneering research and pushing t…
0
22
0
@pathak2206
Deepak Pathak
8 days
A great example of scientific discourse at its best: thoughtful, constructive, and conclusive. We now have more rigorous evidence that confidence maximization improves reasoning. 👇
@mihirp98
Mihir Prabhudesai
9 days
1/ Maximizing confidence indeed improves reasoning. We worked with @ShashwatGoel7, @nikhilchandak29 @AmyPrb for the past 3 weeks (over a zoom call and many emails!) and revised our evaluations to align with their suggested prompts/parsers/sampling params. This includes changing
1
1
20
@pathak2206
Deepak Pathak
9 days
RT @ShashwatGoel7: Glad we could together improve the scientific discourse around reasoning. Was great to see the authors reach out and inc…
0
4
0
@pathak2206
Deepak Pathak
16 days
Congratulations to the team, great start at RSS!! We have open-sourced DexWild -- makes it easy to build and scale robot learning with hands:
@_tonytao_
Tony Tao
18 days
Thrilled to have received Best Paper Award at the EgoAct Workshop at RSS 2025! 🏆 We’ll also be giving a talk at the Imitation Learning Session I tomorrow, 5:30–6:30pm. Come to learn about DexWild! Work co-led by @mohansrirama, with @JasonJZLiu, @kenny__shaw, and @pathak2206.
1
4
66
@pathak2206
Deepak Pathak
17 days
RT @JasonJZLiu: Presenting FACTR today at #RSS2025 in the Imitation Learning I session at 5:30pm (June 22). Come by if you're interested in…
0
12
0
@pathak2206
Deepak Pathak
21 days
Tired of tuning PPO or blaming it on reward, task design, etc.? Introducing EPO -- our second (and hopefully final :) attempt at fixing PPO at scale! Contrary to intuition, as the batch size or data increases, PPO saturates due to a lack of diversity in sampling. We proposed a
@wang_jianren
Jianren Wang
22 days
(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.
2
8
101
@pathak2206
Deepak Pathak
28 days
RT @stevenl: I’m thrilled to announce the launch of my $40M pre-seed and seed-stage fund, @SevenStars_VC, where I’ll be focused on partneri…
0
41
0
@pathak2206
Deepak Pathak
1 month
Also, check out wonderful concurrent work (came out yesterday) from our friends at Berkeley @xuandongzhao @dawnsongtweets and team -- similar ideas but experiments are complementary, nuanced findings in both:
@xuandongzhao
Xuandong Zhao
1 month
🚀 Excited to share the most inspiring work I’ve been part of this year: "Learning to Reason without External Rewards". TL;DR: We show that LLMs can learn complex reasoning without access to ground-truth answers, simply by optimizing their own internal sense of confidence. 1/n
1
0
7
@pathak2206
Deepak Pathak
1 month
Maximizing Confidence Alone Improves Reasoning. Feels like the start of the "curiosity-driven learning" era for LLMs. I have spent most of my career building agents that can self-improve without any external rewards (e.g., curiosity work during my PhD and then at CMU).
@mihirp98
Mihir Prabhudesai
1 month
Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer keys aren’t available (e.g., taking an exam). Surprisingly, LLMs can also learn w/o ground-truth answers, simply by reinforcing high-confidence answers via RL!
4
12
77
@pathak2206
Deepak Pathak
1 month
RT @mihirp98: Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer key…
0
37
0
@pathak2206
Deepak Pathak
2 months
Congratulations, Dr. Murtaza! 🥳
@mihdalal
Murtaza Dalal
2 months
Incredibly excited to share that I am now officially Dr. Murtaza Dalal! Last weekend marked the official end of an incredible journey across the last 5 years, including doing the first year of my PhD remote, moving to the other side of the country, becoming an independent
1
1
45
@pathak2206
Deepak Pathak
2 months
RT @khoomeik: someone should probably retry all those late 2010s deep RL ideas to see if they work on LLMs
0
149
0
@pathak2206
Deepak Pathak
2 months
RT @mohansrirama: Maybe real-world robot generalization doesn’t need massive teleop datasets? 🤔 In DexWild, we show that human demos 🙌 + a…
0
12
0
@pathak2206
Deepak Pathak
2 months
Inspired by amazing UMI work from @SongShuran's lab!
0
0
7
@pathak2206
Deepak Pathak
2 months
Introducing DexWild -- a scalable approach to diverse "in the wild" data collection for dexterous robotic hands! This data can be used to co-train policies for any downstream robotic hand on any body form factor (humanoids, AMRs with arms, etc.). 🚀🤖
@_tonytao_
Tony Tao
2 months
Training robots for the open world needs diverse data. But collecting robot demos in the wild is hard! Presenting DexWild. 🙌🏕️ Human data collection system that works in diverse environments, without robots. 💪🦾 Human + Robot Cotraining pipeline that unlocks generalization. 🧵👇
3
10
69
@pathak2206
Deepak Pathak
2 months
RT @kenny__shaw: Exciting to see (at 5:55) Nvidia adopting LEAP Hand in their sim2real efforts! Build your own at .
0
8
0
@pathak2206
Deepak Pathak
2 months
LEAP Hand controlled by DOGlove. Extremely low-cost dexterity!! Very cool.
@IlirAliu_
Ilir Aliu - eu/acc
2 months
What if anyone could build a high-quality haptic glove for robot control… in just a weekend? [github & arXiv ⬇️]. This team did it. DOGlove is a fully open-source glove that brings precision, force feedback, and dexterity to robot teleoperation. All for under $600. ✅ Tracks
1
5
54
@pathak2206
Deepak Pathak
2 months
RT @alexlioralexli: Excited to be presenting at #ICLR2025 at 10am today on how generative classifiers are much more robust to distribution…
0
6
0
@pathak2206
Deepak Pathak
3 months
Very excited about this direction -- a unified discrete diffusion model for joint text & image generation. Unlike popular autoregressive multimodal approaches, a unified diffusion framework unlocks faster inference, better control via guidance, flexible compute-quality tradeoff,
@mihirp98
Mihir Prabhudesai
3 months
1/ Happy to share UniDisc - Unified Multimodal Discrete Diffusion – We train a 1.5 billion parameter transformer model from scratch on 250 million image/caption pairs using a **discrete diffusion objective**. Our model has all the benefits of diffusion models but now in
3
18
169