Jianren Wang
@wang_jianren
Student of Nature | PhD @CMU_Robotics | Founding Researcher @SkildAI
Pittsburgh, PA · Joined July 2020
1K Followers · 104 Following · 8 Media · 151 Statuses
(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.
Haozhi has set an incredibly high standard for dexterous manipulation and robot learning. His work has been an inspiration to many, myself included. Congratulations—this is very well deserved!
I will join UChicago CS @UChicagoCS as an Assistant Professor in late 2026, and I’m recruiting PhD students in this cycle (2025 - 2026). My research focuses on AI & Robotics - including dexterous manipulation, humanoids, tactile sensing, learning from human videos, robot
Congratulations on the great work! I hope we can all begin advocating for open-progress research.
Zero teleoperation. Zero real-world data. ➔ Autonomous humanoid loco-manipulation in reality. Introducing VIRAL: Visual Sim-to-Real at Scale. We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe: 1. RL 2. Simulation 3. GPUs Website:
A key step for intelligence is the ability to automate its own training pipeline. Together with @NVIDIA, Skild AI is making significant moves toward this self-reproduction loop.
Proud to partner with NVIDIA and Foxconn to automate @NVIDIA's Houston GPU Factory and drive the next era of American manufacturing!
Huge congratulations! It's absolutely well-deserved!
Incredible news. Neural MP has won the Best Student Paper award at IROS 2025!! Congratulations to @mihdalal & @Jiahui_Yang6709 for leading the project along with @mendonca_rl, youssef, @rsalakhu. Neural MP is a major step in making motion planning end-to-end, fast & reactive.
A huge congratulations, friend! I always enjoy reading your papers. If you're applying for a PhD in robotics and vision, I can't recommend working with Professor Bharadhwaj enough. You'll find it a truly rewarding and enjoyable experience!
I'll be joining the faculty @JohnsHopkins late next year as a tenure-track assistant professor in @JHUCompSci Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!
This is a strong case for "any robot, one brain." I once held that such generalizability did not exist in nature, until they reminded me of the caterpillar. Every day, I feel proud of what we are building here. To the god of AGI: I bow to my robot each day, so the person isn't me 😜.
We built a robot brain that nothing can stop. Shattered limbs? Jammed motors? If the bot can move, the Brain will move it — even if it's an entirely new robot body. Meet the omni-bodied Skild Brain:
In nature, animals learn to perceive their environment before mastering complex locomotion. @SkildAI’s robot brain confirms this principle: with Pix2Loco, we achieved arguably the most robust locomotion skill to date. What comes next👀?
We’ve all seen humanoid robots doing backflips and dance routines for years. But if you ask them to climb a few stairs in the real world, they stumble! We took our robot on a walk around town to environments that it hadn’t seen before. Here’s how it works🧵⬇️
Thrilled to be part of this incredible journey with such an amazing team working toward AGI. Excited for what’s to come!
Modern AI is confined to the digital world. At Skild AI, we are building towards AGI for the real world, unconstrained by robot type or task — a single, omni-bodied brain. Today, we are sharing our journey, starting with early milestones, with more to come in the weeks ahead.
Three years ago, when we began exploring learning from video, most tasks were just pick-and-place. With PSAG, we enabled one-shot learning of deformable object manipulation from YouTube. Now, this paper pushes it further, tackling a wider range of tasks via visual FMs without demos!
Research arc: ⏪ 2 yrs ago, we introduced VRB: learning from hours of human videos to cut down teleop (Gibson🙏) ▶️ Today, we explore a wilder path: robots deployed with no teleop, no human demos, no affordances. Just raw video generation magic 🙏 Day 1 of faculty life done! 😉
Great to see it launched! Congrats to @Vikashplus and team!
All forms of intelligence co-emerged with a body, except AI We're building a #future where AI evolves as your lifelike digital twin to assist your needs across health, sports, daily life, creativity, & beyond... https://t.co/QL3o9YxZYz ➡️ Preview your first #HumanEmbodiedAI
Tactile sensing is gaining traction, but slowly. Why? Because integration remains difficult. But what if adding touch sensors to your robot was as easy as hitting “print”? Introducing eFlesh: a 3D-printable, customizable tactile sensor. Shape it. Size it. Print it. 🧶👇
(8/n) We would also like to thank Jayesh (@SinglaJayesh) and Ananye (@anag004), the authors of SAPG, for their invaluable help throughout the paper.
Tired of tuning PPO or blaming it on reward, task design, etc.? Introducing EPO -- our second (and hopefully final :) attempt at fixing PPO at scale! Contrary to intuition, as the batch size or data increases, PPO saturates due to a lack of diversity in sampling. We proposed a
The paper's directional selection step is a genetic algorithm, but it is activated only when there is a sufficient performance difference. Likewise, evolutionary theory states that selection can only act on existing variation within a population: if all agents perform
(3/n) Introducing Evolutionary Policy Optimization (EPO): a hybrid policy optimization method that integrates genetic algorithms. A master agent learns stably and efficiently from pooled experience across a population of agents.
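To make the idea concrete, here is a minimal toy sketch of the loop described above: a population of agents, a master that learns from pooled experience, and genetic selection that fires only when the performance spread is large enough. Everything here is a hypothetical stand-in (scalar "policies", a quadratic fitness, the step sizes), not the paper's actual implementation.

```python
import random

def evolutionary_policy_optimization(pop_size=8, generations=20,
                                     gap_threshold=0.5, seed=0):
    """Toy EPO-style loop: population of agents + master + gated selection."""
    rng = random.Random(seed)
    # Each "agent" is one scalar parameter; fitness peaks at x = 3.
    fitness = lambda x: -(x - 3.0) ** 2
    population = [rng.uniform(-5, 5) for _ in range(pop_size)]
    master = 0.0
    for _ in range(generations):
        scores = [fitness(x) for x in population]
        best = max(range(pop_size), key=lambda i: scores[i])
        worst = min(range(pop_size), key=lambda i: scores[i])
        # Master learns from the pooled population (here: move toward the best agent).
        master += 0.5 * (population[best] - master)
        # Selection acts only when there is enough variation in performance.
        if scores[best] - scores[worst] > gap_threshold:
            population[worst] = population[best] + rng.gauss(0, 0.3)  # mutated copy of best
        # Each agent takes its own noisy exploration step.
        population = [x + rng.gauss(0, 0.1) for x in population]
    return master

print(evolutionary_policy_optimization())
```

The gating on `gap_threshold` mirrors the point in the reply above: selection only operates when the population actually shows performance variation.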
Thank you for all your help throughout the paper! It all started with SAPG. I look forward to seeing more work on improving the scalability and efficiency of RL.
PPO is often frustrating to tune for many continuous control tasks since it keeps getting stuck in local minima. In our SAPG paper ( https://t.co/ZD4Ds1xOJC), we showed how training multiple followers with PPO and combining their data can mitigate this issue. In EPO,
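A rough sketch of the data-pooling idea SAPG describes: several followers explore with different parameters, and their transitions are merged into a single training batch for the leader. The function names and the toy rollout below are hypothetical illustrations, not the paper's code.

```python
import random

def collect_rollout(policy_param, env_seed, steps=5):
    """Hypothetical follower rollout: returns (state, action) pairs."""
    rng = random.Random(env_seed)
    return [(rng.random(), policy_param + rng.gauss(0, 0.1)) for _ in range(steps)]

def pooled_batch(follower_params, base_seed=0):
    """SAPG-style pooling: combine every follower's data into one batch."""
    batch = []
    for i, p in enumerate(follower_params):
        batch.extend(collect_rollout(p, base_seed + i))
    return batch

batch = pooled_batch([0.0, 0.5, 1.0])
print(len(batch))  # → 15 (3 followers × 5 steps)
```

Because each follower starts from a different parameter, the pooled batch covers more of the action space than any single PPO run, which is the diversity argument made above.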