
Embodied AI Reading Notes
@EmbodiedAIRead
Sharing daily personal notes on selected interesting Embodied AI papers, blogs and talks | Maintained by @yilun_chen_ | Opinions are my own.
California, USA
Joined July 2025
Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots. Project: Paper: Code: Tutorial: New open-source Camera Depth Models (CDM) for depth cameras to enable
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control. Project: Paper: Code: New open-source unified 3B embodied foundation model that enables perception, planning,
FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control. Project: Paper: Code: A high-performance variant of the TD3 algorithm, optimized for humanoid tasks, from Pieter Abbeel’s
HITTER: A HumanoId Table TEnnis Robot via Hierarchical Planning and Learning. Project: Paper: This project shows amazing videos of a Unitree G1 playing table tennis against a human with agile, fluid motion for 106 consecutive
RICL: Adding In-Context Adaptability to Pre-Trained Vision-Language-Action Models. Project: Paper: This project adapts the popular in-context learning (ICL) and retrieval-augmented generation (RAG) ideas from LLMs and successfully
Advancing the Frontier of Silicon Intelligence: the Past, Open Problems, and the Future. YouTube: A recent lecture given at Columbia University by Shuchao Bi, formerly a lead researcher at OpenAI and now at Meta Superintelligence. Having the unique experience of a math PhD.
Neural Robot Dynamics. Project: Paper: An interesting new work on learning robot-specific dynamics models that predict future states of a robot modeled as an articulated rigid body, which can serve as a replacement for low-level dynamics and
Masquerade: Learning from In-the-wild Human Videos using Data-Editing. Project: Paper: A simple yet effective way to improve pre-training of robot policies from egocentric human videos: edit the video input by replacing the human with a robot
Large Behavior Models and Atlas Find New Footing. Blog: Amazing new loco-manipulation demos on Atlas from the Boston Dynamics and Toyota Research Institute (TRI) collaboration. The blog shows some interesting videos covering a wide range of new capabilities, featured by one
Video Generators are Robot Policies. Project: Paper: Video is an abundant and scalable data source for robot learning, but it’s hard to use because it lacks action information. This project proposes a clever way to leverage
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion. Project: Paper: This project shows videos of amazingly agile and versatile humanoid whole-body motion that surpasses prior work, so it’s worthwhile to
DinoV3. Website: Code: Models: Paper: DinoV3 is released publicly today! Nice performance upgrade as shown below. Major updates from DinoV2: - DinoV2 1.1B parameters -> DinoV3 7B
Understanding Multimodal LLMs. Blog: A nice blog covering the fundamentals of multimodal LLMs. It’s important to understand how multimodal LLMs work, as they are closely related to, and often serve as a major building block in, modern embodied AI model
MolmoAct: Action Reasoning Models that can Reason in Space. Blog: Paper: New paper from Ai2 on a new kind of robot foundation model, which they call an Action Reasoning Model. - The new architecture extends a typical VLM by
Evaluating Pi0 in the Wild: Strengths, Problems, and the Future of Generalist Robot Policies. Blog: An interesting “vibe check” of the Pi0 policy in a kitchen-like environment. It’s interesting to see examples of what works well and what doesn’t work out
The key difference here is that they used, at scale, a video prediction model (also called a world model) in place of the VLM that recent VLAs often use as the pre-trained robot foundation model backbone. One could argue this shifts towards being more "vision-centric" rather than
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation. Project: Paper: New paper from AgiBot on world models for manipulation. - GE-Base: a video diffusion model trained on 3,000 hours, over 1 million
Real-Time Execution of Action Chunking Flow Policies. Paper: A follow-up blog from Cobot that further investigates the hyperparameters used in RTC: The latest research paper from Physical Intelligence trying to tackle the
The Emerging Humanoid Motor Cortex: An Inventory of RL-Trained Controllers. Blog: Whole Body Controllers comparison table: A great blog by Alan Fern reviewing the current state of whole-body controllers (WBC) for humanoids. Highly