
Chelsea Finn
@chelseabfinn
80K Followers · 1K Following · 255 Media · 606 Statuses
Asst Prof of CS & EE @Stanford. Co-founder of Physical Intelligence @physical_int. PhD from @Berkeley_EECS, EECS BS from @MIT.
Palo Alto, CA
Joined June 2014
The robot can autonomously perform a real gallbladder removal subroutine!
- Successfully completed the procedure on all 8 of 8 held-out gallbladders.
- Uses the same algorithm that we used to train robots to make trail mix, using a language hierarchy.
Paper + videos:
Introducing Hierarchical Surgical Robot Transformer (SRT-H), a language-guided policy for autonomous surgery🤖🏥. On the da Vinci robot, we perform a real surgical procedure on animal tissue. Collaboration b/w @JohnsHopkins & @Stanford
We still lack a scalable recipe for RL post-training seeded with demonstration data. Many methods add an imitation loss, but this constrains learning too much. We propose to use the demos only to perturb exploration -- it works really well! Paper:
RL often struggles with poor sample efficiency, even with expert data. How can we address this? One approach is to incorporate an imitation loss, but that can overconstrain the policy. We propose leveraging prior data implicitly to guide more effective exploration. (1/5)
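The thread above can be sketched in a few lines. This is a hypothetical illustration of "use the demos only to perturb exploration" (not the paper's actual algorithm), assuming a 1-D continuous action for simplicity; `explore_action`, `beta`, and `noise_scale` are illustrative names:

```python
import random

def explore_action(policy_action, demo_actions, beta=0.3, noise_scale=0.1, rng=None):
    """Bias exploration toward demonstration actions.

    Instead of adding an imitation loss to the policy objective, with
    probability beta we replace the policy's action with an action drawn
    from the demonstration buffer; otherwise we keep the policy action.
    Either way we add small Gaussian noise. The demos never enter the
    gradient -- they only shape which states the agent visits.
    """
    rng = rng or random.Random()
    if demo_actions and rng.random() < beta:
        base = rng.choice(demo_actions)        # exploration perturbed by demos
    else:
        base = policy_action                   # ordinary on-policy action
    return base + rng.gauss(0.0, noise_scale)  # Gaussian exploration noise
```

The point of the sketch is that the imitation signal lives entirely in the data-collection loop, so the policy's loss is unconstrained.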
How can robots problem solve in novel environments? We combine high-level reasoning from VLMs with low-level controllers to allow test-time problem solving. Paper & code:
How can robots autonomously handle ambiguous situations that require commonsense reasoning? *VLM-PC* provides adaptive high-level planning, so robots can get unstuck by exploring multiple strategies. Paper:
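The "get unstuck by exploring multiple strategies" loop above can be sketched as a retry loop over candidate plans. This is a hypothetical illustration, not the VLM-PC implementation; `propose_strategies` stands in for a real VLM call and `execute` for the low-level controller:

```python
def solve_with_replanning(propose_strategies, execute, max_attempts=3):
    """Try high-level strategies from a VLM-style planner until one succeeds.

    propose_strategies() returns an ordered list of candidate plans;
    execute(plan) runs the low-level controller and reports success.
    If a plan fails, we move on to the next strategy instead of
    staying stuck on the first one.
    """
    for plan in propose_strategies()[:max_attempts]:
        if execute(plan):
            return plan   # first strategy that unsticks the robot
    return None           # all candidate strategies failed
```

For example, if the planner proposes ["push door", "pull door"] and only pulling works, the loop discards the failed push and returns "pull door".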
How do we make a scalable RL recipe for robots? We study batch online RL w/ demos. Key findings:
- iterative filtered imitation is insufficient
- need diverse policy data, e.g. using a diffusion policy
- policy extraction can hinder data diversity
Paper:
Robotic models are advancing rapidly—but how do we scale their improvement? 🤖. We propose a recipe for batch online RL (train offline with online rollouts) that enables policies to self-improve without complications of online RL. More: (1/8)
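One round of the batch online RL recipe described above can be sketched as follows. This is a minimal illustration of iterative filtered imitation, not the paper's method; the function name and data layout are assumptions:

```python
def filtered_imitation_round(policy_data, rollouts):
    """One round of batch online RL via filtered imitation.

    rollouts is a list of (trajectory, success) pairs collected in a
    batch of offline rollouts. We keep only the successful trajectories
    and fold them back into the training set for the next round. Per the
    thread's finding, this alone is insufficient: the policy (e.g. a
    diffusion policy) must also produce *diverse* rollouts, or the
    filtered data collapses onto a few strategies.
    """
    successes = [traj for traj, ok in rollouts if ok]
    return policy_data + successes   # next round trains on the enlarged set
```

The filter step is where data diversity matters: if every successful rollout looks the same, retraining on the filtered set narrows the policy further each round.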
Most robot policies don't have any memory! This is because:
- policies often perform *worse* with past observations as input
- GPU memory and compute constraints
We address both to train long-context robot diffusion policies. 🤖 Paper & code:
Giving history to our robot policies is crucial to solve a variety of daily tasks. However, diffusion policies get worse when adding history. 🤖 In our recent work we show how adding an auxiliary loss that we name Past-Token Prediction (PTP), together with cached embeddings, …
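A minimal sketch of a past-token-prediction-style auxiliary loss, assuming a simple linear decoder head as a stand-in; the actual PTP objective and the cached-embedding training recipe are in the paper, and `ptp_aux_loss` and `decoder_weights` are illustrative names:

```python
import numpy as np

def ptp_aux_loss(context_embedding, past_tokens, decoder_weights):
    """Auxiliary loss: reconstruct past observation tokens from the embedding.

    Alongside the action objective, the policy's context embedding is
    asked to linearly reconstruct tokens from past observations, which
    encourages the policy to actually retain its history rather than
    ignore it. decoder_weights is a stand-in linear prediction head.
    """
    pred = context_embedding @ decoder_weights          # predict past tokens
    return float(np.mean((pred - past_tokens) ** 2))    # MSE reconstruction loss
```

The auxiliary term would be added to the main diffusion loss with some weight, so gradients push the embedding to encode history.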
RT @amberxie_: Introducing ✨Latent Diffusion Planning✨ (LDP)! We explore how to use expert, suboptimal, & action-free data. To do so, we le….
RT @Anikait_Singh_: I’m in Singapore for #ICLR2025!. Excited to present Improving Test-Time Search for LLMs with Backtracking Against In-Co….
I'm giving two talks today/Sunday at #ICLR2025!
- Post-Training Robot Foundation Models (Robot Learning Workshop @ 12:50 pm)
- Robot Foundation Models with Open-Ended Generalization (Foundation Models in the Wild @ 2:30 pm)
Will cover π-0, Demo-SCORE, Hi Robot, & π-0.5.
RT @SurajNair_1: Since the first year of my PhD, every talk I’ve given has opened with a slide about the distant north star: dropping a rob….
RT @smithlaura1028: My goal throughout my PhD has been to take robots out of the lab and into the real world. It was so special to be a par….
Key idea: When training on all data, policy success is indicative of whether the strategy it took is good! Paper:
Led by @_anniechen_ and @AlecLessing, with @liu_yuejiang @StanfordAILab.
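The key idea above can be sketched as scoring each strategy by the empirical success rate of the rollouts that used it. This is an illustrative sketch, not the Demo-SCORE algorithm; `score_strategies` and the (strategy, success) data layout are assumptions:

```python
from collections import defaultdict

def score_strategies(rollouts):
    """Score each strategy by the success rate of rollouts that used it.

    rollouts is a list of (strategy_label, success) pairs. Since policy
    success is indicative of whether the strategy it took is good, the
    per-strategy success rate can be used to filter the training data
    down to high-quality strategies.
    """
    stats = defaultdict(lambda: [0, 0])   # strategy -> [successes, total]
    for strategy, ok in rollouts:
        stats[strategy][0] += int(ok)
        stats[strategy][1] += 1
    return {s: wins / total for s, (wins, total) in stats.items()}
```

A downstream filter could then keep only rollouts whose strategy scores above some threshold before retraining.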