Fangchen Liu

@fangchenliu_

Followers 1K · Following 706 · Media 16 · Statuses 108

Ph.D. @Berkeley_AI, prev @PKU1898 @HaoSuLabUCSD

Berkeley, CA
Joined November 2022
@fangchenliu_
Fangchen Liu
7 days
RT @qiayuanliao: Want to achieve extreme performance in motion tracking—and go beyond it? Our preprint tech report is now online, with open….
@fangchenliu_
Fangchen Liu
1 month
RT @letian_fu: @qiyang_li will present OTTER tomorrow at #ICML2025! A lightweight, instruction-following VLA! See OG post below! 👉 Code alre….
@fangchenliu_
Fangchen Liu
1 month
RT @qiyang_li: Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better….
@fangchenliu_
Fangchen Liu
2 months
Join us to explore the frontier of humanoid agents at CVPR👇.
@walterzhu8
Wentao Zhu
2 months
Join us tomorrow for the 1st Workshop on Humanoid Agents! We have an exciting lineup: @xiaolonw @xavierpuigf @GuanyaShi @GerardPonsMoll1 @blacksquirrel__ @tianminshu @petitegeek @xbpeng4. 📍 Room 101 D, Music City Center. 🔗 @CVPR #CVPR2025
@fangchenliu_
Fangchen Liu
2 months
RT @GuanyaShi: ✈️ to #CVPR2025 to give three workshop/tutorial talks about learning humanoid whole-body control and loco-manipulation: - We….
@fangchenliu_
Fangchen Liu
3 months
RT @younggyoseo: Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with an open-sourc….
@fangchenliu_
Fangchen Liu
3 months
People are collecting large-scale teleoperation datasets, which are often just kinematics-level trajectories. Real2Render2Real is a new framework that can generate such data without teleoperation or tricky sim+RL. High data quality for BC and a nice scaling effect; please dive in for more!
@letian_fu
Max Fu
3 months
Tired of teleoperating your robots? We built a way to scale robot datasets without teleop, dynamic simulation, or even robot hardware. Just one smartphone scan + one human hand demo video → thousands of diverse robot trajectories. Trainable by diffusion policy and VLA models.
@fangchenliu_
Fangchen Liu
4 months
RT @smithlaura1028: My goal throughout my PhD has been to take robots out of the lab and into the real world. It was so special to be a par….
@fangchenliu_
Fangchen Liu
5 months
RT @Agentica_: Introducing DeepCoder-14B-Preview - our fully open-sourced reasoning model reaching o1 and o3-mini level on coding and math.….
@fangchenliu_
Fangchen Liu
5 months
RT @walterzhu8: Join us at the 1st Workshop on Humanoid Agents @CVPR! #CVPR2025. Speakers in CV, CG, Robotics & CogSci will share insights….
@fangchenliu_
Fangchen Liu
5 months
RT @philippswu: New VLA work from @fangchenliu_ @RavenHuang4 @letian_fu and it's all open source! Cool insights on how to better leverage pr….
@fangchenliu_
Fangchen Liu
5 months
RT @letian_fu: We had all the ingredients years ago—CLIP has been around since 2021! OTTER shows that combining these existing tools in the….
@fangchenliu_
Fangchen Liu
5 months
11/N It’s been an exciting collaboration between @berkeley_ai and @Meta! I had a fantastic time working with co-leaders Raven (@RavenHuang4) and Max (@letian_fu). Grateful for the invaluable insights from Tingfan Wu, @mukadammh, @JitendraMalikCV, @Ken_Goldberg, and @pabbeel!
@fangchenliu_
Fangchen Liu
5 months
10/N We open-sourced the code, models, and datasets! Please refer to the paper for more technical details. More pre-trained models are also coming, please stay tuned! Jax Code: Pytorch Code: Paper:
arxiv.org: Vision-Language-Action (VLA) models aim to predict robotic actions based on visual observations and language instructions. Existing approaches require fine-tuning pre-trained vision-language models...
@fangchenliu_
Fangchen Liu
5 months
9/N For more manipulation primitives beyond pick-and-place (covering our full set of tasks), Otter demonstrates superior performance. It outperforms fine-tuned Octo, OpenVLA, and even Pi0, showcasing its strong generalization ability and robustness across diverse robotic tasks.
@fangchenliu_
Fangchen Liu
5 months
8/N Otter performs well on physical robots for pick-and-place tasks (10 out of 19 tasks). It successfully generalizes to both training tasks with unseen configurations and distractors and held-out tasks. Notably, Otter outperforms Octo, OpenVLA, and the recently introduced Pi0.
@fangchenliu_
Fangchen Liu
5 months
7/N (Cont) To further assess Otter’s zero-shot generalization capabilities for unseen tasks, we introduced new objects and out-of-distribution language instructions that were not present in the training dataset.
@fangchenliu_
Fangchen Liu
5 months
7/N We collected demonstrations using the DROID setup under challenging generalization settings. For every task, we randomized object locations and added 2-3 distractors. During evaluation, we used different object locations and distractors.
@fangchenliu_
Fangchen Liu
5 months
6/N OTTER combines ClearCLIP’s fine-grained visual features with (1) robot proprioception for precise control and (2) language instructions for flexible handling of the same object—whether picking, pushing, or poking.
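A minimal NumPy sketch of how such a fusion could look, assuming a simple concatenate-then-linear-head policy; the dimensions, the single linear head, and the function name policy_step are illustrative assumptions, not OTTER's actual architecture. The "retrieved" visual features here stand for the language-token/patch retrieval described in the 5/N tweet further down.

import numpy as np

def policy_step(retrieved_visual, proprio, text_embedding, W, b):
    # Concatenate language-conditioned visual features, robot proprioception,
    # and the instruction embedding, then apply a single linear action head.
    x = np.concatenate([retrieved_visual, proprio, text_embedding])
    return W @ x + b

rng = np.random.default_rng(1)
D_v, D_p, D_t, act_dim = 512, 8, 512, 7            # illustrative dimensions
W = 0.01 * rng.normal(size=(act_dim, D_v + D_p + D_t))
b = np.zeros(act_dim)
action = policy_step(rng.normal(size=D_v), rng.normal(size=D_p), rng.normal(size=D_t), W, b)
print(action.shape)  # (7,), e.g. an end-effector delta pose plus a gripper command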
@fangchenliu_
Fangchen Liu
5 months
5/N Building on these observations, we propose a simple yet effective approach: we measure the similarity between each language token and visual patch. Specifically, we use these similarity scores to “retrieve” the image patches that closely match the task instruction.
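A minimal NumPy sketch of the token-patch retrieval idea as described in this tweet: cosine similarity between each language-token embedding and each visual-patch embedding, used to weight and pool the patches. The softmax weighting, the temperature value, and all tensor shapes are illustrative assumptions rather than OTTER's actual implementation.

import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def retrieve_task_relevant_patches(text_tokens, patch_features, temperature=10.0):
    t = l2_normalize(text_tokens)        # (T, D) language-token embeddings
    v = l2_normalize(patch_features)     # (P, D) per-patch visual embeddings
    sim = t @ v.T                        # (T, P) token-patch cosine similarities
    weights = np.exp(temperature * sim)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over patches
    return weights @ patch_features      # (T, D) "retrieved" patches per language token

rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 512))       # stand-in for CLIP text-token embeddings
patches = rng.normal(size=(196, 512))    # stand-in for ClearCLIP-style patch features
print(retrieve_task_relevant_patches(tokens, patches).shape)  # (8, 512)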