Jensen Gao

@jensen_gao

Followers: 145 · Following: 10 · Media: 12 · Statuses: 17

CS PhD Student @StanfordAILab | Previously BS/MS @Berkeley_EECS

Stanford, CA
Joined April 2022
@jensen_gao
Jensen Gao
2 days
RT @_abraranwar: Are current eval/deployment practices enough for today’s robot policies? Announcing the Eval&Deploy workshop at CoRL 202…
0
13
0
@jensen_gao
Jensen Gao
1 year
We also show that with composition, we can transfer policies to entirely new settings (kitchens) with unseen combinations of many environmental factors. Policies trained on data collected without environmental variation, or without prior robot data, fail to transfer well. (7/8)
1
1
4
@jensen_gao
Jensen Gao
1 year
In our real experiments, we find that policies often do achieve composition when trained on data from our strategies. Importantly, we find that using large prior robot datasets (in our case, BridgeData V2) is critical for strengthening this. (6/8)
1
0
3
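As a rough illustration of the co-training idea, here is a minimal sketch of mixing target-task demos with a large prior robot dataset when sampling training batches. The 50/50 mixing ratio and all names (sample_batch, fork_demos, bridge_data) are assumptions for illustration, not the paper's recipe.

import random

def sample_batch(target_data, prior_data, batch_size, prior_frac=0.5):
    # Draw a fraction of each batch from the prior dataset (e.g., BridgeData V2)
    # and the rest from the target-task demos, then shuffle them together.
    n_prior = int(batch_size * prior_frac)
    batch = random.choices(prior_data, k=n_prior)
    batch += random.choices(target_data, k=batch_size - n_prior)
    random.shuffle(batch)
    return batch

# Hypothetical usage: batch = sample_batch(fork_demos, bridge_data, batch_size=256)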
@jensen_gao
Jensen Gao
1 year
If policies can compose these factors, we can exploit this during data collection. We propose strategies (Diagonal, L, Stair) that collect data (green) while prioritizing covering individual factors, such that composition could address their unseen combinations (pink). (5/8)
[Image: Diagonal, L, and Stair data-collection patterns over a factor grid; green = collected cells, pink = unseen combinations]
1
0
1
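One way to picture these strategies is as cell-coverage patterns over a two-factor grid. The sketch below is a hedged guess at what Diagonal, L, and Stair coverage could look like; the exact cell patterns are illustrative assumptions, not the paper's definitions.

def diagonal(n: int) -> set[tuple[int, int]]:
    # Vary both factors together: collect at (0,0), (1,1), ..., (n-1,n-1).
    return {(i, i) for i in range(n)}

def l_shape(n: int) -> set[tuple[int, int]]:
    # Vary one factor at a time while holding the other fixed at its first value.
    return {(i, 0) for i in range(n)} | {(0, j) for j in range(n)}

def stair(n: int) -> set[tuple[int, int]]:
    # Alternate single-factor steps so adjacent collected cells differ in one factor.
    cells, i, j = {(0, 0)}, 0, 0
    while i < n - 1 or j < n - 1:
        if i <= j and i < n - 1:
            i += 1
        else:
            j += 1
        cells.add((i, j))
    return cells

if __name__ == "__main__":
    n = 4  # e.g., 4 fork types x 4 table heights
    for name, fn in [("Diagonal", diagonal), ("L", l_shape), ("Stair", stair)]:
        seen = fn(n)
        unseen = {(i, j) for i in range(n) for j in range(n)} - seen
        print(f"{name}: collect {len(seen)} cells, rely on composition for {len(unseen)}")

Each strategy collects data in only O(n) of the n^2 factor combinations and relies on the policy composing factors to cover the rest.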
@jensen_gao
Jensen Gao
1 year
For example, consider the task of putting a fork in a container. Even for this relatively simple task, there can be considerable variation along multiple axes, such as the type of fork, or the table height. We conduct real experiments where we extensively vary such factors. (4/8)
1
0
2
@jensen_gao
Jensen Gao
1 year
However, training robot policies that can handle a wide variety of settings remains difficult. Data-driven methods have promise to scale and deal with this variation, but collecting robot data covering all desired combinations of environmental factors is often infeasible. (3/8)
1
0
2
@jensen_gao
Jensen Gao
1 year
Robotic tasks can vary in many ways, and handling them all can be challenging. Recent works (e.g., OXE) have scaled robot datasets to cover a diverse variety of environmental factors, such as those studied in the COLOSSEUM. (2/8)
robotics-transformer-x.github.io
Project page for Open X-Embodiment: Robotic Learning Datasets and RT-X Models
1
0
5
@jensen_gao
Jensen Gao
1 year
How should we efficiently collect robot data for generalization? We propose data collection procedures guided by the abilities of policies to compose environmental factors in their data. Policies trained with data from our procedures can transfer to entirely new settings. (1/8)
1
14
66
@jensen_gao
Jensen Gao
2 years
Thanks to amazing collaborators: @bidiptas13 @xf1280 @xiao_ted @jiajunwu_cs @brian_ichter @Majumdar_Ani @DorsaSadigh. Paper: Website: Talk:
drive.google.com
0
1
8
@jensen_gao
Jensen Gao
2 years
Finally, we evaluate our planner on real robot scenes, where we again find that PG-InstructBLIP significantly improves performance, succeeding on 9/10 tasks compared to 4/10 for the base VLM. (7/8)
1
1
7
@jensen_gao
Jensen Gao
2 years
To show the benefits of improved physical reasoning for robotics, we incorporate our VLM into an LLM-based planner and evaluate on 51 tasks across 8 diverse real world scenes. Using PG-InstructBLIP instead of the base VLM improves planning accuracy from 56.9% to 88.2%. (6/8)
1
0
7
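To give a feel for how a VLM's property estimates can ground an LLM planner, here is a minimal sketch. The vlm and llm callables stand in for real model calls (e.g., PG-InstructBLIP and an LLM API); their interfaces, the concept list, and the prompt format are assumptions for illustration, not the paper's planner.

CONCEPTS = ["material", "fragility", "contents"]  # assumed subset, for illustration

def describe_objects(vlm, image, objects):
    # Query the VLM once per (object, concept) pair and collect text facts.
    facts = []
    for obj in objects:
        for concept in CONCEPTS:
            answer = vlm(image, f"What is the {concept} of the {obj}?")
            facts.append(f"{obj}: {concept} = {answer}")
    return facts

def plan(llm, instruction, facts):
    # Prepend the grounded physical facts to the planning prompt.
    prompt = (
        "Objects and physical properties:\n"
        + "\n".join(facts)
        + f"\n\nInstruction: {instruction}\nPlan, step by step:"
    )
    return llm(prompt)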
@jensen_gao
Jensen Gao
2 years
We use PhysObjects to fine-tune InstructBLIP, a SOTA open-source VLM, to create PG-InstructBLIP. This significantly improves the VLM on our dataset, and slightly outperforms single concept fine-tuning, suggesting some transfer benefits from multi-concept VLM fine-tuning. (5/8)
1
0
6
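For readers wanting a concrete starting point, below is a minimal sketch of supervised fine-tuning InstructBLIP with Hugging Face Transformers on (image, question, answer) triples. The physobjects_triples iterable, the checkpoint choice, and the training recipe (what is frozen, loss masking, hyperparameters) are all assumptions, not the paper's exact setup.

import torch
from transformers import InstructBlipForConditionalGeneration, InstructBlipProcessor

name = "Salesforce/instructblip-vicuna-7b"  # assumed checkpoint
model = InstructBlipForConditionalGeneration.from_pretrained(name)
processor = InstructBlipProcessor.from_pretrained(name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for image, question, answer in physobjects_triples:  # hypothetical iterable
    # Condition on the question and supervise the full text; in practice the
    # question tokens would typically be masked out of the loss.
    inputs = processor(images=image, text=question + " " + answer,
                       return_tensors="pt")
    loss = model(**inputs, labels=inputs.input_ids).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()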
@jensen_gao
Jensen Gao
2 years
We use image data from EgoObjects, which consists of frames from egocentric video of objects in a wide variety of real household settings. We believe this makes PhysObjects particularly relevant for household robotics applications. (4/8)
1
1
6
@jensen_gao
Jensen Gao
2 years
To address this limitation, we propose PhysObjects, a dataset intended for fine-tuning VLMs to better capture object-centric physical reasoning. It consists of annotations for 8 different continuous and categorical physical concepts that are broadly relevant for robotics. (3/8)
1
0
6
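As a sketch of what one annotation record might look like: the thread mentions material and contents as concepts, while "mass" and "fragility" are assumed examples of continuous/categorical concepts, and the field names are hypothetical rather than the dataset's actual schema.

from dataclasses import dataclass
from typing import Union

@dataclass
class PhysicalAnnotation:
    image_id: str             # frame from EgoObjects
    object_id: str            # which object in the frame
    concept: str              # e.g., "material", "contents", "mass", "fragility"
    value: Union[str, float]  # categorical label or continuous estimate
    source: str               # "human" (36.9K) or "automatic" (417K)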
@jensen_gao
Jensen Gao
2 years
VLMs are rapidly improving and have shown promise for grounding in robotics, but they aren’t great at obtaining detailed info about objects in a scene. For example, when asked about the material and contents of different cups, they are sometimes right, but often wrong. (2/8)
1
1
7
@jensen_gao
Jensen Gao
2 years
Excited to release PhysObjects: a dataset of 36.9K human and 417K automatic physical concept annotations for images of common household objects. We use it to fine-tune VLMs to improve their physical reasoning, and leverage this for better robotics. (1/8)
2
16
101