Ameesh Shah
@ameeshsh
477 Followers · 3K Following · 5 Media · 154 Statuses
Comp. Sci Ph.D @UCBerkeley, NDSEG fellow | working on machine learning and formal methods for trustworthy robots + AI 🫡 | Rice U Alum | he/him
CLE ➡️ HTX ➡️ Cal
Joined April 2012
Now we can have self-refinement for VLAs in the real world (with the aid of a big VLM)! The VLM critiques VLA rollouts and iteratively refines its commands so the VLA performs better.
Quoting: LLMs have shown a remarkable ability to “self-refine” and learn from their mistakes via in-context learning. But in robotics, most methods are single-shot. How can we bring inference-time adaptation to robot learning? A 🧵:
5 replies · 25 reposts · 245 likes
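To make the loop concrete, here is a minimal Python sketch of the critique-and-refine cycle the tweet above describes. All function names (`vlm_plan`, `vla_execute`, `vlm_critique`) are hypothetical placeholders standing in for a VLM API call and a VLA rollout; this is a sketch of the idea, not the paper's implementation.

```python
# Hypothetical sketch of the critique-and-refine loop described above.
# None of these names come from the paper; they stand in for a VLM API
# call and a VLA rollout, respectively.

def vlm_plan(task: str, takeaways: list[str]) -> list[str]:
    """Ask the VLM planner for step-by-step instructions, with lessons
    from earlier failed attempts included in-context."""
    ...

def vla_execute(instructions: list[str]):
    """Roll out the VLA on each instruction; return (trace, success)."""
    ...

def vlm_critique(task: str, instructions: list[str], trace) -> str:
    """Ask the VLM to judge the failed rollout and distill a short,
    reusable takeaway."""
    ...

def refine(task: str, max_attempts: int = 5):
    """Iteratively plan, execute, and critique until the task succeeds."""
    takeaways: list[str] = []
    for _ in range(max_attempts):
        instructions = vlm_plan(task, takeaways)
        trace, success = vla_execute(instructions)
        if success:
            return instructions
        takeaways.append(vlm_critique(task, instructions, trace))
    return None  # out of attempts
```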
This was a really fun project to work on with great collaborators: @verityw_, @adwait_godbole, Sanjit, and @svlevine. I'm excited to see more inference-time techniques develop for robot learning! arXiv: https://t.co/dVlXpOHDDX website:
0 replies · 2 reposts · 22 likes
As a result, LITEN’s VLM planner can iteratively learn both the VLA’s capabilities and relevant information about real-world dynamics to solve complex, long-horizon tasks.
1 reply · 1 repost · 7 likes
To address this, LITEN uses a VLM-as-a-judge to evaluate (failed) previous attempts and generate useful takeaways to be included in-context for the VLM planner in future attempts.
1 reply · 0 reposts · 10 likes
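As a rough illustration of how such takeaways could be threaded back into the planner's context, consider the sketch below. The prompt wording and helper names are invented for illustration and are not taken from the paper.

```python
# Illustrative only: these prompt templates and function names are
# invented for the sketch, not LITEN's actual prompts.

def build_judge_prompt(task: str, plan: list[str], outcome: str) -> str:
    """Ask the VLM to judge a failed attempt and distill one reusable lesson."""
    steps = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(plan))
    return (
        f"Task: {task}\n"
        f"Executed plan:\n{steps}\n"
        f"Outcome: {outcome}\n"
        "In one sentence, state what went wrong and what future plans "
        "should do differently."
    )

def build_planner_prompt(task: str, takeaways: list[str]) -> str:
    """Condition the next plan on lessons accumulated from earlier attempts."""
    lessons = "\n".join(f"- {t}" for t in takeaways) if takeaways else "- (none yet)"
    return (
        f"Task: {task}\n"
        f"Lessons from previous attempts:\n{lessons}\n"
        "Write step-by-step instructions for the low-level robot policy."
    )
```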
We actually find that SoTA VLMs struggle to understand physical affordances in general. In a task where the robot needs to grab objects out of bowls, GPT-5 initially plans to grab items from small bowls that the gripper can't fit into, instead of picking items from larger bowls.
1 reply · 0 reposts · 9 likes
However, a key issue arises when trying to use an off-the-shelf VLM as a planner. How is the VLM going to know what the VLA-controlled robot is capable of?
1 reply · 0 reposts · 8 likes
We present Learning from Inference Time Execution, or LITEN, which solves complex robotic tasks by treating a VLA as a low-level controller and pairing it with a high-level VLM planner that gives step-by-step instructions to the VLA.
1 reply · 0 reposts · 9 likes
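One way to picture this division of labor is the interface sketched below: the planner emits short language commands, and the text-conditioned VLA executes each one for a bounded number of timesteps. The `Step` dataclass and the environment/policy signatures are assumptions made for this sketch, not the paper's code.

```python
# Assumed interfaces for the sketch: env.step(action) returns (obs, done),
# and vla_policy(obs, text) returns a single low-level action.

from dataclasses import dataclass

@dataclass
class Step:
    instruction: str    # short language command, e.g. "open the drawer"
    max_timesteps: int  # budget before control returns to the planner

def run_plan(env, vla_policy, steps: list[Step]):
    """Execute a VLM-generated plan with a language-conditioned VLA."""
    obs = env.reset()
    for step in steps:
        for _ in range(step.max_timesteps):
            action = vla_policy(obs, step.instruction)  # VLA conditions on text
            obs, done = env.step(action)
            if done:
                return obs
    return obs
```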
Current robot foundation models, i.e., VLAs, are getting better at generalizing. But SoTA VLAs are limited in two key ways: (1) they can't handle complex novel instructions, and (2) they can't learn from their mistakes and adjust their behavior.
1 reply · 0 reposts · 8 likes
LLMs have shown a remarkable ability to “self-refine” and learn from their mistakes via in-context learning. But in robotics, most methods are single-shot. How can we bring inference-time adaptation to robot learning? A 🧵:
10 replies · 18 reposts · 129 likes
Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! https://t.co/J5LdRRYbSH The recipe to achieve this is incredibly simple. 🧵 1/N
3 replies · 73 reposts · 366 likes
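For readers new to the term: action chunking means the policy predicts a short sequence of actions per query, which is executed open-loop before re-planning. The sketch below illustrates that general idea under an invented environment interface; it is not the linked paper's specific RL recipe.

```python
# A minimal sketch of action chunking. The env/policy interfaces are
# invented for illustration: policy(obs) returns a sequence of actions,
# and env.step(action) returns (obs, reward, done).

def rollout_chunked(env, policy, horizon: int = 200, chunk_size: int = 4):
    """Roll out a policy that predicts `chunk_size` actions per query."""
    obs = env.reset()
    total_reward, t = 0.0, 0
    while t < horizon:
        chunk = policy(obs)[:chunk_size]  # one policy query, several actions
        for action in chunk:              # execute the chunk open-loop
            obs, reward, done = env.step(action)
            total_reward += reward
            t += 1
            if done or t >= horizon:
                return total_reward
    return total_reward
```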
The submission deadline to our workshop has been extended until Feb. 7!!!! Submit your papers on trustworthiness + verification + genAI + ML and come hang with us in Singapore!
Quoting: 📣Announcing VerifAI: AI Verification in the Wild, a workshop at #ICLR2025
VerifAI will gather researchers to explore topics at the intersection of genAI/trustworthyML and verification: https://t.co/3BIMRup0G7
@celine_ylee @theo_olausson @ameeshsh @wellecks @taoyds
0 replies · 4 reposts · 6 likes
The ultimate test of any physics simulator is its ability to deliver real-world results. With MuJoCo Playground, we’ve combined the very best: MuJoCo’s rich and thriving ecosystem, massively parallel GPU-accelerated simulation, and real-world results across a diverse range of
37 replies · 187 reposts · 909 likes
📣Announcing VerifAI: AI Verification in the Wild, a workshop at #ICLR2025
VerifAI will gather researchers to explore topics at the intersection of genAI/trustworthyML and verification: https://t.co/3BIMRup0G7
@celine_ylee @theo_olausson @ameeshsh @wellecks @taoyds
0 replies · 22 reposts · 84 likes
📢📢 Thrilled to share our new work: Syzygy: Dual Code-Test C to Rust Translation using LLMs and Dynamic Analysis!
Key Principles
1️⃣ 🤖🤝🔍 Combining LLM Inference Scaling with Dynamic Analysis:
* Best of both worlds (program semantics and neural search)
2️⃣ ⚡️🧪 Dual Code-Test
1 reply · 22 reposts · 71 likes
Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics
569 replies · 3K reposts · 16K likes
📢 Announcing NeuS 2025 - 2nd International Conference on Neuro-symbolic Systems! 📢 Join us May 27-30, 2025, at the University of Pennsylvania, Philadelphia. More info: https://t.co/PbaUcQhCsS
#NeuS2025
0 replies · 5 reposts · 10 likes
LLMs excel at fitting finetuning data, but are they learning to reason or just parroting🦜? We found a way to probe a model's learning process to reveal *how* each example is learned. This lets us predict model generalization using only training data, amongst other insights: 🧵
20 replies · 124 reposts · 764 likes
Excited to share our work on STEERing robot behavior! With structured language annotation of offline data, STEER exposes fundamental manipulation skills that can be modulated and combined to enable zero-shot adaptation to new situations and tasks.
1 reply · 19 reposts · 107 likes
Applying to grad school this fall? The Equal Access to Application Assistance (EAAA) program for @Berkeley_EECS is now accepting applications! Any PhD applicant to @Berkeley_EECS can submit their application for feedback by Oct. 6th 11:59PM PST. EAAA is a student-led program
2 replies · 9 reposts · 33 likes
Anyone considering applying to EE/Computer Science PhD programs: we run a program at Berkeley where your application gets reviewed by a current EECS PhD student! Check the link below!! And if you know any PhD hopefuls, please share for visibility!!
Quoting: Hi all prospective grad students! Our Equal Access to Application Assistance (EAAA) program for @Berkeley_EECS is now accepting applications! Any PhD applicant to @Berkeley_EECS can submit their application for feedback by Oct. 6th 11:59PM PST https://t.co/7Usz6kuLIf
0 replies · 2 reposts · 11 likes