
Vector Wang
@VectorWang2
Followers
1K
Following
453
Media
140
Statuses
306
PhD student in robotics manipulation, currently working on physics-aware world models for robust manipulation @RiceCompSci. Designer and developer of XLeRobot.
Houston, TX
Joined November 2021
XLeRobot 0.3.0 showcase: open the fridge, get drinks, fill ice, wipe the table, clean the room, take care of plants and cats... All for $660, fully open-sourced, based on HF LeRobot. Teleop with Joy-Con, or RL/VLA. Assembly kit ready for purchase soon. Stay tuned! https://t.co/9hK0e8ufr4
9
50
297
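(For context on the Joy-Con teleop above: the idea is just mapping controller axes to end-effector motion. A minimal sketch using pygame's generic joystick API; the `send_ee_delta` helper and the axis mapping are hypothetical stand-ins, not XLeRobot's actual teleop code.)

```python
# Minimal teleop sketch: map joystick axes to end-effector velocity commands.
# Assumes pygame sees the Joy-Con as a generic joystick; send_ee_delta() is a
# hypothetical placeholder for the robot-side command API.
import time
import pygame

def send_ee_delta(dx, dy, dz):
    # Placeholder: forward the commanded end-effector delta to the robot.
    print(f"ee delta: dx={dx:+.3f} dy={dy:+.3f} dz={dz:+.3f}")

pygame.init()
pygame.joystick.init()
stick = pygame.joystick.Joystick(0)

SCALE = 0.05   # meters per second at full stick deflection
DT = 0.02      # 50 Hz control loop

while True:
    pygame.event.pump()                    # refresh joystick state
    dx = SCALE * stick.get_axis(0) * DT    # left stick X  -> x translation
    dy = SCALE * stick.get_axis(1) * DT    # left stick Y  -> y translation
    dz = SCALE * stick.get_axis(3) * DT    # right stick Y -> z translation
    send_ee_delta(dx, dy, dz)
    time.sleep(DT)
```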
Recent advances in 3D reconstruction with stereo and even monocular cameras make me believe we can achieve the lowest-cost ($600–$3000) practical mobile home robots in the near future (3–5 years), instead of $30–50k humanoids.
Last night at Bracket Bot we worked out a path to getting perfect point clouds from $10 stereo USB cameras. If you want to bring robots to billions of people, and work on state-of-the-art robots, DM me
4
0
10
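(The classic way to get point clouds from a cheap stereo pair is block matching plus reprojection. A minimal OpenCV sketch, assuming the images are already rectified and a Q matrix is available from calibration; the file paths are placeholders, and this is not Bracket Bot's actual pipeline.)

```python
# Stereo disparity -> point cloud with OpenCV (classic SGBM pipeline).
# Assumes left/right images are already rectified and Q comes from
# cv2.stereoRectify during calibration; the file paths are placeholders.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
Q = np.load("Q.npy")  # 4x4 reprojection matrix from calibration

matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,     # must be divisible by 16
    blockSize=5,
    P1=8 * 5 * 5,
    P2=32 * 5 * 5,
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)

disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels
points = cv2.reprojectImageTo3D(disparity, Q)                       # HxWx3 XYZ in camera frame
mask = disparity > 0                                                 # keep valid disparities only
cloud = points[mask]
print(cloud.shape)  # (N, 3) point cloud
```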
Why open-source? With Isaac Sim, GR00T, and SmolVLA, the $660 XLeRobot can organize stuff into a drawer and zip up a bag in 2 days. Made by MakerMods @IsaacSin12 @QILIU9203 @Ryan_Resolution at the @seeedstudio @huggingface @NVIDIARobotics Home Robot Hackathon in Shenzhen (Bay Area next)
1
24
204
Finally seeing some decent research using SO101 as an experiment platform!
What's the right architecture for a VLA? VLM + custom action heads (π₀)? VLM with special discrete action tokens (OpenVLA)? Custom design on top of the VLM (OpenVLA-OFT)? Or... VLM with ZERO modifications? Just predict action as text. The results will surprise you. VLA-0:
0
2
61
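(The "zero modifications" idea above is literal: the VLM emits the action as plain text and you parse the numbers back out. A toy sketch of that decoding step; the prompt format and `generate_text` call are illustrative assumptions, not the exact VLA-0 recipe.)

```python
# Toy sketch of "action as text": the VLM prints numbers, we parse them back.
# The prompt format, action layout, and generate_text() are assumptions for
# illustration, not the exact VLA-0 scheme.
import re
import numpy as np

def generate_text(prompt: str) -> str:
    # Placeholder for an off-the-shelf VLM call (no architecture changes).
    return "pick up the red block -> 0.12 -0.04 0.33 0.00 0.00 1.57 1.0"

def parse_action(text: str, dim: int = 7) -> np.ndarray:
    # Grab the last `dim` numbers in the generated string as the action vector.
    nums = re.findall(r"-?\d+\.?\d*", text)
    if len(nums) < dim:
        raise ValueError(f"expected {dim} action values, got {len(nums)}")
    return np.array([float(x) for x in nums[-dim:]], dtype=np.float32)

action = parse_action(generate_text("Instruction: pick up the red block. Action:"))
print(action)  # e.g. [x, y, z, roll, pitch, yaw, gripper]
```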
Exciting things at LeRobot! 🤖 We’ve just integrated Meta-World: a benchmark for testing multi-task and generalization abilities in robotic manipulation: https://t.co/LxxEQ4ysym We’ve also cleaned up our environments and standardized on Gymnasium ≥ 1.x.x and MuJoCo ≥ 3.0.0
1
20
158
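(With the Gymnasium ≥ 1.x standardization above, an evaluation rollout is the plain five-tuple loop. A sketch with a hypothetical env id; the name actually registered by the LeRobot Meta-World integration may differ.)

```python
# Plain Gymnasium >=1.x rollout loop; the env id below is hypothetical and
# stands in for whatever name the LeRobot Meta-World integration registers.
import gymnasium as gym

env = gym.make("Meta-World/pick-place-v2")   # hypothetical id
obs, info = env.reset(seed=0)

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()       # replace with a trained policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated            # Gymnasium >=1.x termination API

print("episode return:", total_reward)
env.close()
```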
This is actually a glimpse into intuitive physics.
🤖What if a robot could perform a new task just from a natural language command, with zero demonstrations? Our new work, NovaFlow, makes it possible! We use a pre-trained video generative model to create a video of the task, then translate it into a plan for a real-world robot
1
0
14
Believe me, even though you can maybe build these with just an SO101 arm, these tasks are harder than most home tasks in terms of vision & language reasoning (and the physics behind it). A great way to quantitatively evaluate the "VL" ability of VLA models, I would say.
If your robot can’t tell colors, how can it perceive the world? If it can’t stably stack blocks, how can it understand physics? If it can’t link language to space, how can it reason? If it can’t contextualize tasks, how can it be generalizable? Before claiming “general reasoning
2
5
38
🚀 All teams are ready for the upcoming #EmbodiedAIHackathon — a 2-day onsite build for home & cooking #robots! ✨ 👉 https://t.co/NiMI6TwHoY Project proposals include dual-arm kitchen assistants, boba-making bots, language-guided helpers, and even home doctors — all integrating
3
4
20
With proper motion control, this sub-$200 arm can already achieve steadiness similar to those $2k arms.
5
10
105
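("Proper motion control" here mostly means not sending raw position jumps to hobby servos. A minimal sketch of exponential smoothing plus a per-tick rate limit on joint targets; `write_joint_positions` is a hypothetical stand-in for the arm's real driver call.)

```python
# Smooth, rate-limited joint commands: a common way to make cheap servo arms
# look steady. write_joint_positions() is a hypothetical stand-in for the
# arm's real servo-bus write.
import numpy as np

ALPHA = 0.2        # exponential smoothing factor (0..1, lower = smoother)
MAX_STEP = 0.02    # max joint change per control tick (radians)

def write_joint_positions(q):
    print("cmd:", np.round(q, 3))  # placeholder for the servo bus write

def smooth_step(q_cmd, q_target):
    # Low-pass filter the target, then clamp the per-tick change.
    q_filt = (1 - ALPHA) * q_cmd + ALPHA * q_target
    delta = np.clip(q_filt - q_cmd, -MAX_STEP, MAX_STEP)
    return q_cmd + delta

q_cmd = np.zeros(6)                     # current commanded joint positions
q_target = np.array([0.5, -0.3, 0.8, 0.0, 0.4, 0.0])
for _ in range(100):                    # 100 control ticks toward the target
    q_cmd = smooth_step(q_cmd, q_target)
    write_joint_positions(q_cmd)
```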
a little sneak peek into what we’ve been working on: XLeRobot. thanks to @QILIU9203 for building this. we can now remote control it to gather data as we train it to take over your most hated home tasks. future of home robotics shouldn’t be expensive humanoids but affordable
3
4
20
Enough demos😤, it's time to get manipulation benchmarks real for real! 🦾 ManipulationNet💪 quantifies the limits and boundaries of current approaches in a solid way. It starts primitive (peg-in-hole, cubes inference) but challenging. Not visually appealing, but it will work.
“You can’t make progress until you are able to measure it. Robotics still doesn’t have such a rallying call. No one agrees on anything.” I 💯 agree with the recent post from @DrJimFan. To break this impasse, we are excited to announce ManipulationNet ( https://t.co/DvoN2nvURq), a
1
1
27
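(A benchmark like this ultimately reduces to a fixed task set and a binary success criterion per episode. A generic sketch of that scoring loop; the task names, trial count, and `run_episode` are illustrative, not ManipulationNet's actual protocol.)

```python
# Generic benchmark scoring: success rate per task over repeated trials.
# The task names and run_episode() are illustrative placeholders.
import random

TASKS = ["peg_in_hole", "cube_stack", "drawer_open"]
TRIALS_PER_TASK = 20

def run_episode(task: str) -> bool:
    # Placeholder: roll out the policy on `task` and return binary success.
    return random.random() < 0.5

results = {}
for task in TASKS:
    successes = sum(run_episode(task) for _ in range(TRIALS_PER_TASK))
    results[task] = successes / TRIALS_PER_TASK

for task, rate in results.items():
    print(f"{task:>12s}: {rate:.0%} success")
print(f"{'mean':>12s}: {sum(results.values()) / len(results):.0%}")
```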
XLeRobot trained with SmolVLA, shot in real time. This makes me really look forward to the hackathon on Oct 24-25. Made by MakerMods QiLiu.
12
41
288
This is the correct discussion towards “world models”. World models should never be just video generation.
🎉 Excited to share our review article in Science Robotics ! 🤖 We survey a decade of efforts on learning environment transition functions from data, known variously as world models, intuitive physics, learned simulators, and more. These efforts span cognitive science, vision,
0
0
4
🚀 We've merged v3 of the LeRobot dataset format into main. We can now scale robotics datasets 1000x bigger without compromising performance. Huge congrats to the LeRobot team
8
36
162
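(Datasets in the new format are still loaded by Hub repo id and indexed like a PyTorch dataset. A sketch assuming the `LeRobotDataset` class; the import path and repo id have moved between LeRobot versions, so treat both as assumptions and check the repo's current docs.)

```python
# Load a LeRobot dataset by Hub repo id and peek at one frame.
# Import path and repo id are assumptions; they differ across LeRobot versions.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/pusht")     # streams/caches from the Hub
print("frames:", len(dataset))
print("features:", list(dataset.features))

sample = dataset[0]                           # dict of tensors for one frame
for key, value in sample.items():
    print(key, getattr(value, "shape", None))
```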
To those who quote Alan Kay's words: robots are definitely not computers that don't need any physical interaction with the world. It's the physical interactions that truly matter. By changing a rigid finger to a soft one, many grasping algorithms based on the rigid-body assumption will become useless.
0
0
11
Pi-05 after just 30 minutes of fine-tuning. Feels like the best VLA I've tried so far.
We've added pi-05 to the openpi repo: pi05-base, pi05-droid, pi05-libero. Also added PyTorch training code!🔥 Instructions and code here: https://t.co/EOhNYfpq9B This is an updated version of the model we showed cleaning kitchens and bedrooms in April:
18
42
409