Xiao Ma Profile
Xiao Ma

@yusufma555

Followers
1K
Following
287
Media
22
Statuses
93

Staff Research Scientist @ ByteDance Seed, working on robot foundation models. Prev: @Dyson @NUSingapore @sjtu1896. All views are my own.

Singapore & Beijing
Joined August 2015
@yusufma555
Xiao Ma
8 days
I've been working on deformable object manipulation since my PhD. It was totally a nightmare years ago and my PhD advisor was telling me not to work on it for my own good. Today, at ByteDance Seed, we are dropping GR-RL, a new VLA+RL system that manages long-horizon precise
34
137
904
@yusufma555
Xiao Ma
8 days
Thanks @_akhaliq again for sharing our work! For the full demo video, please find it in our thread:
@_akhaliq
AK
8 days
GR-RL Going Dexterous and Precise for Long-Horizon Robotic Manipulation
0
0
1
@yusufma555
Xiao Ma
8 days
GR-RL proves something: IL is inherently limited, and we can do things previously thought impossible by purely visuo-motor control, simply by making it RL. The future direction is clear: distill RL-enhanced behavior back into the foundation VLA, forming a self-improving,
1
2
21
@yusufma555
Xiao Ma
8 days
The Result: Final performance: 83.3% success over continuous shoelace threading. The surprising part? GR-RL learns to:
🔥 retry when the lace slips
🔥 reposition the lace when the initial pose is bad
🔥 "self-correct" mid-task instead of freezing
This is the behavior you
1
1
27
@yusufma555
Xiao Ma
8 days
Online Stage - Real-World Steering RL
Now the robot learns ON THE PHYSICAL PLATFORM. But direct exploration in joint space causes dangerous jitter and is inefficient - you need millimeter accuracy. So GR-RL explores in the latent noise space: A tiny 51.5M-param noise-predictor
1
1
34
@yusufma555
Xiao Ma
8 days
Morphological Symmetry Augmentation
Our bi-manual robot is left–right symmetric. So we mirror EVERYTHING:
🚀 RGB
🚀 Proprioception
🚀 Actions
🚀 Language Instructions
Data size doubles. Spatial reasoning robustness skyrockets. → 72.7% success.
1
0
20
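The mirroring described in the post above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not GR-RL's actual code: the sample layout (`rgb`, `proprio`, `action`, `instruction`), the convention that bimanual vectors are stored as a left-arm half followed by a right-arm half, and the `sign_mask` that says which per-arm components change sign under reflection (this depends on the robot's coordinate frames).

```python
import re

def mirror_image(image):
    """Flip an image, given as a list of pixel rows, left-to-right."""
    return [row[::-1] for row in image]

def mirror_bimanual(vec, sign_mask):
    """Swap the left-arm and right-arm halves, then flip lateral components.

    `sign_mask` holds +1 or -1 per single-arm dimension. Which dimensions
    change sign is robot-specific and is an assumption in this sketch.
    """
    half = len(vec) // 2
    left, right = vec[:half], vec[half:]
    new_left = [s * x for s, x in zip(sign_mask, right)]
    new_right = [s * x for s, x in zip(sign_mask, left)]
    return new_left + new_right

def mirror_instruction(text):
    """Swap the lowercase words 'left' and 'right' in an instruction."""
    swap = {"left": "right", "right": "left"}
    return re.sub(r"\b(left|right)\b", lambda m: swap[m.group(1)], text)

def mirror_sample(sample, sign_mask):
    """Mirror every modality of one (hypothetical) training sample."""
    return {
        "rgb": mirror_image(sample["rgb"]),
        "proprio": mirror_bimanual(sample["proprio"], sign_mask),
        "action": mirror_bimanual(sample["action"], sign_mask),
        "instruction": mirror_instruction(sample["instruction"]),
    }

def augment(dataset, sign_mask):
    """Return the original samples plus their mirrored copies (2x size)."""
    return dataset + [mirror_sample(s, sign_mask) for s in dataset]
```

A useful sanity check for any such transform is that mirroring twice returns the original sample; if it does not, the sign mask or the arm ordering is likely wrong.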
@yusufma555
Xiao Ma
8 days
Offline Stage - Filter the human flaws
We train a Critic Transformer via distributional RL:
⭐️ Detects “value drops” when the operator hesitates or messes up
⭐️ Slices every trajectory into high-value vs low-value segments
⭐️ Retains only the cleanest expert behavior
Effect:
2
1
23
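As a rough illustration of the filtering idea in the post above, the sketch below slices one trajectory into high-value segments, given per-step value estimates. It is an assumption-laden toy, not the GR-RL pipeline: the values are taken as an input list rather than produced by a distributional Critic Transformer, and both the drop rule and `drop_threshold` are hypothetical.

```python
def split_by_value_drop(values, drop_threshold=0.05):
    """Label each transition as kept (True) or dropped (False).

    Transition t goes from step t to step t + 1. It is dropped when the
    value estimate falls by more than `drop_threshold`, a hypothetical
    hyperparameter chosen for this sketch.
    """
    return [
        (values[t + 1] - values[t]) >= -drop_threshold
        for t in range(len(values) - 1)
    ]

def high_value_segments(values, drop_threshold=0.05):
    """Return (start, end) transition index pairs, end exclusive,
    covering each contiguous run of kept transitions."""
    keep = split_by_value_drop(values, drop_threshold)
    segments, start = [], None
    for t, kept in enumerate(keep):
        if kept and start is None:
            start = t                      # a new high-value run begins
        elif not kept and start is not None:
            segments.append((start, t))    # run ends at a value drop
            start = None
    if start is not None:
        segments.append((start, len(keep)))
    return segments
```

Only the transitions inside the returned segments would then be passed on to policy training; the dropped transitions correspond to moments where the value estimate fell.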
@yusufma555
Xiao Ma
8 days
The Idea: If imitation is broken, then: Let the robot learn from its own experience.
GR-RL =
⭐️ Offline RL (data filtering)
⭐️ Symmetry augmentation
⭐️ Online closed-loop Real-World Reinforcement Learning
All on top of a single VLA foundation model. Just RGB, proprioception,
1
2
31
@yusufma555
Xiao Ma
8 days
Two killers of imitation learning (IL):
(1) Human demos are NOT optimal
Humans hesitate, retry, fix mistakes mid-trajectory. IL blindly copies ALL of it - including the bad parts.
(2) Training vs Deployment Misalignment
VLA models output actions. To prevent jitter, robots
1
0
28
@yusufma555
Xiao Ma
8 days
Why “shoelace threading” matters 🤔
This task is probably one of the most challenging household robotics tasks in terms of precision:
💥 Soft-body chaos – laces deform every frame
💥 Millimeter precision – 1–2 mm slip = total failure
💥 Long-horizon manipulation – hundreds of
1
0
41
@yusufma555
Xiao Ma
12 days
Well done @stepjamUK and @Neuracore_AI ! Playing with robot learning is always frustrating to start with, especially the infra. It creates an invisible barrier for people even to enter this field. In the era where everyone is chasing to build their next generalist robot,
@Neuracore_AI
Neuracore
12 days
Today we are excited to open up Neuracore to the academic community! Neuracore is a new data foundation built to accelerate robot learning by removing one of the field’s biggest bottlenecks: capturing and working with high-fidelity multimodal robotics data. For the first time,
0
0
5
@yusufma555
Xiao Ma
2 months
Congrats! @stepjamUK It has been a great time working with Stephen. Looking forward to the exciting research coming up! Do not hesitate to apply if you're currently seeking PhD opportunities!
@stepjamUK
Stephen James
3 months
As a newly appointed Assistant Professor at @imperialcollege, I'm thrilled to announce the Safe Whole-body Intelligent Robotics Lab (SWIRL) at Imperial College London. Safe Whole-body
2
1
5
@yusufma555
Xiao Ma
3 months
ByteWrist shows how compact parallel wrists can bring robotic manipulation closer to human-level dexterity in tight spaces.
📄 Read the paper: https://t.co/QZBiytM7Xp
🌐 Project page:
bytewrist.github.io
Simple project page template for your research paper, built with Astro and Tailwind CSS
0
0
0
@yusufma555
Xiao Ma
3 months
Results:
1. Higher integration & flexibility vs. Kinova systems
2. Stable roll–pitch–yaw (RPY) control
3. 116 hours of autonomous data collection for dexterous manipulation tasks
1
0
0
@yusufma555
Xiao Ma
3 months
We built a 22-DoF dual-arm robot, ByteMini, powered by ByteWrist. It can:
✅ Maneuver in narrow glove-box spaces
✅ Grasp objects faster than Kinova wrists (≈2× speedup)
✅ Perform dual-arm deformable object manipulation (e.g. clothes hanging!)
1
0
0
@yusufma555
Xiao Ma
3 months
✨ Key innovations:
1. Nested 3-stage parallel drive → compact + multi-DOF control.
2. Arc-shaped end linkages → optimized force transmission + wider range.
3. Central supporting ball joint → stiffness without sacrificing flexibility.
1
0
0
@yusufma555
Xiao Ma
3 months
🤖 Traditional serial wrists = bulky + error-prone in clutter.
⚙️ Existing parallel wrists = stiff but not compact enough.
Neither works well in tight, human-like environments. ByteWrist solves this.
1
0
0
@yusufma555
Xiao Ma
3 months
🚀 New paper alert! We introduce ByteWrist - a compact, anthropomorphic robotic wrist that enables dexterous manipulation in confined spaces. Think home service, medical robots, or precision assembly. 👉 https://t.co/WxBCeAMmpM #ByteDanceSeed #EmbodiedAI #Robotics
7
54
315
@yusufma555
Xiao Ma
3 months
Lastly, we are still hiring! Our team is mainly based in Beijing and Singapore. DM me if you are interested!
0
0
0
@yusufma555
Xiao Ma
3 months
🤖 Flow-Based Policy for Online Reinforcement Learning
Problem: Standard RL policies often struggle to model complex, multi-modal action spaces.
Our solution: We introduce FlowRL, a new framework that uses flow-based generative models to create highly expressive policies. By
1
0
0