Julen Urain

@robotgradient

Followers 1K · Following 2K · Media 50 · Statuses 272

Robotics Tinkerer. RS@FAIR (Embodied AI) Prev: @DFKI, @TUDarmstadt, @NvidiaAI. https://t.co/RQpq7Prbln X https://t.co/umZQeDjJv4

Joined November 2017
@robotgradient
Julen Urain
26 days
This was very challenging and very cool to see evolve! I personally was not sure if it would work, but @irmakkguzey pushed so hard to show it does. Learning dexterous robot policies with only human video data, using the egocentric view from Aria2 glasses, chill and easy 😁
@irmakkguzey
Irmak Guzey
26 days
Dexterous manipulation by directly observing humans - a dream in AI for decades - is hard due to visual and embodiment gaps. With simple yet powerful hardware - Aria 2 glasses 👓 - and our new work AINA 🪞, we are now one significant step closer to achieving this dream.
Replies 0 · Reposts 0 · Likes 7
@bingyikang
Bingyi Kang
1 month
After a year of team work, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3
Replies 80 · Reposts 504 · Likes 4K
@robotgradient
Julen Urain
2 months
The expert mode is going to bring a lot of news in the future 🙃
@1x_tech
1X
2 months
NEO, The Home Robot. Order Today.
Replies 0 · Reposts 0 · Likes 6
@maxseitzer
Max Seitzer
4 months
Introducing DINOv3 🦕🦕🦕 A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
Replies 12 · Reposts 140 · Likes 1K
@robotgradient
Julen Urain
7 months
Anyone interested in tactile sensing for robotics should be following Akash's solid releases. How should we integrate the rich tactile sensing modality into policy learning?
@akashshrm02
Akash Sharma
7 months
Robots need touch for human-like hands to reach the goal of general manipulation. However, approaches today don’t use tactile sensing or use specific architectures per tactile task. Can 1 model improve many tactile tasks? 🌟Introducing Sparsh-skin: https://t.co/DgTq9OPMap 1/6
Replies 1 · Reposts 3 · Likes 11
@mukadammh
Mustafa Mukadam
1 year
Touch perception holds the key to unlocking robot dexterity. Our new @SciRobotics work shows how to fuse tactile & vision and track the pose+shape of novel objects during dexterous manipulation https://t.co/TNlaBjB6Ra It's a culmination of our work over the last 4 years, see @Suddhus 🧵⬇
@Suddhus
Sudharshan Suresh
1 year
For robot dexterity, a missing piece is general, robust perception. Our new @SciRobotics work combines multimodal sensing with neural representations to perceive novel objects in-hand. 🎲 Featured on the cover of the November issue! #ScienceRoboticsResearch 🧵1/9
Replies 1 · Reposts 12 · Likes 81
@robotgradient
Julen Urain
1 year
On the other hand, SE(3) Flow Matching is a preferred option over SE(3) Diffusion. It is deterministic and is built on straight paths over the geodesics, leading to much simpler generation/optimization paths.
Replies 1 · Reposts 0 · Likes 0
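A minimal sketch of what "straight paths over the geodesics" means for rotations, assuming only numpy and scipy (the function names are illustrative, not the paper's code): the conditional path from a noise rotation R0 to a data rotation R1 follows the SO(3) geodesic, so the velocity-field regression target is constant.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def geodesic_interp(R0, R1, t):
    """Point at time t on the SO(3) geodesic from R0 to R1 (scipy Rotations)."""
    rel = (R0.inv() * R1).as_rotvec()   # log map of the relative rotation
    return R0 * R.from_rotvec(t * rel)  # exp map back onto the manifold

def target_velocity(R0, R1):
    """Constant body-frame velocity transporting R0 to R1 in unit time."""
    return (R0.inv() * R1).as_rotvec()

# One flow-matching training pair: sample t ~ U[0, 1], then regress the
# model's velocity prediction at R_t against the constant target velocity.
R0, R1 = R.random(), R.random()
t = np.random.rand()
R_t = geodesic_interp(R0, R1, t)
v_target = target_velocity(R0, R1)
```

Because the target velocity does not depend on t, generation amounts to integrating a near-straight path, which is where the simpler generation/optimization comes from.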
@robotgradient
Julen Urain
1 year
In our interpretation, Invariant networks should be preferred over Equivariant networks as long as you can achieve the same results (i.e., equivariant action generation). Imposing invariance on your network is simpler and allows using most of the common functions (ReLU, Linear...)
Replies 1 · Reposts 0 · Likes 0
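A hypothetical numpy sketch of that invariance trick (all names illustrative): express the observation in the end-effector's local frame, run any ordinary network on it, and map the predicted action back to the world frame. If the whole scene is rigidly transformed, the local-frame input is unchanged, so the world-frame action transforms along with the scene: equivariant action generation from an invariant network.

```python
import numpy as np

def to_local(ee_pos, ee_rot, points):
    """World-frame points -> end-effector frame (an SE(3)-invariant input)."""
    return (points - ee_pos) @ ee_rot  # per-row R^T (p - t)

def to_world(ee_pos, ee_rot, local_action):
    """Local-frame position action -> world frame."""
    return ee_rot @ local_action + ee_pos

def equivariant_policy(net, ee_pos, ee_rot, points):
    """net can be any plain MLP (ReLU, Linear...) over the invariant features."""
    local_obs = to_local(ee_pos, ee_rot, points).ravel()
    return to_world(ee_pos, ee_rot, net(local_obs))
```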
@robotgradient
Julen Urain
1 year
When we were working on SE(3)-DiffusionFields, we were not aware of how far this could be extended. Our new work shows that: - SE(3) Flow Matching is a simple yet powerful alternative to SE(3) Diffusion for robotics. - We can use an Invariant network for equivariant action generation.
@n_w_funk
Niklas Funk
1 year
I am excited to share our recent work on "ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching". The work presents a novel policy class combining Flow Matching with SE(3) Invariant Transformers for fast, equivariant, and expressive
Replies 1 · Reposts 9 · Likes 57
@SnehalJauhri
Snehal Jauhri
1 year
Perfect start to the #CoRL2024 week! Was a pleasure organizing the NextGen Robot Learning Symposium at @TUDarmstadt with @firasalhafez @GeorgiaChal Thanks to the speakers for the great talks! @YunzhuLiYZ @NimaFazeli7 @Dian_Wang_ @HaojieHuang13 @Vikashplus @ehsanik @Oliver_Kroemer
Replies 1 · Reposts 7 · Likes 61
@robotgradient
Julen Urain
1 year
Okay, not super impressive! BUT, this plot gives us hope! We found that the agent keeps improving on unseen songs with more and more demonstrations, so we are hopeful that, given the large datasets available on the Internet, our agent will keep improving in the future 🎹
Replies 0 · Reposts 0 · Likes 6
@robotgradient
Julen Urain
1 year
TASK 2: take all these RL policies and distill them (BC) into a Diffusion Policy. We use a hierarchical policy: the top-layer policy outputs desired fingertip motions, while the bottom-layer policy generates the configuration-space actions. Performance on unseen songs 👇
Replies 2 · Reposts 0 · Likes 8
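A schematic sketch of that two-level structure, with illustrative names and plain MLPs standing in for the diffusion heads: the top policy proposes fingertip targets, the bottom policy turns them into configuration-space actions, and both would be trained by behavior cloning on (observation, action) pairs collected from the RL experts.

```python
import torch
import torch.nn as nn

class FingertipPolicy(nn.Module):
    """Top layer: observation -> desired fingertip positions (n_fingers x 3)."""
    def __init__(self, obs_dim, n_fingers=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_fingers * 3))

    def forward(self, obs):
        return self.net(obs)

class ConfigSpacePolicy(nn.Module):
    """Bottom layer: (observation, fingertip targets) -> joint-space action."""
    def __init__(self, obs_dim, act_dim, n_fingers=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + n_fingers * 3, 256),
                                 nn.ReLU(), nn.Linear(256, act_dim))

    def forward(self, obs, fingertips):
        return self.net(torch.cat([obs, fingertips], dim=-1))
```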
@robotgradient
Julen Urain
1 year
We train individual single-song RL policies, which we leverage to generate the desired observation-action pairs! Observe that using fingertip demonstrations leads to human-like motions, while their absence leads to policies behaving in unexpected styles.
Replies 1 · Reposts 0 · Likes 8
@robotgradient
Julen Urain
1 year
We first extract from the videos both the FINGERTIP motion trajectories and a TASK trajectory (the song), which specifies the task to be solved. Then, we apply residual RL, using the fingertip motion as the nominal behaviour and the task trajectory as the reward.
Replies 1 · Reposts 0 · Likes 5
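A toy sketch of the residual-RL setup described above (all names are illustrative assumptions): the executed command is the video-derived nominal fingertip motion plus a learned correction, and the reward scores how well the pressed keys match the song at each step.

```python
import numpy as np

def executed_action(nominal, residual_policy, obs):
    """Residual RL: nominal behaviour from the video + learned correction."""
    return nominal + residual_policy(obs)

def song_reward(pressed_keys, target_keys):
    """Task-trajectory reward: fraction of keys matching the song at this step."""
    pressed, target = np.asarray(pressed_keys), np.asarray(target_keys)
    return float((pressed == target).mean())
```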
@robotgradient
Julen Urain
1 year
So, what is the challenge? Unlike teleoperation data, video data lacks the action info needed to determine what control signals should be applied to the robot to accomplish the observed task. TASK 1: Infer the actions the robot should take to match the videos. How? RL + IL
Replies 1 · Reposts 0 · Likes 5
@robotgradient
Julen Urain
1 year
@ChengQian0112, in collaboration with @kevin_zakka and @Jan_R_Peters, introduces a simple framework to learn highly dexterous manipulation skills from videos. Given the large amount of video data, we can learn a generalist policy that can play ANY song.
Replies 1 · Reposts 0 · Likes 5
@robotgradient
Julen Urain
1 year
YouTube is a LARGE dataset of demonstration videos to train Generalist robot agents, but lacks action data. How can we learn DEXTEROUS skills from them? In #CoRL2024, we explore the problem of learning a Generalist Piano Playing agent from YouTube videos. https://t.co/nRRy3hdqkL
Replies 6 · Reposts 43 · Likes 316
@robotgradient
Julen Urain
1 year
Generalization is an essential property for making robots perform well in novel, out-of-distribution contexts. How can we improve the generalization of our models? We explore strategies such as composition, extracting meaningful features, and grounding observations and actions.
Replies 0 · Reposts 1 · Likes 4
@robotgradient
Julen Urain
1 year
Should you generate configuration space actions or task space actions? Should you generate trajectories or keyposes? Is it better to generate position actions or velocity actions? We explore different action representation modalities and highlight when to use each.
Replies 1 · Reposts 0 · Likes 4
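An illustrative sketch of one axis of that design space, position vs. velocity actions (the controller interface here is an assumption, not from the paper): a position action is the next configuration itself, while a velocity action must be integrated by the controller, which changes how errors accumulate over a rollout.

```python
import numpy as np

def apply_position_action(q, q_cmd):
    """Position action: the command *is* the next configuration."""
    return q_cmd

def apply_velocity_action(q, q_dot_cmd, dt=0.05):
    """Velocity action: integrate the command to get the next configuration."""
    return q + dt * q_dot_cmd
```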
@robotgradient
Julen Urain
1 year
In terms of generative models, we explore Diffusion Models, Energy-Based Models, Action Value Maps, and GPT-style models, to name a few. We compare the benefits and pitfalls of each generative model and suggest situations in which each would perform best.
Replies 1 · Reposts 0 · Likes 3