
Prior @ AI2 (@Ai2Prior)
Tackling the boldest computer vision problems @allen_ai
Seattle, WA · Joined November 2024
Followers: 195 · Following: 190 · Media: 0 · Statuses: 47
RT @IlirAliu_: First fully open Action Reasoning Model (ARM); can "think" in 3D & turn your instructions into real-world actions: [Bookm…
0 · 75 · 0
RT @RanjayKrishna: Most AI models still think in words. People, without even noticing, think with our bodies, planning how to move, grasp,…
0 · 12 · 0
RT @chris_j_paxton: This to me really feels like how robot foundation models "should" work. I like that it can autoregressively predict dep…
0 · 17 · 0
RT @DJiafei: It's incredible to have both your advisors at the same company! With @fox_dieter17849 building the Robotics team, and @RanjayK…
0 · 6 · 0
RT @YiruHelenWang: Tired of binary pass/fail metrics that miss the bigger picture? Introducing #RoboEval, an open benchmark that shows…
0 · 35 · 0
RT @RenZhongzheng: Excited to share that I'll be joining the CS Department at UNC-Chapel Hill (@unccs @unc_ai_group) as an Assistant Prof…
0 · 14 · 0
RT @anikembhavi: Our Molmo work won Best Paper Honorable Mention at #CVPR2025! This large project was one of my best experiences with a fa…
0 · 8 · 0
RT @RanjayKrishna: I am doing something silly by testing whether I can remember and deliver multiple talks on the same day on different sli…
0 · 8 · 0
Building on our work with Molmo, we're excited to introduce GraspMolmo, a vision-language model that predicts semantically meaningful grasps conditioned on natural language. A fantastic effort led by our PYI, @ab_deshpande!
How should a robot hold a water bottle? That depends: is it opening it, or passing it to you? I'm excited to introduce GraspMolmo, a VLM that predicts semantically appropriate grasps based on your command! Website: Thread ↓
0 · 1 · 8
Let us know how good Molmo is at language-guided pointing. Vote here:
Point-Battle is now live! Vote or submit your multimodal model and see how it stacks up in language-guided pointing and grounded visual reasoning; let the community decide which MLLM really hits the mark. We will also open-source all data for training MLLMs for pointing later on.
0 · 0 · 1
Great to see Molmo leading on pointing.
Pointing is our first "language": babies master it before words. Precise spatial grounding powers robotics, assistive tech, HCI, and vision-language interfaces. But can today's MLLMs point with pixel-level accuracy and truly ground visual reasoning? We introduce PointArena,
0 · 2 · 4