David Romero

@Davidromogr

Followers
159
Following
17K
Media
29
Statuses
121

🇪🇨 Engineer - PhD Student @MBZUAI Computer Vision

Abu Dhabi
Joined August 2017
@Davidromogr
David Romero
1 month
Check out more results and our full paper: "Learning to Generate Object Interactions with Physics-Guided Video Diffusion" 🌐 https://t.co/cDxNlHuS1M 📜 https://t.co/cP2gkZxmpn Work done with Ariana Bermudez, @HaoLi81, Fabio Pizzati, and Ivan Laptev!
arxiv.org
Recent models for video generation have achieved remarkable progress and are now deployed in film, social media production, and advertising. Beyond their creative potential, such models also hold...
0
1
3
@Davidromogr
David Romero
1 month
• KineMask can also perform simple low-level motion 💨 control, moving different objects along different degrees of freedom and at different speeds.
1
0
0
@Davidromogr
David Romero
1 month
• Emergence of Causality 🔥: KineMask trained on object interactions and tested with three different velocities on a real scene. As velocity increases, the resulting interactions also change, indicating that the model captures the causal structure of motion, a valuable property
1
0
0
@Davidromogr
David Romero
1 month
• Unlike other works, KineMask can generate realistic interactions with other objects, showing an understanding of rigid body dynamics. • We also show complex interactions that require implicit 3D understanding, making a pot or a glass of juice fall and crash as a result of
1
0
0
@Davidromogr
David Romero
1 month
• We train a ControlNet using a low-level control signal, encoding the velocity as a mask 🎭, and providing a high-level textual control extracted from a VLM prediction. • At inference, KineMask infers high-level outcomes of object motion from a single frame 📷.
1
0
0
@Davidromogr
David Romero
1 month
• We enable object-based control with a novel training strategy. Paired with synthetic data, KineMask 🎭 enables pretrained diffusion models to synthesize realistic object interactions in real-world input scenes.
1
0
0
@Davidromogr
David Romero
1 month
🚀 Introducing KineMask 🎭, an approach for physics-guided video generation that enables realistic rigid body control and interactions. Given a single image and an object velocity ❗, KineMask generates videos with inferred motions and future object interactions!
1
0
2
@Davidromogr
David Romero
2 months
Great to see Sam Altman @sama at @mbzuai today !!!
0
0
1
@Davidromogr
David Romero
9 months
I hope that CVQA will motivate and push the frontier towards more advanced and diverse multimodal models. Check out our work and the oral presentation at NeurIPS: Paper: https://t.co/DaZ1qjEzuv Presentation: https://t.co/5eOryQPTd0 Dataset:
huggingface.co
@mbzuai
MBZUAI
9 months
Language is a cornerstone of cultural identity. Our mission to preserve cultural heritage drives our innovative CVQA dataset, led by Ph.D. student David Romero and his team. The CVQA dataset challenges models with 10K+ questions across 31 languages and covers 10 diverse
0
0
9
@thamar_solorio
thamar |
11 months
@Davidromogr did a fantastic job presenting!
0
1
8
@MSFTResearch
Microsoft Research
1 year
Pranjal Chitale discusses the 2024 NeurIPS work CVQA. Spanning 31 languages & the cultures of 30 countries, this VQA benchmark was created with native speakers & cultural experts to evaluate model performance across diverse linguistic & cultural contexts. https://t.co/N1DF79Em71
0
8
22
@pcastr
Pablo Samuel Castro
11 months
Ecuadorians at #NeurIPS2024! 🇪🇨🤖
1
2
50
@Davidromogr
David Romero
11 months
The man has spoken !!!
0
0
4
@jcblaisecruz
Blaise Cruz
1 year
The CVQA project is currently being presented by @Davidromogr at the East Meeting Room 1-3, do drop by if you’re around! Catch us as well at the poster sessions in the West Hall at 4:30 onwards :)
1
4
15
@satyanadella
Satya Nadella
1 year
Proud of the depth and breadth of contributions from @Microsoft researchers at this year's @NeurIPSConf!
@MSFTResearch
Microsoft Research
1 year
We’re excited to be a part of #NeurIPS2024! Explore the future of AI with over 100 groundbreaking papers, including oral and spotlight sessions, on reinforcement learning, advanced language model training, and multilingual, culturally inclusive benchmarks: https://t.co/QBDuYcDlT6
40
42
342
@Davidromogr
David Romero
1 year
Excited to attend #NeurIPS2024 to present CVQA (culturally relevant, human-written multilingual VQA), our recent work at @mbzuai! I'll be giving an oral session: 📍East Meeting Room 1-3 🗓️Thu 12 Dec 3:30 p.m. PST and the poster session: 📍West Ballroom A-D #5110 🗓️Thu 12 Dec
@Davidromogr
David Romero
1 year
Happy to introduce our latest work, CVQA, a culturally diverse multilingual Visual QA benchmark, covering a rich set of languages and cultures from 28 countries across 4 continents. Work done @mbzuai w/ @Chenyang_Lyu, @haryoaw, @thamar_solorio, @AlhamFikri and 70+ contributors
1
5
15
@AlhamFikri
Alham Fikri Aji
1 year
David (@Davidromogr), the first author of CVQA (culturally relevant, human-written multilingual VQA), is presenting our work at the @mbzuai student research showcase. Next week, he’ll present it in an oral session at @NeurIPSConf. This is one of the works I’m most proud of!
@AlhamFikri
Alham Fikri Aji
1 year
🎉Happy to share our recent collaborative effort on building a culturally diverse, multilingual visual QA dataset! CVQA consists of over 9,000 questions across 28 countries, covering 26 languages (with more to be added!) 🌐 https://t.co/2JzSqhyMmp 📜 https://t.co/UbtRGvFvJ2
0
7
36