
Meera Hahn
@MeeraHahn
Followers: 288
Following: 140
Media: 2
Statuses: 34
Research Scientist @GoogleAI. PhD in Computer Science @GeorgiaTech. Undergrad @EmoryUniversity.
Atlanta, GA
Joined November 2019
Exciting new work from @sihyun_yu and our team at Google DeepMind! Memory-Augmented Latent Transformers (MALT) Diffusion, a new diffusion model specialized for long video generation! https://t.co/gcDZr5mVbf
arxiv.org
Diffusion models are successful for synthesizing high-quality videos but are limited to generating short clips (e.g., 2-10 seconds). Synthesizing sustained footage (e.g. over minutes) still...
3
17
111
Check out our tech report on proactive T2I agents that ask clarification questions to reduce uncertainty! With this agent, we obtain 2 times higher VQAScore in just 5 turns!🤯 We've open-sourced our agent code powered by #Gemini @GoogleDeepMind! 🚀 Code: https://t.co/67dTGDQpZI
Tired of endless prompt tweaking? We've released a tech report on proactive text-to-image agents powered by #Gemini @GoogleDeepMind! Our agents ask clarifying questions and use belief graphs to understand what you really want. https://t.co/jRwjxtqALx
https://t.co/XFVvva7r96
2
22
67
Many thanks to @MeeraHahn , Wenjun Zeng, Nithish Kannen, Rich Galt, Kartikeya Badola, @_beenkim for making this happen! Stay tuned for code release!
0
1
5
Tired of endless prompt tweaking? We've released a tech report on proactive text-to-image agents powered by #Gemini @GoogleDeepMind! Our agents ask clarifying questions and use belief graphs to understand what you really want. https://t.co/jRwjxtqALx
https://t.co/XFVvva7r96
1
5
17
6/ Finally, our model can be used to generate videos with consistent 3D camera motion.
3
17
93
2/ website: https://t.co/atH5wzRudu Our approach has two key design decisions. First, we use a causal encoder to compress images and videos in a shared latent space.
2
15
71
We introduce W.A.L.T, a diffusion model for photorealistic video generation. Our model is a transformer trained on image and video generation in a shared latent space. 🧵👇
51
250
1K
If you’re having trouble keeping up with Video AI😅, there have been 5 state-of-the-art generative video models released *in the last 7 days*: 🤯😎🧵
61
667
7K
Have you ever wondered about emergent intelligence in robotic agents? This work shows interesting emergent intelligence and behaviors in blind navigation agents! Blind agents learn maps as they navigate. This allows them to navigate as successfully as an agent with vision.
How do 'map-less' agents navigate? They learn to build implicit maps of their environment in their hidden state! We study 'blind' AI navigation agents and find the following 🧵
0
1
15
How can we fill in missing pulsative sensor data? Prior state-of-the-art fails in our novel setting, despite its well-defined temporal structure. Check out our #NeurIPS2022 paper, PulseImpute, @ 4 pm CST! arxiv: https://t.co/Hbv2x7ZvkP github: https://t.co/bTManuLyEH
0
11
17
i trained an ai chatbot on my childhood journal entries - so that i could engage in real-time dialogue with my "inner child" some reflections below:
567
6K
46K
Dense self-supervised learning from multiple 3D viewpoints → dense feature representations that generalize both to novel object instances and to novel categories of instances. Check out our #NeurIPS2022 paper! arxiv: https://t.co/rxzdrScII1 github: https://t.co/5n98W7Wykt
2
20
68
We model indoor environments using FPV panoramic navigation graphs and introduce a visiolinguistic transformer model, LED-Bert, which scores the alignment between navigation graph nodes and dialogs and achieves SOTA performance on the LED task!
0
0
1
✨Transformer-based Localization from Embodied Dialog with Large-scale Pre-training✨ has been accepted as an oral at @aaclmeeting! https://t.co/L7SSGfqk0w w/ @RehgJim
1
5
19
Today, along with my collaborators at @GoogleAI, we announce DreamBooth! It allows a user to generate a subject of choice (pet, object, etc.) in myriad contexts and with text-guided semantic variations! The options are endless. (Thread 👇) webpage: https://t.co/EDpIyalqiK 1/N
42
403
2K
A new year, a new shameless twitter plug: Check out our Toys4K 3D object dataset: 4K instances, 105 categories, 15+ instances per category. https://t.co/K85J007iru
0
12
62
Excited to present our #neurips work NRNS!
#NeurIPS21 paper on No RL No Simulation (NRNS): Learning to Navigate without Navigating! NRNS not only beats RL/IL algorithms in simulation but when it comes to real-world…there is no sim2real required! Webpage: https://t.co/MuDOM91LJP Code: https://t.co/WBXxyrr8uY (1/2)
0
3
9