Antoine Miech
@antoine77340
Followers: 1K · Following: 1K · Media: 27 · Statuses: 237
Ornithologist @GoogleDeepMind 🦩, Gemini Multimodal
Joined June 2010
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
828
3K
14K
How does Google's new agentic browser (Project Mariner) compare with ChatGPT Operator? I tested them head-to-head, using both platforms' suggested prompts (to make it fair!) 👇
24
160
2K
#Veo3 further blurs the lines between reality and imagination with audio, stronger text adherence, and richer visual details.
58
184
1K
We're barely 2 years from Will Smith eating spaghetti...
Say goodbye to the silent era of video generation: Introducing Veo 3 — with native audio generation. 🗣️ Quality is up from Veo 2, and now you can add dialogue between characters, sound effects and background noise. Veo 3 is available now in the @GeminiApp for Google AI Ultra
411
4K
92K
Thrilled to share our latest advances in video understanding 📽️: Gemini 2.5 Pro is a truly magical model to play with, excelling in traditional video analysis and unlocking new use cases I could not imagine a few months ago🪄 More in 🧵 and @Google blog:
developers.googleblog.com
11
51
381
Introducing YouTube video 🎥 link support in Google AI Studio and the Gemini API. You can now directly pass in a YouTube video and the model can use its native video understanding capabilities on it, with just a link! 🚢
291
374
3K
You can now paste YouTube links *directly* to use Gemini audio-video understanding on https://t.co/8ZC9dwbeXC 😀
4
13
155
some cool examples with Gemini 2.0 native image output 🧵
66
189
4K
Super excited to announce what I’ve been working on for the past few months 💃 GEMMA 3 is out today! It supports 140+ languages, has a context length of 128k tokens and the best part? It’s natively multimodal! 📸
10
26
347
Introducing Arena-Price Plot! 💰📊 An interactive plot of price vs. performance trade-offs for LLMs. Frontier efficiency models: 🔹 Gemini-2.0-Flash/Lite by @GoogleDeepMind 🔹 DeepSeek-R1 by @deepseek_ai 🔹 GPT-4o by @OpenAI 🔹 Yi-Lightning by @01AI_Yi 🔹 Ministral 8B by
62
137
795
Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥 We’re also releasing an improved version of our text-to-image model, Imagen 3 - available to use in ImageFX through
268
1K
7K
Gemini2 Flash on the challenge of what the internet has been asking for: breaking down "draw the rest of the owl" into actual steps with interleaved generation. not perfect yet, but it’s on the edge of something super cool...
18
66
508
Gemini can now browse the web just as you would in a web browser!
We are investing in the frontiers of agentic capabilities with a few early prototypes. Project Mariner is built with Gemini 2.0 and is able to understand and reason across information - pixels, text, code, images + forms - on your browser screen, and then uses that info to
0
5
26
Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet. We're first releasing an experimental version of 2.0 Flash ⚡ It has better performance, new multimodal output, @Google tool use - and paves the way for new agentic experiences. 🧵 https://t.co/ywY2oZv76p
73
424
2K
7 examples of Gemini's multimodal capabilities in action (with code and prompts) 🤯🧵
28
42
569
Woah, huge news again from Chatbot Arena🔥 @GoogleDeepMind’s just-released Gemini (Exp 1121) is back stronger (+20 points), tied #1🏅Overall with the latest GPT-4o-1120 in Arena! Ranking gains since Gemini-Exp-1114: - Overall #3 → #1 - Overall (StyleCtrl): #5 → #2 - Hard
Say hello to gemini-exp-1121! Our latest experimental gemini model, with: - significant gains on coding performance - stronger reasoning capabilities - improved visual understanding Available on Google AI Studio and the Gemini API right now:
49
149
949
LMSYS now has an image understanding track, and it is pretty addictive!
Exciting news - Chatbot Arena now supports image uploads📸 Challenge GPT-4o, Gemini, Claude, and LLaVA with your toughest questions. Plot to code, VQA, storytelling, you name it. Let's get creative and have fun! Leaderboard coming soon. Credits to builders @chrischou03
0
2
17
Ooh! 🎉 @tldraw got spatialized math reasoning working using Gemini 1.5 Flash. ✏️ ➗
0
3
10
Updated results: Gemini 1.5 Flash is rocking, outperforming GPT-4o as well! 🤘
@_TobiasLee Thank you for releasing such a high-quality benchmark; they tend to be quite rare for long-form video understanding! Would it be possible to add Gemini 1.5 Flash to the leaderboard as well? 🙏 It is faster and cheaper to run than 1.5 Pro 😊
2
14
111