Emanuele Bugliarello

@ebugliarello

Followers
1K
Following
1K
Media
62
Statuses
218

Multimodal researcher @GoogleDeepMind

Grenoble, France
Joined August 2019
@arankomatsuzaki
Aran Komatsuzaki
2 months
Dynamic CFG: adaptive guidance for diffusion models • Static CFG = “one-size-fits-all” fails across prompts • New method: online feedback from latent evaluators (CLIP, fidelity, prefs) → dynamic per-step CFG • Just +1% overhead, big gains in alignment, quality & text
1
30
146
@pinelopip3
Nelly Papalampidi
2 months
Preprint is out: we solve the CFG conundrum! Simple or out-of-distribution prompts benefit from unconditional generation, but challenging ones require dialing up the guidance strength. But there's no need to rely on empirical observations: we introduce dynamic CFG via online feedback👇
2
6
29
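The thread's idea of per-step guidance driven by online feedback can be sketched minimally. This is a toy illustration, not the paper's method: `cfg_step` is the standard classifier-free guidance blend, and `dynamic_cfg_weight` nudges the guidance weight each denoising step based on a feedback score (e.g. a CLIP-style alignment estimate on the current latent); the update rule, target, learning rate, and bounds are all hypothetical.

```python
import numpy as np

def cfg_step(eps_uncond, eps_cond, w):
    """Standard classifier-free guidance: extrapolate from the
    unconditional toward the conditional noise prediction by weight w."""
    return eps_uncond + w * (eps_cond - eps_uncond)

def dynamic_cfg_weight(w, feedback, target=0.8, lr=0.5, w_min=1.0, w_max=12.0):
    """Toy online update: raise w when the feedback score on the current
    latent falls short of a target, lower it when it overshoots.
    All constants here are illustrative, not from the paper."""
    return float(np.clip(w + lr * (target - feedback), w_min, w_max))
```

At each denoising step one would score the intermediate latent with the evaluators, update `w`, then apply `cfg_step` with the new weight — which is roughly where the quoted "+1% overhead" would come from.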
@ebugliarello
Emanuele Bugliarello
6 months
You gotta check out this giraffe's flow! 🎤 #veo3
12
33
487
@ebugliarello
Emanuele Bugliarello
8 months
TL;DR: A new benchmark (WYD) 🤹 Larger & more diverse than existing ones 🔎 With fine-grained & meticulous annotations by yours truly & metrics 🎯 For video-level and human-targeted evaluations 🫂 That correlate well with human preferences ⇒ many new measurable challenges! 🎳
0
0
1
@ebugliarello
Emanuele Bugliarello
8 months
Frustrated with trying to animate characters with video generation models? And end up muttering "What are you doing?" We too. So, we made a new benchmark (WYD) to push controllable human generation for real-world settings! 📄 https://t.co/lPTJGB03bp 🧑‍💻 https://t.co/btS73xzdBZ
1
1
8
@aclmeeting
ACL 2025
8 months
📢#ACL2025NLP This year we received 8276 submissions 👏 which is the highest number in the history of ACL conferences 🙌 If you are not yet involved as a reviewer, AC or SAC, we would encourage you to volunteer as an (emergency) AC or reviewer https://t.co/UhPTpK7hq6 🙏
docs.google.com
Use this form to volunteer to join the ACL 2025 program committee as an (emergency) reviewer or area chair (AC). The reviewers need to be available in March and early April 2025. ACs need to be...
6
42
155
@aclmeeting
ACL 2025
10 months
📢 Have you been wondering what workshops are brewing in the *ACL venues in 2025? The list we've been waiting for is here. Feel free to tag or repost with the organisers. Below are ACL 2025 workshops: #ACL2025NLP #NLProc #workshop 🧵
2
22
67
@AndreasPSteiner
Andreas Steiner
11 months
🚀🚀PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes. 1/7
5
54
260
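The "full cross product" of checkpoints in the tweet is simply every resolution paired with every model size, nine variants in total. A quick sketch (the naming scheme below is illustrative, not the official checkpoint names):

```python
from itertools import product

resolutions = ["224px", "448px", "896px"]
sizes = ["3B", "10B", "28B"]

# Nine pre-trained variants: every resolution crossed with every size.
checkpoints = [f"paligemma2-{size}-{res}" for res, size in product(resolutions, sizes)]
```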
@ahmetius
Ahmet Iscen
11 months
Want to work on the future of multimodal AI? Our Google DeepMind team in Grenoble, led by @CordeliaSchmid, is hiring interns for multimodal AI research (long-video understanding and visual reasoning in 2D and 3D). Email ai.gnb.hiring@gmail.com or find me at #NeurIPS2024!
5
17
184
@aclmeeting
ACL 2025
11 months
📢#ACL2025 is inviting nominations and self-nominations to the ACL 2025 programme committee (reviewers or area chair) ➡️ https://t.co/YWQikbGZIv deadline for nominations 🗓️ 16 Dec 2024. 🙏
docs.google.com
Use this form to express your interest in joining the ACL 2025 programme committee as a reviewer or area chair (AC). The review period is 1st to 20th of March 2025. ACs need to be available for...
0
31
88
@alirezafathi
Alireza Fathi
1 year
Our team at Google DeepMind is seeking a Research Scientist with a strong publication record (multiple first-author papers) on multi-modal LLMs in top ML venues like NeurIPS, ICLR, CVPR. Email me at af_hiring@google.com @CordeliaSchmid
4
49
383
@Cardiff_NLP
Cardiff NLP
1 year
Day 2 starts in a few hours, let's go! #cardiffnlpworkshop
0
11
14
@ebugliarello
Emanuele Bugliarello
1 year
Embrace cultural diversity in your large-scale data! 🌎🌍🌏 @angelinepouget’s study shows that (quantitatively) you have no reason not to 🌸
@giffmana
Lucas Beyer (bl16)
1 year
PSA: Stop pretraining your VLMs on EN-filtered data, even if it improves ImageNet and COCO‼️ Doing so impairs the model's understanding of non-English cultures❗️ I argued for years, now finally publish concrete results for this (imo) intuitively obvious recommendation A🧾🧶
1
1
7
@ibomohsin
Ibrahim Alabdulmohsin | إبراهيم العبدالمحسن
1 year
Want your VLM to reflect the world's rich diversity 🌍? We’re very excited to share our recent research on this topic. TLDR: to build truly inclusive models that work for everyone, don’t filter by English, and check out our recommended evaluation benchmarks. (1/7)
10
32
281
@ebugliarello
Emanuele Bugliarello
1 year
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh Overall: ☑️ Fine-grained human rating templates are more consistent with each other 🧑‍⚖️ Reliable prompts and fine-grained templates lead to consistent model ordering 🔝 To compare auto-eval metrics, reliable prompts better measure alignment Check out our paper for more details!
0
0
3
@ebugliarello
Emanuele Bugliarello
1 year
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh We also propose Gecko, a new VQA+LLM metric that improves upon prior work by: 🔍 better coverage of visual words in QAs 🪣 filtering hallucinated QAs 🤷 accounting for the uncertainty in the VQA scores Gecko obtains best correlation across human templates on Gecko2K and TIFA160
1
0
3
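The three ingredients listed in this tweet (QA coverage of visual words, filtering hallucinated QAs, keeping the VQA uncertainty) suggest an aggregation along these lines. This is a hypothetical sketch of the general recipe, not Gecko's actual formula:

```python
def qa_based_score(qa_items):
    """Score one generated image from question-answer checks.
    Each item is (p_correct, grounded): p_correct is the VQA model's
    probability of the expected answer (keeping the soft probability
    rather than thresholding to yes/no accounts for VQA uncertainty),
    and grounded=False flags a hallucinated QA pair to be filtered out.
    Structure and names are illustrative, not the paper's."""
    kept = [p for p, grounded in qa_items if grounded]
    return sum(kept) / len(kept) if kept else 0.0
```

Under this sketch, an image whose grounded questions score (0.9, 0.5) averages to 0.7 regardless of any ungrounded pairs dropped along the way.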
@ebugliarello
Emanuele Bugliarello
1 year
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh This leads to 100K+ ratings, which allow us to: ⚖️ compare different templates → finer-grained templates (WL and DSG(H)) have better inter-annotator agreement ✨ define a subset of *reliable* prompts, Gecko2K-rel, where annotators agree across models and templates
1
0
3
@ebugliarello
Emanuele Bugliarello
1 year
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh We curate a new set of prompts, Gecko2K, that allows us to assess a variety of skills (e.g. counting, spatial, texture) in T2I models. We then generate images with 4 T2I models, and run human evaluation with 4 different rating templates (e.g. side-by-side comparison, or Likert)
1
0
4
@ebugliarello
Emanuele Bugliarello
1 year
Check out Gecko 🦎: @GoogleDeepMind's latest work looking at how to evaluate text-to-image technology with: 📊 a new benchmark 🕵️ 100K+ human ratings of state-of-the-art T2I models 🤖 a better human-correlated auto-eval metric https://t.co/CyB4YwgYjh
5
22
99
@reach_vb
Vaibhav (VB) Srivastav
1 year
PaliGemma - Open Vision Model from Google! 💎 > 3B parameter model - SigLIP + Gemma 2B > Supports images up to 896 x 896 resolution > Capable of document understanding, image detection, visual question answering, captioning and more > In addition to general purpose checkpoints
6
68
315