Emanuele Bugliarello
@ebugliarello
Followers: 1K
Following: 1K
Media: 62
Statuses: 218
Multimodal researcher @GoogleDeepMind
Grenoble, France
Joined August 2019
Dynamic CFG: adaptive guidance for diffusion models
• Static CFG = “one-size-fits-all” fails across prompts
• New method: online feedback from latent evaluators (CLIP, fidelity, prefs) → dynamic per-step CFG
• Just +1% overhead, big gains in alignment, quality & text
1
30
146
Preprint is out: we solve the CFG conundrum! Simple or out-of-distribution prompts benefit from unconditional generation, but challenging ones require dialing up the guidance strength. There's no need to rely on empirical observations, though. We introduce dynamic CFG via online feedback👇
Dynamic CFG: adaptive guidance for diffusion models
• Static CFG = “one-size-fits-all” fails across prompts
• New method: online feedback from latent evaluators (CLIP, fidelity, prefs) → dynamic per-step CFG
• Just +1% overhead, big gains in alignment, quality & text
2
6
29
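The core idea, a per-step guidance scale driven by an online evaluator, can be sketched roughly as below. This is a minimal illustration only: the diffusers-style unet/scheduler interfaces, the evaluator, and the scale-update rule are assumptions for exposition, not the paper's exact algorithm.

```python
# Illustrative sketch of per-step ("dynamic") classifier-free guidance driven by
# online feedback from a latent evaluator. Interfaces and update rule are assumed.
import torch

def dynamic_cfg_sample(unet, scheduler, evaluator, cond, uncond, latents,
                       base_scale=7.5, min_scale=1.0, max_scale=15.0, step_size=0.5):
    scale = base_scale
    for t in scheduler.timesteps:
        with torch.no_grad():
            eps_c = unet(latents, t, encoder_hidden_states=cond).sample    # conditional
            eps_u = unet(latents, t, encoder_hidden_states=uncond).sample  # unconditional

        # Standard CFG combination, but with a guidance scale that changes every step.
        eps = eps_u + scale * (eps_c - eps_u)
        out = scheduler.step(eps, t, latents)
        latents = out.prev_sample

        # Online feedback: score the current denoised estimate (e.g. CLIP alignment
        # with the prompt, assumed normalised to [0, 1]) and nudge the guidance
        # strength: well-aligned samples get less guidance, poorly aligned ones more.
        score = float(evaluator(out.pred_original_sample))
        scale = max(min_scale, min(max_scale, scale + step_size * (0.5 - score)))
    return latents
```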
TL;DR: A new benchmark (WYD)
🤹 Larger & more diverse than existing ones
🔎 With fine-grained & meticulous annotations by yours truly & metrics
🎯 For video-level and human-targeted evaluations
🫂 That correlate well with human preferences
⇒ many new measurable challenges! 🎳
0
0
1
Frustrated with trying to animate characters with video generation models? Do you end up muttering "What are you doing?" We do too. So, we made a new benchmark (WYD) to push controllable human generation for real-world settings! 📄 https://t.co/lPTJGB03bp 🧑‍💻 https://t.co/btS73xzdBZ
1
1
8
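To make "metrics that correlate well with human preferences" concrete, here is a minimal sketch of that kind of meta-evaluation with placeholder numbers; it is not WYD's actual data or metric.

```python
# Placeholder sketch: rank-correlating an automatic video-level metric with human
# preference ratings, the kind of check used to validate a benchmark's metrics.
from scipy.stats import spearmanr

# Hypothetical per-video scores for one model on the benchmark.
auto_metric = [0.81, 0.44, 0.67, 0.90, 0.35]   # automatic metric (placeholder values)
human_pref  = [4.5,  2.0,  3.5,  5.0,  1.5]    # mean human rating (placeholder values)

rho, pval = spearmanr(auto_metric, human_pref)
print(f"Spearman correlation with human preferences: {rho:.2f} (p={pval:.3f})")
```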
📢#ACL2025NLP This year we received 8276 submissions 👏 which is the highest number in the history of ACL conferences 🙌 If you are not yet involved as a reviewer, AC or SAC, we would encourage you to volunteer as an (emergency) AC or reviewer https://t.co/UhPTpK7hq6 🙏
docs.google.com
Use this form to volunteer to join the ACL 2025 program committee as an (emergency) reviewer or area chair (AC). The reviewers need to be available in March and early April 2025. ACs need to be...
6
42
155
📢 Have you been wondering what workshops are brewing in the *ACL venues in 2025? The list that we've been waiting for is here. Feel free to tag or repost with the organisers. Below are ACL 2025 workshops: #ACL2025NLP #NLProc #workshop 🧵
2
22
67
🚀🚀PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes. 1/7
5
54
260
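A small sketch of what that resolution × size cross product looks like in code, assuming the Hugging Face naming pattern google/paligemma2-<size>-pt-<resolution> and the transformers PaliGemma classes; check the official model cards for the exact identifiers.

```python
# Sketch: enumerating the {3B, 10B, 28B} x {224, 448, 896} grid of pre-trained
# checkpoints and loading one. Repo names follow the Hugging Face pattern
# (e.g. "google/paligemma2-3b-pt-224"); verify against the official model cards.
from itertools import product
from transformers import PaliGemmaForConditionalGeneration, AutoProcessor

sizes = ["3b", "10b", "28b"]
resolutions = [224, 448, 896]
checkpoints = [f"google/paligemma2-{s}-pt-{r}" for s, r in product(sizes, resolutions)]
print(checkpoints)  # 9 pre-trained variants

model_id = checkpoints[0]  # "google/paligemma2-3b-pt-224"
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)
```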
Want to work on the future of multimodal AI? Our Google DeepMind team in Grenoble, led by @CordeliaSchmid, is hiring interns for multimodal AI research (long-video understanding and visual reasoning in 2D and 3D). Email ai.gnb.hiring@gmail.com or find me at #NeurIPS2024!
5
17
184
📢#ACL2025 is inviting nominations and self-nominations to the ACL 2025 programme committee (reviewer or area chair) ➡️ https://t.co/YWQikbGZIv deadline for nominations 🗓️ 16 Dec 2024. 🙏
docs.google.com
Use this form to express your interest in joining the ACL 2025 programme committee as a reviewer or area chair (AC). The review period is 1st to 20th of March 2025. ACs need to be available for...
0
31
88
Our team at Google DeepMind is seeking a Research Scientist with a strong publication record (multiple first-author papers) on multi-modal LLMs in top ML venues like NeurIPS, ICLR, CVPR. Email me at af_hiring@google.com @CordeliaSchmid
4
49
383
Embrace cultural diversity in your large-scale data! 🌎🌍🌏 @angelinepouget’s study shows that (quantitatively) you have no reason not to 🌸
PSA: Stop pretraining your VLMs on EN-filtered data, even if it improves ImageNet and COCO‼️ Doing so impairs the model's understanding of non-English cultures❗️ I've argued this for years; now we finally publish concrete results for this (imo) intuitively obvious recommendation. A 🧾🧶
1
1
7
PSA: Stop pretraining your VLMs on EN-filtered data, even if it improves ImageNet and COCO‼️ Doing so impairs the model's understanding of non-English cultures❗️ I've argued this for years; now we finally publish concrete results for this (imo) intuitively obvious recommendation. A 🧾🧶
Want your VLM to reflect the world's rich diversity 🌍? We’re very excited to share our recent research on this topic. TLDR: to build truly inclusive models that work for everyone, don’t filter by English, and check out our recommended evaluation benchmarks. (1/7)
10
32
281
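To make the data-curation choice at issue concrete, a toy sketch of an English-only filter versus keeping the full multilingual pool is shown below; the langdetect call and record format are illustrative assumptions, not the paper's pipeline.

```python
# Toy sketch of the curation decision: English-only filtering vs. keeping all
# languages in a VLM pretraining pool. Records and the language-ID step are
# illustrative placeholders.
from langdetect import detect

pairs = [
    {"image": "img_001.jpg", "caption": "A bride in a white dress"},
    {"image": "img_002.jpg", "caption": "Una piñata en una fiesta de cumpleaños"},
    {"image": "img_003.jpg", "caption": "七五三を祝う着物姿の子ども"},
]

# EN-filtered pretraining set: tends to help EN-centric benchmarks (ImageNet, COCO)
# but discards exactly the examples that carry non-English cultural concepts.
en_only = [p for p in pairs if detect(p["caption"]) == "en"]

# Recommended alternative: keep the full multilingual pool so the model still
# sees culturally diverse data.
multilingual = pairs
print(len(en_only), "vs", len(multilingual), "examples kept")
```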
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh Overall:
☑️ Fine-grained human rating templates are more consistent with each other
🧑‍⚖️ Reliable prompts and fine-grained templates lead to consistent model ordering
🔝 To compare auto-eval metrics, reliable prompts better measure alignment
Check out our paper for more details!
0
0
3
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh We also propose Gecko, a new VQA+LLM metric that improves upon prior work by:
🔍 better coverage of visual words in QAs
🪣 filtering hallucinated QAs
🤷 accounting for the uncertainty in the VQA scores
Gecko obtains the best correlation across human templates on Gecko2K and TIFA160
1
0
3
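A conceptual sketch of such a VQA+LLM metric is below; llm_generate_qa, llm_checks_question, and vqa_answer are hypothetical helpers standing in for real model calls, and the aggregation is illustrative rather than the released Gecko implementation.

```python
# Conceptual sketch of a VQA+LLM alignment metric in the spirit of the tweet:
# generate QA pairs from the prompt, drop hallucinated questions, answer the rest
# with a VQA model on the generated image, and weight answers by VQA confidence.
# The three callables passed in are hypothetical stand-ins for real model calls.

def qa_alignment_score(prompt, image, llm_generate_qa, llm_checks_question, vqa_answer):
    qa_pairs = llm_generate_qa(prompt)                # cover visual words in the prompt
    qa_pairs = [qa for qa in qa_pairs
                if llm_checks_question(prompt, qa)]   # filter hallucinated questions

    total, weight = 0.0, 0.0
    for question, expected in qa_pairs:
        answer, confidence = vqa_answer(image, question)   # VQA answer + its confidence
        total += confidence * float(answer == expected)    # uncertainty-aware credit
        weight += confidence
    return total / weight if weight > 0 else 0.0
```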
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh This leads to 100K+ ratings, which allow us to:
⚖️ compare different templates → finer-grained templates (WL and DSG(H)) have better inter-annotator agreement
✨ define a subset of *reliable* prompts, Gecko2K-rel, where annotators agree across models and templates
1
0
3
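One way such a "reliable" subset could be selected is sketched below: keep a prompt only if, for every model, the ratings agree across rating templates. The data layout and threshold are assumptions for illustration, not the paper's exact criterion.

```python
# Illustrative selection of "reliable" prompts: a prompt is kept only if, for each
# model, the per-template mean ratings stay within a small spread (i.e. the rating
# templates agree). Data layout and threshold are assumptions for exposition.
import statistics

def reliable_prompts(ratings, max_spread=0.5):
    """ratings[prompt][model][template] -> list of human scores, normalised to [0, 1]."""
    reliable = []
    for prompt, per_model in ratings.items():
        spreads = []
        for per_template in per_model.values():
            means = [statistics.mean(scores) for scores in per_template.values()]
            spreads.append(max(means) - min(means))   # disagreement across templates
        if max(spreads) <= max_spread:
            reliable.append(prompt)
    return reliable
```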
@GoogleDeepMind @OliviaW47557022 @ChuhanZhang5 @isabela_alb @wangsu_gdm @yasumasa_onoe @CyrusRashtchian @jponttuset @aidanematzadeh We curate a new set of prompts, Gecko2K, that allows us to assess a variety of skills (e.g., counting, spatial, texture) in T2I models.
We then generate images with 4 T2I models and run human evaluation with 4 different rating templates (e.g., side-by-side comparison or Likert).
1
0
4
Check out Gecko 🦎: @GoogleDeepMind's latest work looking at how to evaluate text-to-image technology with:
📊 a new benchmark
🕵️ 100K+ human ratings of state-of-the-art T2I models
🤖 a better human-correlated auto-eval metric
https://t.co/CyB4YwgYjh
5
22
99
PaliGemma - Open Vision Model from Google! 💎
> 3B parameter model - SigLIP + Gemma 2B
> Supports images up to 896 x 896 resolution
> Capable of document understanding, image detection, visual question answering, captioning and more
> In addition to general purpose checkpoints
6
68
315
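Schematically, the composition named in the tweet (a SigLIP image encoder feeding, via a projection, into a Gemma decoder) could look like the toy module below; the dimensions and interfaces are placeholders, not Google's implementation.

```python
# Schematic of the composition described above: a SigLIP-style image encoder feeds
# image tokens, via a linear projection, into a Gemma-style decoder alongside the
# text prompt. Module classes and dimensions here are illustrative placeholders.
import torch
import torch.nn as nn

class PaliGemmaLikeVLM(nn.Module):
    def __init__(self, vision_encoder, language_model, vision_dim=1152, lm_dim=2048):
        super().__init__()
        self.vision_encoder = vision_encoder             # e.g. a SigLIP ViT
        self.projector = nn.Linear(vision_dim, lm_dim)   # map image features to LM space
        self.language_model = language_model             # e.g. a Gemma 2B decoder

    def forward(self, pixel_values, text_embeds):
        image_tokens = self.projector(self.vision_encoder(pixel_values))
        # Prefix the projected image tokens to the text embeddings and decode.
        inputs = torch.cat([image_tokens, text_embeds], dim=1)
        return self.language_model(inputs_embeds=inputs)
```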