Grace Luo @ ICCV 2025

@graceluo_

Followers 1K · Following 604 · Media 18 · Statuses 41

phd student @berkeley_ai, vision + language

Joined June 2021
Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
✨New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation*, at inference time. The result is flexible control: parameterize tasks as multimodal inputs, visually inspect the images with the VLM, and update the generator.🧵
23 · 176 · 1K

Grace Luo @ ICCV 2025 (@graceluo_) · 7 days
If you're at #ICCV2025, swing by our poster today! We'll show how you can directly fine-tune an image generator with a VLM's language modeling loss. 📍 https://t.co/bu6WrGtVrF 🔗 https://t.co/67P9uxWS1y cc @jongranskog @holynski_ @trevordarrell
1 · 5 · 43

Grace Luo @ ICCV 2025 (@graceluo_) · 9 days
Check out our new project at https://t.co/ApCyLKjo4F! Go and poke the visualizations on our website; you might even find a few easter eggs 🐣
echo-bench.github.io
Jiaxin Ge (@aomaru_21490) · 9 days
✨Introducing ECHO, the newest in-the-wild image generation benchmark! You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them! We distilled this qualitative discussion into a structured benchmark. 🔗 https://t.co/wJmmEY8TFQ
0 · 0 · 16

Grace Luo @ ICCV 2025 (@graceluo_) · 4 months
I'm presenting a poster at #ICML2025 today! Stop by if you want to learn how VLMs encode different representations of the same task (spoiler: they're the same). 🌐 https://t.co/Dm5PqGefkW 🔗 https://t.co/TCsFZ21Npa cc @_amirbar @trevordarrell
2 · 15 · 128

Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
(6/n) Special thanks to my co-authors @jongranskog, @holynski_, @trevordarrell! This was a great academic collaboration with @runwayml, especially @agermanidis, who was super supportive of our very experimental ideas throughout the entire process.
1 · 0 · 17

Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
(5/n) Research is non-linear! We started this project more than a year ago – first by training a diffusion captioner (a VLM that encodes images with diffusion hyperfeatures). We’re not working on that direction anymore, but here’s a peek at that first prototype:
1 · 0 · 24

Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
(4/n) Check out our paper + code! Our codebase lets you mix-and-match different off-the-shelf image generators and VLMs, and can run on Nvidia RTX 4090s. Page: https://t.co/67P9uxWS1y Paper: https://t.co/w5LRDO7lWg Code:
github.com
Official PyTorch Implementation for Dual-Process Image Generation, ICCV 2025 - g-luo/dual_process
1 · 2 · 34

Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
(3/n) You can get pretty creative, because VLMs afford a flexible interface. We play around with implementing spatial controls via *visual prompting*, where we overlay the control over the image and ask the VLM if they match, then optimize for an image that matches the control.
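The loop described above can be sketched in a few lines. Everything below is a toy stand-in: short lists play the role of pixels, and a quadratic score substitutes for the real VLM's yes/no judgment; none of the names or formulas come from the paper's actual implementation.

```python
# Toy sketch of spatial control via visual prompting: overlay the control
# on the image, score the match, and optimize the image toward the control.

def overlay(image, control, alpha=0.5):
    # Alpha-blend the control pattern onto the image to form the visual prompt.
    return [(1 - alpha) * p + alpha * c for p, c in zip(image, control)]

def vlm_match_score(image, control):
    # Stand-in for the VLM's judgment: approaches 1.0 as the image
    # follows the control, drops toward 0.0 as it deviates.
    err = sum((p - c) ** 2 for p, c in zip(image, control))
    return 1.0 / (1.0 + err)

def optimize_toward_control(image, control, lr=0.3, steps=50):
    # Gradient of the squared error pulls the image toward the control;
    # the real method instead backpropagates through the VLM's answer.
    for _ in range(steps):
        image = [p - lr * 2 * (p - c) for p, c in zip(image, control)]
    return image

img = [0.0, 0.5, 1.0]
ctrl = [1.0, 1.0, 0.0]
prompt = overlay(img, ctrl)               # what the VLM would inspect
img = optimize_toward_control(img, ctrl)  # image now matches the control
```

In the real system the "score" is the likelihood the VLM assigns to a "yes" answer, and the update is applied to the generator's weights rather than to raw pixels.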
2 · 0 · 19

Grace Luo @ ICCV 2025 (@graceluo_) · 5 months
(2/n) Our method re-uses the visual instruction tuning loss originally used to train the VLM, to instead optimize the weights of the image generator.
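A minimal sketch of that optimization, with numeric stand-ins for both models: the quadratic "VLM loss" and the finite-difference gradient below are placeholders for the actual language-modeling loss and backpropagation, and every name is illustrative.

```python
# Toy dual-process loop: a "generator" maps parameters to an image, a
# "VLM loss" scores the image against a target, and we update the
# generator's parameters by gradient descent on that loss.

def generate(params):
    # Stand-in generator: the "image" is just the parameter vector itself.
    return list(params)

def vlm_loss(image, target):
    # Stand-in for the VLM's language-modeling loss: low when the image
    # matches the target description (here, a target vector).
    return sum((x - t) ** 2 for x, t in zip(image, target))

def finite_diff_grad(params, target, eps=1e-5):
    # Numerical gradient of the loss w.r.t. generator parameters
    # (a real implementation backpropagates through both models).
    base = vlm_loss(generate(params), target)
    grads = []
    for i in range(len(params)):
        bumped = list(params)
        bumped[i] += eps
        grads.append((vlm_loss(generate(bumped), target) - base) / eps)
    return grads

def dual_process_update(params, target, lr=0.1, steps=100):
    for _ in range(steps):
        grads = finite_diff_grad(params, target)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

tuned = dual_process_update([0.0, 0.0, 0.0], [1.0, -0.5, 2.0])
```

The point of the sketch is the direction of the gradient flow: the loss is computed by the critic (the VLM), but the weights being updated belong to the generator.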
1 · 0 · 29

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(7/n) Check out our paper and website for more info! Website: https://t.co/b4mq6B1PlX Paper: https://t.co/hqbPcuCgDl Code:
0 · 0 · 4

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(6/n) The idea of a task vector is not new; we study its cross-modal properties following prior work in LLMs (e.g., https://t.co/gACnnwkCyI, https://t.co/fRIO6s3ZT4).
1 · 0 · 5

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(5/n) The most surprising result, at least to me, is that task vectors can be patched from the base LLM to its corresponding fine-tuned VLM. Here we patch from Mistral to Idefics2. This means that the VLM can re-purpose functions learned entirely in language onto image queries.
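The patching operation itself is mechanically simple; here is a toy two-layer version, with trivial arithmetic layers standing in for the Mistral and Idefics2 stacks (nothing below reflects the real models' computation).

```python
# Toy sketch of activation patching: run "model A" on a task prompt, cache
# its hidden state at one layer (the "task vector"), then splice that vector
# into "model B"'s forward pass at the same layer on a different query.

def layer1_a(x): return [v + 1.0 for v in x]   # stand-in base-LLM layer
def layer2_a(h): return [v * 2.0 for v in h]

def layer1_b(x): return [v - 1.0 for v in x]   # stand-in VLM layer
def layer2_b(h): return [v * 2.0 for v in h]

def run_b(x, patch=None):
    # If a cached task vector is supplied, it replaces B's own hidden state.
    h = layer1_b(x) if patch is None else patch
    return layer2_b(h)

task_vector = layer1_a([1.0, 2.0])          # hidden state cached from model A
patched = run_b([5.0, 4.0], patch=task_vector)
plain = run_b([5.0, 4.0])                   # same query, no patch
```

In practice the splice happens at a specific transformer layer's residual stream (e.g. via a forward hook), and the interesting result is that a vector extracted from text-only prompts still steers behavior on image queries.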
1 · 0 · 5

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(4/n) Motivated by this similarity in task representations, we explore mixing and matching the task specification and query format. We call this cross-modal patching.
1 · 0 · 2

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(3/n) We start by looking at how the model processes the final token to execute the task. Conditioned on either text or image ICL, the token undergoes three phases across model layers: it first resembles the colon input, then a meta-summary of the task, then the final answer.
1 · 0 · 1

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
(2/n) Our main finding is that task representations in VLMs are consistent across modality (text, image) and specification (example, instruction).
1 · 0 · 3

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
In a new preprint, we show that VLMs can perform cross-modal tasks, since text ICL 📚, instructions 📋, and image ICL 🖼️ are compressed into similar task representations. See “Task Vectors are Cross-Modal”, work w/ @trevordarrell, @_amirbar. https://t.co/b4mq6B1PlX
5 · 18 · 99

Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
Our Knowledge in Generative Models workshop #ECCV2024 is happening in a few hours! ⏰ Monday, Sept 30th, 2-6PM CEST 📍 Brown 2 (note location change from Brown 1) 🔗 https://t.co/3BWyDHaFNn
Anand Bhattad (@anand_bhattad) · 1 year
We are organizing a new workshop on "Knowledge in Generative Models" at #ECCV2024 to explore how generative models learn representations of the visual world and how we can use them for downstream applications. For the schedule and more details, visit our website: 🔗Website:
1 · 0 · 10
Grace Luo @ ICCV 2025 (@graceluo_) · 1 year
Come drop by our poster for🔮Readout Guidance at #CVPR2024 this Wednesday! 📍Arch 4A-E, Poster #332 📅Wed 19 June 5-6:30PM PST 🌐 https://t.co/GyV2tghSJk 🔗 https://t.co/EZSQeo2EeR w/ @trevordarrell, @oliver_wang2, @danbgoldman, @holynski_
0 · 4 · 17

Grace Luo @ ICCV 2025 (@graceluo_) · 2 years
Update on 🔮Readout Guidance ( https://t.co/EZSQeo26pj)! We open sourced the code – check out our demos, model weights, and training code: https://t.co/oNSM9IeA6B. Here’s a teaser of what you can do with our method:
4 · 48 · 240