Kevin Lu Profile
Kevin Lu

@kevinlu4588

Followers
76
Following
753
Media
4
Statuses
21

Diffusion & protein language model Research @ Northeastern (Bau Lab) @ NeurIPS 2025

Joined April 2024
@rohitgandikota
Rohit Gandikota
9 days
We discovered how to fix diffusion models' diversity issues using interpretability! It's all in the first time-step! ⏱️ Turns out the concepts needed for diversity are already present in the model - it simply doesn't use them! Check out our @wacv_official work - we added theoretical evidence 👇
@rohitgandikota
Rohit Gandikota
9 months
Why do distilled diffusion models generate similar-looking images? 🤔 Our Diffusion Target (DT) visualization reveals the secret to diversity. It is the very first time-step! And there is a simple, training-free way to make them more diverse! Here is how: 🧵👇
1
22
179
@rohitgandikota
Rohit Gandikota
11 days
We tested several unlearning methods and found none of them really erase knowledge from the model - they simply hide it! 🧐 What does this mean? We must tread carefully with unlearning research within diffusion models 🚨 Here is what we learned 🧵👇 (led by @kevinlu4588)
@kevinlu4588
Kevin Lu
28 days
Excited to share our paper “When Are Concepts Erased from Diffusion Models?” at @NeurIPSConf! We introduce two conceptual models for erasure mechanisms in diffusion models, and a suite of probes to recover supposedly forgotten concepts. Project website:
1
14
45
@kevinlu4588
Kevin Lu
28 days
Probes include: 1. Inpainting & diffusion completion 2. Diffusion trajectory expansion via custom noise scheduler 3. Latent classifier guidance (on par with Textual Inversion) 4. Dynamic concept tracing Erasing methods can fail with the original prompt + light conditioning!
1
0
2
@kevinlu4588
Kevin Lu
28 days
Excited to share our paper “When Are Concepts Erased from Diffusion Models?” at @NeurIPSConf! We introduce two conceptual models for erasure mechanisms in diffusion models, and a suite of probes to recover supposedly forgotten concepts. Project website:
unerasing.baulab.info
Investigating whether concept erasure in diffusion models truly removes knowledge or merely avoids it, with a comprehensive suite of probing techniques.
2
11
38
@arnab_api
Arnab Sen Sharma
1 month
How can a language model find the veggies in a menu? New pre-print where we investigate the internal mechanisms of LLMs when filtering on a list of options. Spoiler: turns out LLMs use strategies surprisingly similar to functional programming (think "filter" from Python)! 🧵
1
22
63
@sheridan_feucht
Sheridan Feucht
8 months
[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.
2
39
194
@kevinlu4588
Kevin Lu
7 months
We also find a deep trade-off: Robust methods (destruction-based 🧨) tend to distort unrelated generations. Understanding this helps researchers choose or design erasure methods that fit their needs.
1
0
0
@kevinlu4588
Kevin Lu
7 months
Surprisingly, even “strong” erasure methods that beat adversarial prompts… can still regenerate the erased concept via inpainting or noise-based probes 😳 Our findings show: No single probe is enough. True erasure requires a holistic evaluation.
1
0
1
@kevinlu4588
Kevin Lu
7 months
Do these robust unlearning methods still somehow leave some knowledge traces? We propose a suite of probing methods to test this: 🎯 Adversarial attacks 🖼️ Inpainting 🌫️ Diffusion trajectory expansion 📉 Dynamic concept tracing. Each reveals different residual traces 👀
1
0
2
@kevinlu4588
Kevin Lu
7 months
Similarly, building on the finding that erased knowledge can be resurfaced through optimization, @koushik_srivats et al. proposed STEREO, which exhaustively searches for and removes knowledge traces inside an erased diffusion model. https://t.co/mNxc3jZn4Q
@koushik_srivats
Koushik Srivatsan
8 months
⭐ CVPR 2025 Highlight ⭐ Excited to share that our work has been selected as a highlight paper at CVPR 2025, placing it in the top 13.5% of accepted papers! 🏅 Details in 🧵
1
0
0
@kevinlu4588
Kevin Lu
7 months
@mnphamx1 et al. found that erased concepts can be resurfaced by Textual Inversion (@RinonGal et al.), which optimizes the text input using a few images of the erased concept. They proposed task vectors as a solution for robust unlearning. https://t.co/rS3JHr7ajE
@mnphamx1
Minh Pham
2 years
Excited to share our latest work with @kellym11111 and @chegday: “Circumventing Concept Erasure Methods For Text-to-Image Generative Models” Paper: https://t.co/FKEAQQlRRN Project website: https://t.co/sPGPkG28kY Thread 🧵
1
0
0
@kevinlu4588
Kevin Lu
7 months
Our work evaluates several concept erasure methods. ESD is a self-guided erasure method that uses the model's own knowledge of a concept to ablate it. By design, ESD directs the model to steer away from the concept’s representations. https://t.co/imCf3x7GRU
@_akhaliq
AK
3 years
Erasing Concepts from Diffusion Models @Gradio demo is out on @huggingface demo: https://t.co/sYJ0O21mN8
1
0
0
@kevinlu4588
Kevin Lu
7 months
When we "erase" a concept from a diffusion model, is that knowledge truly gone? 🤔 We investigated, and the answer is often 'no'! Using simple probing techniques, the knowledge traces of the erased concept can be easily resurfaced 🔍 Here is what we learned 🧵👇
1
8
33