Giannis Daras Profile
Giannis Daras

@giannis_daras

Followers: 3,893
Following: 402
Media: 125
Statuses: 1,232
Pinned Tweet
@giannis_daras
Giannis Daras
1 month
Consistent Diffusion Meets Tweedie. Our latest paper introduces an exact framework to train/finetune diffusion models like Stable Diffusion XL solely with noisy data. A year's worth of work: a breakthrough in reducing memorization, with implications for copyright 🧵
Tweet media one
18
68
405
@giannis_daras
Giannis Daras
2 years
DALLE-2 has a secret language. "Apoploe vesrreaitais" means birds. "Contarra ccetnxniams luryca tanniounons" means bugs or pests. The prompt: "Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons" gives images of birds eating bugs. A thread (1/n)🧵
Tweet media one
205
2K
9K
@giannis_daras
Giannis Daras
2 years
Another example: "Two whales talking about food, with subtitles". We get an image with the text "Wa ch zod rea" written on it. Apparently, the whales are actually talking about their food in the DALLE-2 language. (4/n)
Tweet media one
15
181
2K
@giannis_daras
Giannis Daras
2 years
The discovery of the DALLE-2 language creates many interesting security and interpretability challenges. Currently, NLP systems filter text prompts that violate the policy rules. Gibberish prompts may be used to bypass these filters. (6/n)
14
87
1K
@giannis_daras
Giannis Daras
2 years
A known limitation of DALLE-2 is that it struggles with text. For example, the prompt: "Two farmers talking about vegetables, with subtitles" gives an image that appears to have gibberish text on it. However, the text is not as random as it initially appears... (2/n)
Tweet media one
28
178
1K
@giannis_daras
Giannis Daras
2 years
We feed the text "Vicootes" from the previous image to DALLE-2. Surprisingly, we get (dishes with) vegetables! We then feed the words: "Apoploe vesrreaitars" and we get birds. It seems that the farmers are talking about birds, messing with their vegetables! (3/n)
Tweet media one
12
92
1K
@giannis_daras
Giannis Daras
2 years
We wrote a small paper with @AlexGDimakis summarizing our findings. Please find the paper here: Arxiv version coming soon. (7/n, n=7).
19
61
1K
@giannis_daras
Giannis Daras
2 years
Some words from the DALLE-2 language can be learned and used to create absurd prompts. For example, "painting of Apoploe vesrreaitais" gives a painting of a bird. "Apoploe vesrreaitais" means to the model "something that flies" and can be used across diverse styles. (5/n)
Tweet media one
8
68
1K
@giannis_daras
Giannis Daras
2 years
An update on the hidden vocabulary of DALLE-2. While a lot of the feedback we received was constructive, some of the comments need to be addressed. A thread, with some new gibberish text and some discussion 🧵 (1/N)
15
76
617
@giannis_daras
Giannis Daras
2 years
Announcing Soft Diffusion: A framework to correctly schedule, learn and sample from general diffusion processes. State-of-the-art results on CelebA, outperforms DDPMs and vanilla score-based models. A 🧵to learn about Soft Score Matching, Momentum Sampling and the role of noise
Tweet media one
5
69
458
@giannis_daras
Giannis Daras
2 years
Based on valid comments, we updated our paper with a discussion on Limitations and changed the title to Discovering the Hidden Vocabulary of DALLE-2. Thanks to @mraginsky @rctatman @benjamin_hilton and others for useful comments.
5
17
402
@giannis_daras
Giannis Daras
10 months
Dear NeurIPS Reviewers, This is me writing my rebuttal on a beach in Ikaria, Greece. I would appreciate it if you could read it 🥺 Best, Giannis
Tweet media one
8
8
347
@giannis_daras
Giannis Daras
1 year
Stable Diffusion and other text-to-image models sometimes blatantly copy from their training images. We introduce Ambient Diffusion, a framework to train/finetune diffusion models given only *corrupted* images as input. This reduces the memorization of the training set. A 🧵
Tweet media one
9
55
294
@giannis_daras
Giannis Daras
8 months
Today was my first day as a Research Scientist Intern at NVIDIA 🥳 Will be working with @ArashVahdat and the team on some pretty exciting research directions around generative models in the coming months 👌 Looking forward to it!
12
3
193
@giannis_daras
Giannis Daras
1 month
This week I successfully passed my Ph.D. proposal 🎉 The title of the talk: "Generative Models from Lossy Measurements". Here is a little 🧵 about it
Tweet media one
19
8
183
@giannis_daras
Giannis Daras
11 months
Solving inverse problems (e.g. inpainting/deblurring) for general domain images is hard🤷‍♂️ Magic Eraser and other tools use separately trained models for each task. We introduce PSLD, a method that uses Stable Diffusion to solve all linear problems without any extra training.
Tweet media one
1
34
165
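For context on how a frozen Stable Diffusion prior can be steered by a measurement without retraining, the standard posterior-sampling decomposition that this family of methods builds on is shown below (a generic sketch, not PSLD's exact update, which adds its own latent-space correction term):

\nabla_{x_t} \log p(x_t \mid y) = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t),
\qquad
\nabla_{x_t} \log p(y \mid x_t) \approx -\zeta \, \nabla_{x_t} \lVert y - A\,\hat{x}_0(x_t) \rVert_2^2,

where y = A x_0 + noise is the linear measurement, \hat{x}_0(x_t) is the model's posterior-mean estimate of the clean image, and \zeta is a guidance weight.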
@giannis_daras
Giannis Daras
3 years
New paper: "Intermediate Layer Optimization for Inverse Problems using Deep Generative Models". Paper: Code: Colab: Below a video of the Mona Lisa with inpainted eyes and a thread🧵
2
31
149
@giannis_daras
Giannis Daras
1 year
Multiresolution Textual Inversion. Given a few images, we learn pseudo-words that represent a concept at different resolutions. "A painting of a dog in the style of <jane(number)>" gives different levels of artistic freedom to match the <jane> style based on the number index.
Tweet media one
2
23
145
@giannis_daras
Giannis Daras
2 years
However, "Apodidae Ploceidae" (two names of real bird families) indeed gives 10/10 birds. Therefore, one possible explanation is that our gibberish tokens are mashups of parts of real words. This seems reasonable. It is interesting that DALLE-2 generates those mashups. (6/N)
3
7
136
@giannis_daras
Giannis Daras
2 years
We want to emphasize that this is an adversarial attack and hence does not need to work all the time. If a system behaves in an unpredictable way, even if that happens 1/10 times, that is still a massive security and interpretability issue, worth understanding. (10/N, N=10).
5
5
133
@giannis_daras
Giannis Daras
2 years
Is it possible to reconstruct 3-D geometry of a face from a single photo? This requires solving an inverse problem for a NerfGAN, but previous methods create artifacts as shown. During my Google internship, we developed a new method to solve this problem. A thread 🧵(1/n)
Tweet media one
5
10
117
@giannis_daras
Giannis Daras
5 years
Excited to announce our paper: Your Local GAN. Paper: Code: We obtain 14.53% FID ImageNet improvement on SAGAN by only changing the attention layer. We introduce a new sparse attention layer with 2-D locality. Thread: 1/n
@AlexGDimakis
Alex Dimakis
5 years
New paper: Your Local GAN: a new layer of two-dimensional sparse attention and a new generative model. Also progress on inverting GANs which may be useful for inverse problems. with @giannis_daras from NTUA and @gstsdn @Han_Zhang_ from @googleai
1
23
142
4
30
112
@giannis_daras
Giannis Daras
3 months
Does having a better generator always lead to better priors for inverse problems? (hint: no!) Diffusion models trained with only corrupted data can outperform models trained on clean data for several image restoration tasks🤯 Here is the story behind our new paper Ambient DPS👇
Tweet media one
1
12
108
@giannis_daras
Giannis Daras
2 years
Responses to some of the criticism can be found here:
@giannis_daras
Giannis Daras
2 years
An update on the hidden vocabulary of DALLE-2. While a lot of the feedback we received was constructive, some of the comments need to be addressed. A thread, with some new gibberish text and some discussion 🧵 (1/N)
15
76
617
4
7
104
@giannis_daras
Giannis Daras
2 years
@BarneyFlames , @mattgroh pointed out that "Apoploe", our gibberish word for birds, has similar BPE encoding to "Apodidae". Interestingly, "Apodidae" produces ~1/10 birds (but many flying insects), while our gibberish "Apoploe" gives 10/10. (5/N)
Tweet media one
Tweet media two
2
3
99
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton said that we got lucky with the whales example. We found another similar example. "Two men talking about soccer, with subtitles" gives the word "tiboer". This seems to give sports in ~4/10 images. (2/N)
Tweet media one
Tweet media two
Tweet media three
3
3
94
@giannis_daras
Giannis Daras
8 months
Ambient Diffusion got accepted to NeurIPS 2023 🥳 Useful for training/finetuning generative models in applications where access to uncorrupted data is expensive or undesirable (because of memorization). Very excited about this research direction. See you all in New Orleans! 🎷
@giannis_daras
Giannis Daras
1 year
Stable Diffusion and other text-to-image models sometimes blatantly copy from their training images. We introduce Ambient Diffusion, a framework to train/finetune diffusion models given only *corrupted* images as input. This reduces the memorization of the training set. A 🧵
Tweet media one
9
55
294
4
10
87
@giannis_daras
Giannis Daras
2 years
A few people, including @realmeatyhuman , asked whether our method works beyond natural images (of birds, etc). Yes, we found some examples that seem statistically significant. E.g. "doitcdces" seems related (~4/10 images) to students (or learning). (3/N)
Tweet media one
Tweet media two
4
2
77
@giannis_daras
Giannis Daras
2 years
Our hidden vocabulary seems robust in easy and sometimes neutral prompts but not in hard ones. These tokens may produce low confidence in the generator and small perturbations move it in random directions. "vicootes" means vegetables in some contexts and not in others. (9/N)
Tweet media one
3
1
73
@giannis_daras
Giannis Daras
2 years
NeurIPS, New Orleans 🎷🎶
Tweet media one
3
1
76
@giannis_daras
Giannis Daras
6 months
Excited to be at NeurIPS 2023, presenting some papers we have been working on over the last few months 🎯 The first work is Consistent Diffusion Models 😊
Tweet media one
1
10
73
@giannis_daras
Giannis Daras
2 years
Similarly, "comafuruder" seems correlated (~4/10) to sickness/hospitals/patients. (4/N)
Tweet media one
Tweet media two
4
2
68
@giannis_daras
Giannis Daras
1 month
Stable Diffusion XL and other state-of-the-art models memorize examples from their training sets. We discover that SDXL can reconstruct images from LAION even when whole faces or objects are missing. Row 1: Images from LAION, Row 2: Masked Input to SDXL, Row 3: Reconstruction
Tweet media one
2
19
70
@giannis_daras
Giannis Daras
4 years
Excited to announce our #NeurIPS2020 paper: SMYRF: Efficient Attention using Asymmetric Clustering. Paper: Code: We propose a novel way to approximate *pre-trained* attention layers or train from scratch.
1
10
69
@giannis_daras
Giannis Daras
2 years
New ICML paper: Score-Guided Intermediate Layer Optimization (SGILO). We train diffusion models on the latent space of StyleGAN and we show provable mixing of Langevin Dynamics for random generators. Reconstructions for *extremely sparse* (<1%) measurements. A thread🧵(1/N)
Tweet media one
2
13
67
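As background for the "provable mixing of Langevin Dynamics" claim, the sampler being analyzed is a variant of the standard unadjusted Langevin iteration on the learned latent score (generic form shown here, not the paper's exact algorithm):

z_{k+1} = z_k + \eta \, \nabla_z \log p(z_k) + \sqrt{2\eta}\, \xi_k, \qquad \xi_k \sim \mathcal{N}(0, I),

where \eta is the step size; run to approximate stationarity, the iterates are samples from the latent distribution p(z).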
@giannis_daras
Giannis Daras
2 years
Our gibberish tokens might have many meanings. @benjamin_hilton ran "Contarra ccetnxniams luryca tanniounons" and pointed out that not all are bugs. Indeed, our gibberish text produces the target concept in a statistically significant fraction of images, but rarely a 100% match. (7/N)
Tweet media one
3
2
62
@giannis_daras
Giannis Daras
28 days
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data. Accepted to ICML 2024 🥳 Come and meet me in Vienna to learn about how to train/finetune diffusion models with noisy data 🚧
Tweet media one
@giannis_daras
Giannis Daras
1 month
Consistent Diffusion Meets Tweedie. Our latest paper introduces an exact framework to train/finetune diffusion models like Stable Diffusion XL solely with noisy data. A year's worth of work: a breakthrough in reducing memorization, with implications for copyright 🧵
Tweet media one
18
68
405
1
1
60
@giannis_daras
Giannis Daras
2 years
Our gibberish tokens have varying degrees of robustness when combined with different contexts. E.g. if xx produces birds, ‘xx flying’ is an easy prompt, ‘xx on a table’ is a neutral prompt, and ‘xx in space’ is a hard prompt. (8/N)
1
2
60
@giannis_daras
Giannis Daras
10 months
This is me (not) preparing hard for our ICML poster session this week in Hawaii 🌴 Can’t wait for the conference this week. As always, please reach out to talk about diffusion models, inverse problems, surfing and other equally fun topics 🏄‍♂️
Tweet media one
1
2
58
@giannis_daras
Giannis Daras
5 years
Slides from my talk at #spaCyIRL regarding sparse attention factorizations are available here: … Thanks for the massive interest, paper and code are going to be released soon. Slides partially describe joint work with: @georgepar_91 @AlexGDimakis @apotam
2
14
53
@giannis_daras
Giannis Daras
2 years
Me waiting for #NeurIPS2022 decisions, 1 minute after the expected announcement.
2
2
50
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton Finally, as noted by many, this is far from a language. It lacks grammar, syntax, coherence and many other things. We changed the title to: "Discovering the Hidden Vocabulary of DALLE-2" and we made the limitations explicit in the paper. Thanks for all the feedback!
1
0
44
@giannis_daras
Giannis Daras
1 year
Introducing CommonPool, the largest collection of image-text pairs, 2.5x the size of LAION. A 1.4B subset of our pool outcompetes compute-matched CLIP models from OpenAI and LAION. DataComp, a new benchmark for multimodal datasets, is here!
@gabriel_ilharco
Gabriel Ilharco
1 year
Introducing DataComp, a new benchmark for multimodal datasets! We release 12.8B image-text pairs, 300+ experiments and a 1.4B subset that outcompetes compute-matched CLIP runs from OpenAI & LAION 📜 🖥️ 🌐
Tweet media one
8
190
779
2
9
45
@giannis_daras
Giannis Daras
7 months
📢: Tomorrow, at 12:30 Central Time, I am giving a talk at UW-Madison. I will present two accepted papers at NeurIPS 2023 🥳: Consistent Diffusion Models (not to be confused with Consistency Models🤷‍♂️) and Ambient Diffusion. Feel free to join us remotely or in person 👇
2
4
45
@giannis_daras
Giannis Daras
1 month
We open-source our code to enable further research in this area. We are excited to see how this work is going to be used to mitigate memorization and in applications where data is inherently noisy. If you have ideas, ping me, would love to colab!
2
1
42
@giannis_daras
Giannis Daras
8 months
Great diffusion paper from Mauricio ( @2ptmvd ) and Peyman ( @docmilanfar ). Highly recommended read!
@_akhaliq
AK
11 months
Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration abs: Inversion by Direct Iteration (InDI) is a new formulation for supervised image restoration that avoids the so-called “regression to the mean” effect and
Tweet media one
0
17
98
0
5
40
@giannis_daras
Giannis Daras
1 year
Cognitive science research indicates that bilingualism reduces the rate of cognitive decline. Does this happen in neural networks too? We train monolingual and bilingual GPT models and we show that the bilingual's performance decays slower under various weight corruptions.
@AlexGDimakis
Alex Dimakis
1 year
Human bilinguals are more robust to dementia and cognitive decline. In our recent NeurIPS paper we show that bilingual GPT models are also more robust to structural damage in their neuron weights. Further, we develop a theory.. (1/n)
19
230
2K
1
3
39
@giannis_daras
Giannis Daras
2 years
Happening now, in person! My Ph.D. advisor, @AlexGDimakis , turns Prof. @StefanoErmon into a frog using our algorithm, Intermediate Layer Optimization. Many interesting points on fairness and modularity of algorithms that use deep generative models.
Tweet media one
0
1
34
@giannis_daras
Giannis Daras
5 years
I was training a PyTorch model on multiple GPUs and running out of memory because the loss was computed on a single GPU. This amazing gist written by @Thom_Wolf is a nice and clean workaround. Check it out if you haven't already.
0
14
35
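The workaround the gist describes boils down to computing the loss inside the module's forward, so each replica reduces its own shard of the batch to a scalar and only tiny tensors are gathered on the primary GPU. A minimal sketch, assuming a vanilla nn.DataParallel setup (the wrapper class and toy model below are illustrative, not taken from the gist):

import torch
import torch.nn as nn

class ModelWithLoss(nn.Module):
    # Wraps a model so the loss is computed inside forward().
    # Under nn.DataParallel, each GPU replica computes the loss for its own
    # shard of the batch; only per-replica scalars are gathered, instead of
    # the full logits tensor landing on GPU 0.
    def __init__(self, model, criterion):
        super().__init__()
        self.model = model
        self.criterion = criterion

    def forward(self, inputs, targets):
        logits = self.model(inputs)
        return self.criterion(logits, targets)

# Hypothetical usage with a toy classifier:
model = nn.Linear(512, 10)
wrapped = nn.DataParallel(ModelWithLoss(model, nn.CrossEntropyLoss()))
x, y = torch.randn(64, 512), torch.randint(0, 10, (64,))
loss = wrapped(x, y).mean()  # mean over per-replica losses
loss.backward()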
@giannis_daras
Giannis Daras
8 months
Great paper by my lunch buddy at Google Research last summer! Congrats @mengweir 🎉💥
@docmilanfar
Peyman Milanfar
8 months
Great job @mengweir on your paper and poster - your hard work really paid off - Congrats! + thanks to the very capable social media chair @CSProfKGD for the great photo
Tweet media one
2
5
74
1
2
31
@giannis_daras
Giannis Daras
3 years
My 2020: Graduated from @ntua , started a Ph.D. at @UTCompSci working with @AlexGDimakis , moved from Greece to the US, got my first papers accepted at @CVPR , @NeurIPSConf , got an exciting internship offer for summer 21, and created wonderful memories with friends & family.
1
0
31
@giannis_daras
Giannis Daras
1 year
We can apply our method to learn to represent any concept, given only a few images. Here is an example of generating Grand Theft Auto (GTA) artwork at different resolutions. The GTA artwork concept was learned with only 4 input images.
Tweet media one
1
5
29
@giannis_daras
Giannis Daras
1 month
This memorization behavior has led to a series of lawsuits against the research labs that developed these models. We develop the first framework to train/finetune diffusion models with noisy data. The model generates high-quality images without ever seeing a clean image 🤯
Tweet media one
3
5
30
@giannis_daras
Giannis Daras
1 year
Deterministic diffusion samplers (e.g. DDIM) can efficiently sample from any distribution given an estimate of the underlying score function! We also show how to extend the DDIM idea to any diffusion (linear or non-linear), similar to Soft/Cold Diffusion samplers.
@sitanch
Sitan Chen
1 year
To appear at ICML ’23 We obtain non-asymptotic convergence bounds for *deterministic* diffusion model samplers, as well as a new operational interpretation for the probability flow ODE 🏖 1/7
Tweet media one
1
12
52
1
7
30
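For context, the deterministic samplers discussed here integrate the probability flow ODE; in the variance-exploding parameterization it takes the standard form

\frac{dx}{dt} = -\dot{\sigma}(t)\,\sigma(t)\,\nabla_x \log p\big(x; \sigma(t)\big),

so given a score estimate s_\theta(x, \sigma) \approx \nabla_x \log p(x; \sigma), any ODE solver deterministically maps noise at \sigma_{\max} to samples near \sigma = 0, with DDIM corresponding to one particular discretization.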
@giannis_daras
Giannis Daras
4 years
Thrilled to share with you that our paper, Your Local GAN, got accepted in @CVPR ! I feel grateful that my first ever paper as an undergrad of @ntua got accepted in such a conference. This work is the result of an awesome collaboration with @AlexGDimakis , @gstsdn @Han_Zhang_ .
@giannis_daras
Giannis Daras
5 years
Excited to announce our paper: Your Local GAN. Paper: Code: We obtain 14.53% FID ImageNet improvement on SAGAN by only changing the attention layer. We introduce a new sparse attention layer with 2-D locality. Thread: 1/n
4
30
112
1
3
29
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton I think there are three concerns in this thread: 1) gibberish texts don't have 1-1 mappings with English texts, 2) the meaning of gibberish texts changes when the context changes, and 3) the attack method doesn't always work. (1/N)
2
0
27
@giannis_daras
Giannis Daras
1 month
SDXL can further reconstruct training images given heavily noisy measurements. The reconstruction task is not important. What matters is that these models have memorized their training set.
Tweet media one
2
6
28
@giannis_daras
Giannis Daras
1 month
When I joined the lab, I asked @AlexGDimakis to name a few of his past Ph.D. students who impressed him the most. I won't disclose the full answer, but I will say one thing: @DimitrisPapail was among the top in this (short) list. So, thank you @DimitrisPapail , means a lot!
@DimitrisPapail
Dimitris Papailiopoulos
1 month
There's a distinct sense of pride when your academic siblings thrive during their PhDs and beyond. Although you're not directly involved, you still feel incredibly proud of their successes. Hook 'em horns
0
0
30
1
0
27
@giannis_daras
Giannis Daras
1 month
Recipe to finetune without memorizing: 1) Take your dataset and encode it to latent space using SDXL. 2) Add (a lot of) noise to the latents. 3) Use our training objective to fine-tune your diffusion model. As you increase the dataset noise, memorization gets reduced.
Tweet media one
1
3
27
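A minimal sketch of steps 1–2 of this recipe, assuming the publicly released SDXL VAE from diffusers (the noise level sigma_n and the helper name are illustrative; step 3, the paper's training objective, is not reproduced here):

import torch
from diffusers import AutoencoderKL

# Step 1: encode the dataset into SDXL's latent space.
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae").eval()

@torch.no_grad()
def noisy_latents(images, sigma_n=0.5):
    # images: float tensor in [-1, 1], shape (B, 3, H, W); sigma_n is illustrative.
    latents = vae.encode(images).latent_dist.sample() * vae.config.scaling_factor
    # Step 2: add (a lot of) Gaussian noise to the latents.
    return latents + sigma_n * torch.randn_like(latents)

# Step 3 would fine-tune the diffusion model on these noisy latents with the
# paper's objective; increasing sigma_n further reduces memorization.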
@giannis_daras
Giannis Daras
2 years
@rctatman The criticism here is very fair. We changed the "Secret Language" to "Hidden Vocabulary" in the title and we added a section on Limitations in our paper. Thanks for the constructive feedback!
1
2
27
@giannis_daras
Giannis Daras
11 months
We open-source our code and we launch an online demo for anyone to try it. Code: Demo:
1
2
26
@giannis_daras
Giannis Daras
1 year
Tomorrow (Friday, May 5), at 12pm PT, I am giving a talk at the Grundfest Memorial Lecture Series. I will talk about recent work in the intersection of Generative Models and Computational Imaging. Join us (online) to hear about diffusion models for and from inverse problems!
Tweet media one
1
1
26
@giannis_daras
Giannis Daras
2 years
So excited for my first in-person Ph.D. talk. Come join us this Friday, it will be fun!
@MLFoundations
Institute for Foundations of Machine Learning
2 years
Giannis Daras @giannis_daras discusses "Generative Models for Reconstruction, Art and Things in Between: A short introduction to Intermediate Layer Optimization" FRIDAY, 4/22, at the @UTAustin Machine Learning Lab Research Symposium. Register today:
Tweet media one
0
1
5
1
1
25
@giannis_daras
Giannis Daras
2 years
Continuing the trend @roydanroy 's student started. Time to teach these scammers some math 🤣
@AlexGDimakis
Alex Dimakis
2 years
Someone is trying to scam my PhD student. My student asks to verify their identity 1/2
Tweet media one
46
430
4K
1
0
26
@giannis_daras
Giannis Daras
1 month
Our method uses a double application of Tweedie's formula and a consistency loss function that allows us to extend sampling to noise levels below the observed data noise. This is the first method that trains exact models using noisy data, solving an open problem in this space.
Tweet media one
1
1
24
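For reference, the single-step version of Tweedie's formula being applied: if x_t = x_0 + \sigma_t \varepsilon with \varepsilon \sim \mathcal{N}(0, I), the posterior mean of the clean image is recoverable from the score of the noisy marginal,

\mathbb{E}[x_0 \mid x_t] = x_t + \sigma_t^2 \, \nabla_{x_t} \log p(x_t).

The paper's contribution is the double application of this identity together with the consistency loss, which is what makes training from data observed at a fixed noise level exact.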
@giannis_daras
Giannis Daras
5 months
NeurIPS 2023 Christmas present: 🎁
Tweet media one
4
0
22
@giannis_daras
Giannis Daras
2 years
@Plinz I agree with this thread. I don't believe there is anything "cryptic", probably the word "secret" in the title is more clickbait than it should have been. That said, the realization that "random" strings map to consistent visual concepts creates many security challenges.
1
0
22
@giannis_daras
Giannis Daras
3 years
Exciting personal news: Today is the first day of my Research Internship at @Google ! I will be working with Abhishek Kumar ( @studentofml ), Vincent Chu and Dmitry Lagun ( @DmitryLagun ) on NERF-related research ideas. @googlestudents
0
1
22
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton 1) Indeed, a gibberish text can mean more than one thing. But this is also true for words in English (homonyms). Also, DALLE-2 text might map to clusters of things. For example, we found that the word "comafuruder" has something to do with hospitals/doctors/illness.
Tweet media one
Tweet media two
2
0
19
@giannis_daras
Giannis Daras
2 years
Here is the link to our thread, talking about this work in a little more detail:
@giannis_daras
Giannis Daras
2 years
Announcing Soft Diffusion: A framework to correctly schedule, learn and sample from general diffusion processes. State-of-the-art results on CelebA, outperforms DDPMs and vanilla score-based models. A 🧵to learn about Soft Score Matching, Momentum Sampling and the role of noise
Tweet media one
5
69
458
0
3
19
@giannis_daras
Giannis Daras
2 years
Amazed by the simplicity and the elegance of the code from the paper: "Elucidating the Design Space of Diffusion-Based Generative Models". Extremely easy to experiment with different ideas for sampling from diffusion models.
0
3
18
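To give a flavor of why that codebase is easy to experiment with, here is a compressed sketch of the paper's deterministic Heun sampler (the denoiser D(x, sigma) is a stand-in for a pre-trained EDM-style network; the schedule constants follow the paper's published defaults):

import torch

def edm_heun_sample(D, shape, steps=18, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    # Karras et al. noise schedule: sigma_max -> sigma_min, then 0.
    i = torch.arange(steps)
    sigmas = (sigma_max ** (1 / rho) + i / (steps - 1)
              * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    sigmas = torch.cat([sigmas, torch.zeros(1)])

    x = torch.randn(shape) * sigmas[0]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - D(x, sigma)) / sigma                    # Euler direction
        x_euler = x + (sigma_next - sigma) * d
        if sigma_next > 0:                               # second-order (Heun) correction
            d_next = (x_euler - D(x_euler, sigma_next)) / sigma_next
            x = x + (sigma_next - sigma) * 0.5 * (d + d_next)
        else:
            x = x_euler
    return x

# e.g. with a toy "denoiser" that shrinks toward zero (illustration only):
# samples = edm_heun_sample(lambda x, s: x / (1 + s**2), (4, 3, 64, 64))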
@giannis_daras
Giannis Daras
2 years
A side benefit of our approach: significant computational benefits. Deblurring (with little noise) seems to be a more efficient operation compared to denoising for image generation.
Tweet media one
1
1
18
@giannis_daras
Giannis Daras
2 years
@awjuliani good question, we don't know, but it would be interesting to explore! As many mentioned, we discovered that there is a gibberish vocabulary, not a gibberish language. We currently have no evidence proving (or disproving) that there is some sort of syntax/grammar.
1
1
18
@giannis_daras
Giannis Daras
2 years
Really excited about NeurIPS next week! I will be in New Orleans from Nov. 27 to Dec. 4. If you want to chat about generative models, diffusion, sampling, inverse problems, or any other cool related research topic, please reach out!
1
0
16
@giannis_daras
Giannis Daras
2 years
@ArthurB I see some consistency in the generated outputs. It seems to me entirely possible that Midjourney has its own vocabulary - a set of words that seem random to humans but are consistently mapped to visual concepts. Let us know if you find any!
1
1
16
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton 3) The attack method doesn't always work. This is true -- we did a couple of runs to get this working. However, it works *sometimes* and it is interesting that the model is revealing its adversarial examples. Another example: "Two men talking about soccer, with subtitles".
Tweet media one
Tweet media two
2
0
15
@giannis_daras
Giannis Daras
2 years
This gets even more interesting.
@Merzmensch
Merzmensch Kosmopol🧑‍🎨🤖
2 years
By deleting several other letters we get an even weirder result. Usually DALL·E delivers crisp images with perfect composition. Here we see birds being masked out.
Tweet media one
3
5
53
0
2
16
@giannis_daras
Giannis Daras
1 month
Research paper: This is work I have been doing for a year under the supervision of two wonderful mentors: Alex Dimakis ( @AlexGDimakis ) and Constantinos Daskalakis ( @KonstDaskalakis ).
1
0
15
@giannis_daras
Giannis Daras
1 month
These issues motivate training with corrupted samples: more data & less memorization of the training set. But is it possible to train diffusion models that generate clean images without ever seeing one? 🤔 Our framework, Ambient Diffusion (NeurIPS 2023), solves this problem.
Tweet media one
2
2
14
@giannis_daras
Giannis Daras
2 years
Ingredient 2: Momentum Sampling We show that the choice of sampler has a dramatic effect on the quality of the generated samples. We propose Momentum Sampler, a novel sampling scheme to reverse general linear corruption processes, inspired by momentum methods in optimization.
Tweet media one
1
0
15
@giannis_daras
Giannis Daras
7 months
Interesting work on increasing the diffusion sampling diversity
@GabriCorso
Gabriele Corso
7 months
New paper!🤗 Do all your samples from Stable Diffusion or Dall-E look very similar to each other? It turns out IID sampling is to blame! We study this problem and propose Particle Guidance, a technique to obtain diverse samples that can be readily applied to your diffusion model!
Tweet media one
4
87
440
1
1
15
@giannis_daras
Giannis Daras
2 years
@benjamin_hilton 2) Yes, gibberish text changes meaning based on the context (but not always). I do not yet understand when/why this is happening -- but I think it is worth exploring.
2
0
13
@giannis_daras
Giannis Daras
2 years
Video generation coming together. The pace of progress is insane.
@TomLikesRobots
TomLikesRobots🤖
2 years
Cool use of video init's using #DeforumDiffusion / #stablediffusion from u/EsdricoXD on Reddit. prompt: A film still of lalaland, artwork by studio ghibli, makoto shinkai, pixv sampler: euler ancestral Steps: 45 scale: 14 strength: 0.55 Coherent and really Effective 🔥
21
226
1K
0
4
14
@giannis_daras
Giannis Daras
3 years
Excited to share that our paper, ILO, got accepted to ICML! If you haven't had the chance already, read our work and/or play with the demo -- it's fun! Camera-ready version and follow-up work coming soon! Congrats to everyone that submitted to ICML and good luck for NeurIPS!
@giannis_daras
Giannis Daras
3 years
New paper: "Intermediate Layer Optimization for Inverse Problems using Deep Generative Models". Paper: Code: Colab: Below a video of the Mona Lisa with inpainted eyes and a thread🧵
2
31
149
0
2
13
@giannis_daras
Giannis Daras
2 years
Had a lot of fun talking today at MLL Research Symposium! Here is a preview of some slides, made with manim.
1
1
13
@giannis_daras
Giannis Daras
6 months
Exciting first day at the IFML ( @MLFoundations ) GenAI workshop today! Some interesting discussions about open science and about how far the capabilities of GPT-N might go.
Tweet media one
1
1
13
@giannis_daras
Giannis Daras
1 year
Recent work from @tomgoldsteincs 's lab shows that diffusion models can generate exact copies or collages of training images. We study the problem of training diffusion models with corrupted data, e.g. images with 90% of their pixels missing.
Tweet media one
@tomgoldsteincs
Tom Goldstein
1 year
#StableDiffusion is being sued for copyright infringement. Our recent #CVPR2023 paper revealed that diffusion models can indeed copy from training images in unexpected situations. Let’s see what the lawsuit claims are, and if they're true or false 🧵
Tweet media one
8
67
310
1
1
12
@giannis_daras
Giannis Daras
10 months
Drop by our poster, #645 , to discuss an algorithmic interpretation of the Probability Flow ODE that extends DDIM to non-linear diffusions. Bonus: a non-asymptotic analysis of deterministic diffusion samplers.
Tweet media one
@giannis_daras
Giannis Daras
10 months
Will be presenting our work on analyzing *deterministic* diffusion samplers. DDIM and other deterministic samplers are usually faster than their stochastic counterparts. We make a first attempt to understand their theoretical properties. Wed, 4-5:30pm CDT, Hall 1 #645
0
1
11
0
3
13
@giannis_daras
Giannis Daras
5 months
@AlexGDimakis 🤷‍♂️🤷‍♂️🤷‍♂️
@miniapeur
Mathieu Alain
5 months
Tweet media one
6
184
2K
1
0
13
@giannis_daras
Giannis Daras
1 year
Our key idea is to add *additional* distortion and require the model to predict the *corrupted* image from the further corrupted image. The learner has no way of knowing whether a pixel was missing or whether we corrupted it. Hence, it has to predict a clean image everywhere.
Tweet media one
2
0
13
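A hedged sketch of that trick for the inpainting case (names and the loss weighting are illustrative, and the diffusion-time noising is omitted so the core masking idea stays visible; see the paper for the exact objective):

import torch
import torch.nn.functional as F

def ambient_masked_loss(model, x0, obs_mask, extra_drop=0.1):
    # x0: clean image, used only through its observed pixels, shape (B, C, H, W)
    # obs_mask: 1 where a pixel was observed in the training data, 0 where missing
    # Further corruption: randomly hide an extra fraction of the observed pixels.
    keep = (torch.rand_like(obs_mask) > extra_drop).float()
    tilde_mask = obs_mask * keep

    x_in = x0 * tilde_mask          # the model only ever sees the further-corrupted image
    pred = model(x_in, tilde_mask)  # ...and the mask it was shown

    # The loss is measured on the originally observed pixels. The model cannot
    # tell which hidden pixels were truly missing vs. hidden by us, so its best
    # strategy is to predict a clean image everywhere.
    return F.mse_loss(pred * obs_mask, x0 * obs_mask)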
@giannis_daras
Giannis Daras
1 month
In the second part of the Ph.D. proposal talk, we show how to extend Ambient Diffusion to the case where our data is corrupted with additive Gaussian noise. In this case, we manage to get theoretical guarantees for exact sampling, solving an open problem in this space 🫡
Tweet media one
1
2
11
@giannis_daras
Giannis Daras
2 months
Great work!
@docmilanfar
Peyman Milanfar
2 months
We often assume bigger generative models are better. But when practical image generation is limited by compute budget is this still true? Answer is no By looking at latent diffusion models across different scales our paper sheds light on the quality vs model size tradeoffs 1/5
Tweet media one
6
59
373
0
2
12
@giannis_daras
Giannis Daras
4 years
Join me at #CVPR2020 to discuss 2-D local attention for GANs :) Zoom session for our paper starts in 1 hour (5 p.m. Seattle time) and will last for two hours. CVPR link: Joint work with: @AlexGDimakis , @gstsdn , @Han_Zhang_ .
@giannis_daras
Giannis Daras
5 years
Excited to announce our paper: Your Local GAN. Paper: Code: We obtain 14.53% FID ImageNet improvement on SAGAN by only changing the attention layer. We introduce a new sparse attention layer with 2-D locality. Thread: 1/n
4
30
112
1
3
12