UT Austin Professor. Researcher in Machine Learning and Information Theory. National AI Institute on the Foundations of Machine Learning (IFML) Co-director.
(1/3) We wrote a survey on Deep Learning Techniques for Inverse Problems in Imaging
We came up with a taxonomy that I think is interesting. Also discussed the whole 'what is supervised vs unsupervised' issue.
@WillettBecca
I was surprised by a talk Yejin Choi (an NLP expert) gave yesterday at Berkeley on some weaknesses of GPT4:
As many humans know, 237*757=179,409
but GPT4 said 179,289.
For the easy problem of multiplying two 3 digit numbers, they measured GPT4 accuracy being only…
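The arithmetic itself is easy to verify; a quick illustrative sketch of the long-multiplication partial products (just to show where the correct answer comes from):

```python
def partial_products(a, b):
    """Long multiplication: one partial product per digit of b."""
    digits = [int(d) for d in str(b)][::-1]      # least-significant digit first
    return [a * d * 10 ** i for i, d in enumerate(digits)]

parts = partial_products(237, 757)
print(parts)        # [1659, 11850, 165900]
print(sum(parts))   # 179409 -- not the 179,289 GPT4 answered
```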
Human bilinguals are more robust to dementia and cognitive decline. In our recent NeurIPS paper we show that bilingual GPT models are also more robust to structural damage in their neuron weights.
Further, we develop a theory… (1/n)
2/ Scammer ends up improving our sample complexity bound for StyleGAN inverse problems. They teach them to do chaining arguments instead of just union bounds now, jeez.
@giannis_daras
One huge advantage of deep learning (vs classical ML models) that is not often discussed is *modularity*: One can download pre-trained models, glue them like Legos and fine tune them end-to-end because gradients flow through. (1/n)
Based on recent papers (Gpt3, Palm, dalle2, Gato, Metaformer) I am forming the opinion that maybe 'Scale is all you need', possibly even for general intelligence (?!). Just convert everything to tokens and predict the next token. (1/n)
The term Artificial Intelligence was coined by John McCarthy to avoid association with Cybernetics and specifically its pioneer Norbert Wiener who was already famous, pain to work with, and working on Cybernetics in MIT. Original quote from McCarthy's Stanford page: ... (1/n)
Here is a simple way to beat ChatGPT and any similar architecture with one Turing test question.
ChatGPT, GPT3 and all related Transformers have a finite maximum token sequence length, usually 2k to 4k tokens. (1/n)
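A minimal sketch of the limitation, treating words as tokens and assuming a hypothetical 2k-token window (the numbers and names here are illustrative, not any real API):

```python
def build_prompt(conversation, max_tokens=2048):
    """A fixed-context model can only condition on the last max_tokens tokens."""
    tokens = conversation.split()
    return " ".join(tokens[-max_tokens:])

# A fact stated early in a long conversation...
early_fact = "the password is swordfish"
filler = "blah " * 3000                  # ...followed by more than max_tokens of chatter
prompt = build_prompt(early_fact + " " + filler)
print("swordfish" in prompt)             # False: the early fact fell out of the window
```

Ask about that early fact and the model has nothing to condition on.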
My thoughts on the now famous Google leak doc:
1. Open source AI is winning. I agree, and that is great for the world and for a competitive ecosystem. In LLMs we're not there, but we just got OpenClip to beat openAI Clip and Stable diffusion is better than…
Probably the best 1-hour introduction to LLMs that I've seen. And after 20 minutes it's not an introduction: it gets into cutting-edge research, updated up to this month. I had not heard of the data exfiltration by prompt injection or the recent…
As Information theory was becoming a 'hot' scientific trend in the 50s, Claude Shannon wrote a one-page paper advising hype *reduction*. That never happens anymore.
Claude Shannon's "The Bandwagon" (1956) is a timeless gem.
Short, one-page advice and perspective on the status of the field.
"... we must keep our own house in first class order. The subject of information theory has certainly been sold, if not oversold."
I was informed that Alexander Vardy, a giant in coding theory passed away. A tragic loss for his family, UCSD and academia. Alex's many discoveries include the Polar decoding algorithm used in the 5G wireless standard, (1/3)
Ptolemy the king of Egypt wanted to learn geometry but found Euclid's book, the Elements, too difficult to study. So he asked Euclid to show him an easier way to master it. Euclid famously said "Sir, there is no royal road to geometry." This is still true a few thousand years…
# on shortification of "learning"
There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved : the people watching enjoy thinking they are…
Here is a very good reason why the Nyquist–Shannon sampling theorem requires that your signal be low-pass filtered before you sub-sample to downscale. If you just sub-sample without smoothing, a bad guy can place another image exactly on the pixels you sub-sample. Adversarial aliasing.
image-scaling attacks are wild
small dots added to the image on the left turn it into the image on the right when downscaled
could make auditing ML systems very tricky if you only look at the original images...
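A toy numpy version of the idea (illustrative only, not the exact attack above): plant a payload on the sampling grid, then compare naive sub-sampling with block averaging as the low-pass step.

```python
import numpy as np

rng = np.random.default_rng(0)
stride = 4
payload = rng.integers(0, 256, size=(16, 16))         # the "hidden" image
cover = np.zeros((16 * stride, 16 * stride), dtype=int)
cover[::stride, ::stride] = payload                   # small dots on the sampling grid

# Naive downscaling = pure sub-sampling: recovers the payload exactly
naive = cover[::stride, ::stride]

# Low-pass first (block averaging): each dot is diluted by the block size
blocks = cover.reshape(16, stride, 16, stride).mean(axis=(1, 3))
# every payload pixel now contributes only 1/16 of its block's average
```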
If you are a
#neurips2020
reviewer, please read the authors' rebuttal and, at the very least, update your review indicating that you read it and your updated thoughts. It takes 5 minutes and it's a good step towards decency. Meta-reviewers, please enforce this.
The Google Gemini paper was released today and has 940 authors. I was impressed,
but then found a recent LHC physics paper with 5,154 authors. Its first nine pages describe the research and the other 24 pages list the authors and their institutions.
But that's not even the…
New neural renderer by Nvidia. The model adds fingerprints, smudges, and dust and generates renders indistinguishable from real, to my eye. Oh, and it's done in *real time*! Can't wait to see games using this. (1/2)
We're very excited that
@UT
Austin will lead an NSF national Institute on the Foundations of Machine Learning with
@UW
,
@WichitaState
and
@MSFTResearch
Announcement:
Who first generated text with statistical methods like GPT?
In 1948 Claude Shannon wrote the landmark paper 'A Mathematical Theory of Communication'.
There, he defined and estimated the entropy of English by generating synthetic text: 'THE HEAD AND IN FRONTAL ATTACK ON (1/n)
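Shannon's letter-level approximations to English can be sketched in a few lines; here is a toy bigram sampler (the corpus string is a placeholder, not Shannon's actual statistics):

```python
import random
from collections import Counter, defaultdict

# Placeholder text; Shannon estimated such statistics from real English
corpus = "THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS"

# Count letter-bigram transitions
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

random.seed(0)

def generate(n, state="T"):
    """Second-order letter approximation: sample each letter given the previous one."""
    out = [state]
    for _ in range(n):
        if out[-1] not in follows:      # dead end: no observed successor
            break
        choices, weights = zip(*follows[out[-1]].items())
        out.append(random.choices(choices, weights=weights)[0])
    return "".join(out)

print(generate(40))
```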
@raj_raj88
But even fine-tuning with 1.8m multiplication examples was not able to teach it to generalize to other 3-digit multiplications. This indicates some fundamental architectural limitation.
My student Giannis discovered that DALLE2 has a secret language. This can be used to create absurd prompts that generate images. E.g.
''Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons'' generates Birds eating Bugs! We wrote a short paper on our experiments.
DALLE-2 has a secret language.
"Apoploe vesrreaitais" means birds.
"Contarra ccetnxniams luryca tanniounons" means bugs or pests.
The prompt: "Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons" gives images of birds eating bugs.
A thread (1/n)🧵
I really need to disagree with this statement. E.g. in my lab at UT Austin, good software engineering is useful but not the most important skill to learn. We train ML researchers on how to do research, e.g. understanding and improving landmark papers, and ideally writing one.
This is probably well-known in some circles but not everywhere.
The most important skill for Research Scientists in AI (at least at
@OpenAI
) is software engineering.
Background in ML research is sometimes useful, but you can usually get away with a few landmark papers.
Amazing news: AI and Data science research center founded in Greece, €21 million funding, Led by Christos Papadimitriou,
@KonstDaskalakis
and Timos Sellis under
@athenaRICinfo
and the support of
@Greece_2021
At the final event of the Committee
@Greece_2021
, before the end of the anniversary year, we announced the creation of the "Archimedes" Unit at the "Athena" Research Center: an institute for Artificial Intelligence, Data Science, and Algorithms.
New NeurIPS paper: We train a Robust CLIP encoder that produces approximate CLIP representations from highly corrupted images. We can classify images from 2% random pixels, or from very blurry images, better than humans.
Scott Aaronson gave an extraordinary public lecture in UT Austin's Machine Learning Lab (MLL) yesterday. Most packed auditorium I've seen. He described a taxonomy for AI alignment methods
1. Off switch!
2. Sandboxing / Isolation
3. Interpretability
4. Multiple competing /…
The
#Sora
model is indeed incredible 🤯 congratulations to the OpenAI team.
It is common for people to think that all the amazing research breakthroughs in AI (like
#Sora
) are happening inside companies like OpenAI, while universities are becoming irrelevant.
I want to highlight…
We have tried to use discriminators of GANs as regularizers, for detecting adversarial examples, for dozens of things: It NEVER works. I always think it's a great idea and then nope. 😓
While waiting for
#CVPR2022
CMT to get up again, I would like to propose a simple cryptographic solution to the big data submission problem: We only upload a SHA256 hash of our to-be-submitted pdf and then upload the committed pdf any time next week.
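The commit-reveal scheme above is a few lines of Python (the pdf bytes here are a placeholder):

```python
import hashlib

def commit(pdf_bytes: bytes) -> str:
    """Upload only this digest by the deadline; reveal the actual pdf later."""
    return hashlib.sha256(pdf_bytes).hexdigest()

paper = b"%PDF-1.5 ... our to-be-submitted pdf ..."
digest = commit(paper)

# Later: anyone can check that the revealed pdf matches the committed hash,
# and any post-deadline edit changes the digest.
print(commit(paper) == digest)                            # True
print(commit(paper + b" one more experiment") == digest)  # False
```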
We develop a theory that shows how multitasking creates regularization. This can be seen as a simple theoretical model for bilingual cognitive reserve.
Interestingly, the phenomenon appears only when the tasks are sufficiently diverse. (2/n)
Excited that our paper on deep generative models for robust MRI is featured by Amazon Science. We trained the first generative model for MRI images. Also for the first time we are competitive with supervised deep MRI methods and more robust to anatomy and measurement changes.
Time can seem to slow during an MRI scan.
#AmazonResearchAward
recipient Jonathan Tamir is developing
#machinelearning
methods to shorten exam times and extract more data from this essential — but often uncomfortable — imaging process. Find out how.
A surprising deep learning mystery:
Contrary to conventional wisdom, performance of unregularized CNNs, ResNets, and transformers is non-monotonic: improves, then gets worse, then improves again with increasing model size, data size, or training time.
A public service announcement: please upload all your papers to preprint servers like arxiv. The publisher owns the final pdf THEY typeset, not the preprint pdf you submitted. If your papers are only behind a paywall, you are violating funding recommendations.
Is there a doctor on the plane?
-Yes, but not that kind of doctor.
-The passenger in 36c is trying to inpaint an image using a pre-trained stable diffusion model and simply copy-pastes the inpainting observed part in place, after each iteration!
-Ok, I got this.
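For the non-36c passengers: the "copy-paste" trick in the joke looks roughly like this toy sketch (the denoiser is a dummy stand-in, not Stable Diffusion):

```python
import numpy as np

rng = np.random.default_rng(0)
observed = rng.random((8, 8))          # the known part of the image
mask = np.zeros((8, 8), dtype=bool)
mask[:, :4] = True                     # left half observed, right half missing

def denoiser_step(x):
    """Dummy stand-in for one diffusion model update (hypothetical)."""
    return x + 0.1 * rng.standard_normal(x.shape)

# After every iteration, hard-replace the observed pixels with their true values
x = rng.random((8, 8))
for _ in range(10):
    x = denoiser_step(x)
    x[mask] = observed[mask]           # the copy-paste of the observed part
```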
Interesting fact about GANs that is not as well known as it should be: Take a pre-trained GAN (e.g. DCGAN) and feed independent random noise to the discriminator. Since it is easy to tell that noise is not a real image, you would expect the discriminator to see this easily. (1/4)
We have multiple postdoc openings at the AI Institute for the Foundations of Machine Learning (IFML). Fellows can work with all IFML groups in
UT Austin, Univ. of Washington and Microsoft Research
(1/3)
DALL·E 2 and similar models are producing amazing images from text. But can they count to five? I don't have access, but when I try 'An image of five apples' on the multimodalart latentdiffusion LAION-400M model, the generated images get the count wrong. (1/n)
Fun question in my ML midterm: Say a feature X1 is independent from the target label Y. We can always remove this feature and not lose in predictive performance.
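The classic counterexample here is XOR: each feature is marginally independent of the label but jointly determines it. A quick numpy check:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.integers(0, 2, size=10_000)
x2 = rng.integers(0, 2, size=10_000)
y = x1 ^ x2                    # Y = X1 XOR X2

# X1 alone is independent of Y: essentially zero correlation
print(abs(np.corrcoef(x1, y)[0, 1]) < 0.05)   # True (up to sampling noise)

# But together with X2, X1 determines Y exactly -- drop X1 and you lose everything
print(np.all((x1 ^ x2) == y))                  # True
```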
I disagree-- many scientists will use ML algorithms in the same way they use databases, compilers and statistics today. Domain expertise and scientific insight do not go away when the tools change.
Within 10-20 years, nearly every branch of science will be, for all intents and purposes, a branch of computer science.
Computational physics, comp chemistry, comp biology, comp medicine... Even comp archeology. Realistic simulations, big data analysis, and ML everywhere
Very cool explanation of emergence, even in light of the recent NeurIPS best paper award: even if, for a single task, performance increases smoothly with more training, a composite task that requires k subtasks to all be correct shows a phase transition as k grows.
I'd like to add…
1/2 Wrote a blog post on whether emergent abilities and grokking are a fundamental feature of deep learning, a "mirage", or both. This is partially based on the beautiful paper of
@RylanSchaeffer
,
@BrandoHablando
, and
@sanmikoyejo
that recently won the NeurIPS outstanding paper award.
New paper: Your Local GAN: a new layer of two-dimensional sparse attention and a new generative model. Also progress on inverting GANs which may be useful for inverse problems.
with
@giannis_daras
from NTUA and
@gstsdn
@Han_Zhang_
from
@googleai
We just discovered that the inpainting model in Stable Diffusion is cheating.
To clarify: Inpainting is a type of inverse problem where some missing data (pixels) must be filled in. In our testing, some of the inpaintings from the SDXL inpainting model were a little 'too…
Today the 25 National AI Research Institutes (funded by the National Science Foundation
@NSF
) are showcasing in the US Senate. Excited to be part of this event, presenting our work on generative AI in our IFML institute.
We are very excited that our first GH200 nodes have arrived in TACC for our GenAI center. Here is one.
Fun facts: NVIDIA makes GH200 'superchips' (i.e. modules), a GH200 DGX box and a GH200 rack, which are all different.
As Dan Stanzione, our TACC director, kindly explained…
ICML reviews are out. Time for people who have never served on program committees or tried to hunt down late reviewers to tell us how to solve complex problems with one weird trick.
I've been experimenting with CLIP here -- It can answer the world's greatest questions it seems: This image is classified as Persian Baklava (indeed, I certify no Greek would do that)
@docmilanfar
@CevherLIONS
@NAChristakis
8 percent of Americans believe they can beat a Gorilla or Lion in an unarmed fight. My question to such a person is: have you ever seen a Lion or Gorilla in real life?
Researchers from our NSF Institute on the Foundations of Machine Learning (IFML) win 2 outstanding paper awards at NeurIPS 2022
(after winning 2/5 awards in NeurIPS 2021 also)! Congratulations
@jaywhang_
@lschmidt3
(1/n)
The first...
Observed again this year at NeurIPS: the reviewers with the dumbest questions, the wrong claims that the 'proof is wrong', and a complete lack of knowledge of the area rank themselves as having the highest confidence 😅
GPT3 being helpful in proposal writing (Green text is generated from the given prompt). If a model can write its own proposal and get it funded, does that count as human-level intelligence? Or PI-level intelligence? (don't ask which one is higher).
@ccanonne_
But after that week you will have a perfect figure, EXACTLY as you wanted, and tikz code you can re-use to make other perfect figures. Plus it will be vector graphics and you can make a perfect T-shirt out of it. Yes, it's a bit of a fetish.
What's your (anonymized!) conference presentation horror story? Mine was when someone gave a talk and then, in the Q&A, it was immediately pointed out that what they thought was interesting was just a spelling convention and the whole paper was based on a misunderstanding.
.. to be clear, I hope 'scale is all you need' is not true, and that new theoretical ideas that we are missing are discovered for learning. I've just been genuinely surprised by how much progress we get from raw scale. (5/n)
We theoretically analyze the case of random Gaussian task vectors and prove that multi-tasking leads to higher robustness with high probability.
Project page:
Original source: A Review of 'The Question of Artificial Intelligence' (an edited volume by B. Bloomfield), written by John McCarthy and published in the Annals of the History of Computing in 1989.
(5/5).
Phi-3 just released by Microsoft. Three small models (3.8B, 7B and 14B) trained on highly filtered and synthetic data. They report impressive performance: the 3.8B model (trained on 3T tokens) has an MMLU of 69%, matching Llama3 8B, and the 7B Phi-3 model has 75% MMLU,…
I remember how much I hated object oriented programming when I was coding in C. Still, most of programming is about plumbing and gluing other people's code. Nothing enables this better than differentiable models. Abstraction, Encapsulation, Inheritance, etc for free. (2/n)
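A toy numpy sketch of that end-to-end gluing, with the chain rule written out by hand (in practice PyTorch/JAX autograd does this; all names and shapes here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "pretrained" linear modules glued end-to-end: y = B @ (A @ x)
A = rng.normal(size=(4, 3)) * 0.5   # stand-in for a downloaded encoder
B = rng.normal(size=(1, 4)) * 0.5   # stand-in for a downloaded head
x = rng.normal(size=(3, 32))        # a small input batch
target = np.ones((1, 32))

def loss():
    return float(np.mean((B @ (A @ x) - target) ** 2))

loss_before = loss()
for _ in range(100):
    h = A @ x                            # forward through module A
    err = (B @ h - target) * (2 / x.shape[1])
    gB = err @ h.T                       # dL/dB
    gA = B.T @ err @ x.T                 # dL/dA: gradient flows *through* B
    A -= 0.05 * gA
    B -= 0.05 * gB
loss_after = loss()
```

Both modules improve jointly because the gradient passes through the glue, which is exactly what makes the Lego-style composition work.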
Faculty recruitment on Mt. Olympus. Weyl on possible colleagues
@the_IAS
Bohr—out of the question
Schrödinger—created the “wave” form of quantum mechanics
Heisenberg—fate tied up with that of Germany
Gödel—a very limited field
Weil—might be somewhat difficult colleague 😉
The Metaformer paper shows evidence you don't even need attention. Just MLP layers transforming tokens and any blending between them, (even pooling) every few layers. Blend more data and more tasks and it seems to learn arithmetic, generalize to new tasks, etc (2/n)
I guess the main point is that without state, there is no way to pass a Turing test just by converting the past conversation into a prompt. (6/n)
This is absolutely incredible. This Lumière brothers video from 1896 'Arrival of a Train at La Ciotat' has been upscaled by machine learning to 4k, 60fps.
src:
Credit to developers (Github code) DIAN, Topaz AI, ESRGAN, Waifu2x, DeOldify, Anime 4K
@jradavenport
@overleaf
you essentially run the python script as a shell command and pipe the output into the tex file (this piped \input needs shell-escape, which Overleaf enables), e.g.
\input{|python your_script.py}
There are indeed shallow and bad theory results in top ML conferences, sometimes accepted. But this should not be a blanket statement that all ML theory is bad. It's same as experimental results, that can be cherry picked etc. Good ML theory is needed to sharpen our thinking.
Hot take: Proofs in many ICML papers are a thin veneer of mathematical sophistication, meant to show the authors' refinement. They are not generally meant to contribute to math or to apply to reality but as proving membership in a tribe. A bit like an NFT.
Pythagoras of Samos applied for funding to the Greek NSF for machine learning. The review summary, in 490 BC, said: this 'machine learning' fad is just geometry, reject.
..that all the clever ideas (ConvNets, attention, better optimization, etc.) are maybe changing the constants a bit, but 90% of progress comes from scale and we have seen no ceiling yet. (4/n)
''I can report that many of the earlier AI pioneers did not influence me except negatively, (..) As for myself, one of the reasons for inventing the term "Artificial intelligence" was to escape association with "cybernetics"... (2/n)
Overparametrization alone is not enough for easy learning. This paper proves that even if the ground truth is a one-layer network, learning a classifier can require super-polynomial time to achieve small test error.
It is worth watching this CNN video from the moment Emory Econ Professor
@CarolineFohlin
came across the violent arrest of a protester on campus and asked the police, with shock, "What are you doing?" That's all that prompted an officer to hurl her to the ground and handcuff her.
I can definitely tell you one thing related to this great clip: doing a Phd definitely brings a good amount of pain and suffering, even to the most talented ones.