
Durk Kingma
@dpkingma
Followers: 49K
Following: 3K
Media: 48
Statuses: 690
@AnthropicAI. Prev. @Google Brain/DeepMind, founding team @OpenAI. Computer scientist; inventor of the VAE, Adam optimizer, and other methods. ML PhD.
Joined March 2009
Personal news: I'm joining @AnthropicAI! 😄 Anthropic's approach to AI development resonates significantly with my own beliefs; looking forward to contributing to Anthropic's mission of developing powerful AI systems responsibly. Can't wait to work with their talented team.
109
89
3K
It's already the case that people's free will gets hijacked by screens for hours a day, with lots of negative consequences. AI video can make this worse, since it's directly optimizable. AI video has positive uses, but most of it will be fast food for the mind.
Very impressed with Veo 3 and all the things people are finding on r/aivideo etc. Makes a big difference qualitatively when you add audio. There are a few macro aspects to video generation that may not be fully appreciated: 1. Video is the highest-bandwidth input to the brain. Not…
21
33
374
Thank you! See you guys in Singapore next week 🥳.
Test of Time Winner. Adam: A Method for Stochastic Optimization. Diederik P. Kingma, Jimmy Ba. Adam revolutionized neural network training, enabling significantly faster convergence and more stable training across a wide variety of architectures and tasks.
10
5
327
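For context on the award above: the Adam update itself is short. A minimal NumPy sketch of one step, using the standard defaults from the paper (β1=0.9, β2=0.999, ε=1e-8); the function and variable names here are mine, not from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: bias-corrected moment estimates scale the gradient update."""
    m = beta1 * m + (1 - beta1) * grad        # EMA of gradients (first moment)
    v = beta2 * v + (1 - beta2) * grad ** 2   # EMA of squared gradients (second moment)
    m_hat = m / (1 - beta1 ** t)              # correct bias from zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# t counts steps starting at 1; m and v start as np.zeros_like(theta).
```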
👇 Great work led by Yushun (@ericzhang0410) introducing Adam-mini, a version of Adam that, surprisingly, reduces Adam's memory requirement by 50% (!) without negatively affecting convergence rates. Please read Yushun's thread for details!
Finally finished Adam-mini! A "mini" version of Adam that painlessly frees 50% of memory over Adam. Some highlighted features: 1. Adam-mini saves 50% memory over Adam for all modern neural nets. This is done by removing 99.9% of Adam's v (but the last 0.1% of v is essential and …
5
22
171
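A rough sketch of the Adam-mini idea as described in the thread above, under my reading: keep Adam's per-parameter first moment, but replace the per-parameter second moment v with a single shared value per parameter block (here, an EMA of the block's mean squared gradient). The block partitioning and names are illustrative, not the paper's exact recipe:

```python
import numpy as np

def adam_mini_step(theta, grad, m, v_block, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Sketch: per-parameter m as in Adam, but one scalar v shared by the whole block."""
    m = beta1 * m + (1 - beta1) * grad
    # One second-moment scalar per parameter block (mean of squared gradients),
    # instead of one per parameter -- this is where the ~50% memory saving comes from.
    v_block = beta2 * v_block + (1 - beta2) * float(np.mean(grad ** 2))
    m_hat = m / (1 - beta1 ** t)
    v_hat = v_block / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v_block
```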
RT @sedielem: In @dpkingma and @RuiqiGao had suggested that noise augmentation could be used to make other likelih….
arxiv.org
To achieve the highest perceptual quality, state-of-the-art diffusion models are optimized with objectives that typically look very different from the maximum likelihood and the Evidence Lower...
0
26
0
Great blogpost by Ruiqi (and other GDM ex-colleagues) clearly explaining the connection between flow matching and diffusion models. Super happy they took the time to explain it; there's a lot of confusion on this topic, and I think many will find it quite valuable!
A common question nowadays: Which is better, diffusion or flow matching? 🤔 Our answer: They're two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That's great: it means you can use them interchangeably.
1
15
158
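One way to see the equivalence the post argues for, in my own notation (a sketch for the standard Gaussian path, not a quote from the blog):

```latex
% Gaussian path: z_t = \alpha_t x + \sigma_t \epsilon, with \epsilon \sim \mathcal{N}(0, I).
% The flow-matching target is the conditional expectation of the per-sample velocity:
u_t(z_t) \;=\; \mathbb{E}\!\left[\dot\alpha_t x + \dot\sigma_t \epsilon \,\middle|\, z_t\right]
        \;=\; \dot\alpha_t\,\hat{x}(z_t) + \dot\sigma_t\,\hat\epsilon(z_t).
% Since z_t = \alpha_t \hat{x}(z_t) + \sigma_t \hat\epsilon(z_t), the velocity is an affine
% function of the diffusion denoiser \hat{x} (equivalently, of \hat\epsilon or the score):
u_t(z_t) \;=\; \frac{\dot\sigma_t}{\sigma_t}\, z_t
        \;+\; \Bigl(\dot\alpha_t - \frac{\dot\sigma_t\,\alpha_t}{\sigma_t}\Bigr)\hat{x}(z_t).
```

So, up to a time-dependent reparameterization and loss weighting, training a flow-matching velocity model and training a diffusion denoiser amount to the same thing, which is the post's point.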
Congrats to @geoffreyhinton for getting the Nobel! His impact is immeasurable, very much deserved.
BREAKING NEWS. The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”
3
2
129
The recording of our talk for the ICLR'24 test-of-time award (with @wellingmax) is now available online. Biggest live audience I've ever spoken to, with >2000 attendees 😅. But it was a lot of fun!
6
41
315
Thanks to the ICLR Award Committee! And thank you for the kind words, Max! You were the perfect Ph.D. advisor and collaborator, kind and inspiring. I really couldn't have wished for better.
Thank you Yisong and the Award Committee for choosing the VAE for the Test of Time award. I would like to congratulate Durk, who was my first (brilliant) student when I moved back to the Netherlands and who is the main architect of the VAE. It was absolutely fantastic working with him.
25
19
536
RT @yisongyue: Congratulations to @dpkingma and @wellingmax for winning the inaugural ICLR Test of Time Award for their amazing work on Aut….
arxiv.org
How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets?...
0
28
0
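The abstract above states the problem; the VAE's answer, written compactly in my notation (a sketch, not a quote from the paper), is to maximize a variational lower bound on the log-likelihood, with the approximate posterior reparameterized so its gradient can be estimated by backpropagation through samples:

```latex
% Evidence lower bound (ELBO) for observation x and latent z:
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta,\phi;x)
  \;=\; \mathbb{E}_{q_\phi(z\mid x)}\bigl[\log p_\theta(x\mid z)\bigr]
        \;-\; D_{\mathrm{KL}}\bigl(q_\phi(z\mid x)\,\|\,p(z)\bigr).
% Reparameterization trick: sample z as a deterministic function of x and noise,
z \;=\; \mu_\phi(x) + \sigma_\phi(x)\odot\epsilon, \qquad \epsilon\sim\mathcal{N}(0,I),
% so low-variance gradients of \mathcal{L} with respect to \phi flow through the sample.
```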