
Jürgen Schmidhuber
@SchmidhuberAI
Followers: 152K · Following: 695 · Media: 49 · Statuses: 100
Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Joined August 2019
The #NobelPrizeinPhysics2024 for Hopfield & Hinton rewards plagiarism and incorrect attribution in computer science. It's mostly about Amari's "Hopfield network" and the "Boltzmann Machine." 1. The Lenz-Ising recurrent architecture with neuron-like elements was published in …
212 replies · 1K reposts · 5K likes
Thanks @elonmusk for your generous hyperbole! Admittedly, however, I didn’t invent sliced bread, just #GenerativeAI and things like that. And of course my team is standing on the shoulders of giants. Original tweet by @elonmusk: …
114 replies · 457 reposts · 5K likes
The #NobelPrize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original inventors.
96 replies · 426 reposts · 3K likes
The GOAT of tennis @DjokerNole said: “35 is the new 25.” I say: “60 is the new 35.” AI research has kept me strong and healthy. AI could work wonders for you, too!
159 replies · 165 reposts · 2K likes
Quarter-century anniversary: 25 years ago we received a message from N(eur)IPS 1995 informing us that our submission on LSTM got rejected. (Don’t worry about rejections. They mean little.) #NeurIPS2020
6 replies · 278 reposts · 2K likes
Re: The (true) story of the "attention" operator that introduced the Transformer, by @karpathy. Not quite! The nomenclature has changed, but in 1991, there was already what is now called an unnormalized linear Transformer with "linearized self-attention" [TR5-6]. See (Eq. …
The (true) story of development and inspiration behind the "attention" operator, the one in "Attention is All you Need" that introduced the Transformer. From personal email correspondence with the author @DBahdanau ~2 years ago, published here and now (with permission) following …
54 replies · 285 reposts · 2K likes
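For readers who don't know the modern terms: below is a minimal numpy sketch of unnormalized linear attention (a sketch with toy shapes I made up, not anyone's production code). Dropping the softmax lets the whole key/value history be compressed into a single outer-product "fast weight" matrix that each query reads out.

```python
import numpy as np

def unnormalized_linear_attention(Q, K, V):
    # Fast-weight view: W = sum_i outer(v_i, k_i); readout y_t = W q_t.
    W = V.T @ K               # (d_v, d_k) sum of outer products
    return Q @ W.T            # one readout per query

# Toy check with hypothetical sizes: 4 timesteps, d_k = 3, d_v = 2.
rng = np.random.default_rng(0)
Q, K = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
V = rng.normal(size=(4, 2))
Y = unnormalized_linear_attention(Q, K, V)
assert np.allclose(Y, (Q @ K.T) @ V)   # same as attention without softmax
```

A causal variant would accumulate W one timestep at a time; that accumulating outer-product memory is the 1991 construction [TR5-6] the tweet refers to.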
Train a weight matrix to encode the backpropagation learning algorithm itself. Run it on the neural net itself. Meta-learn to improve it! Generalizes to datasets outside of the meta-training distribution. v4 2022 with @LouisKirschAI
11 replies · 205 reposts · 1K likes
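A loose toy illustration of the two-level idea in this tweet (not the paper's architecture, which encodes the learning algorithm in the weight matrix of a recurrent net; the form of the update rule and all constants below are my own assumptions): an inner loop trains with a parameterized update rule, and an outer loop meta-learns that rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    # Random linear-regression task: a toy stand-in for a dataset.
    X = rng.normal(size=(32, 5))
    return X, X @ rng.normal(size=5)

def inner_train(phi, task, steps=20):
    # Hypothetical learned update rule: w <- w + phi[0]*grad + phi[1]*w
    X, y = task
    w = np.zeros(5)
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w + phi[0] * grad + phi[1] * w
    return np.mean((X @ w - y) ** 2)       # loss after inner training

phi = np.array([-0.01, 0.0])               # start near plain gradient descent
for _ in range(200):                       # outer loop: improve the rule
    task, mg = make_task(), np.zeros_like(phi)
    for i in range(2):                     # finite-difference meta-gradient
        e = np.zeros(2); e[i] = 1e-5
        mg[i] = (inner_train(phi + e, task) - inner_train(phi - e, task)) / 2e-5
    phi -= 1e-3 * np.clip(mg, -1, 1)       # clipped meta-update
print("meta-learned update rule:", phi)
```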
25th anniversary of the LSTM at #NeurIPS2021. reVIeWeR 2, who rejected it from N(eur)IPS 1995, was thankfully MIA. The subsequent journal publication in Neural Computation has become the most cited neural network paper of the 20th century:
13 replies · 155 reposts · 1K likes
LeCun (@ylecun)’s 2022 paper on Autonomous Machine Intelligence rehashes but doesn’t cite essential work of 1990-2015. We’ve already published his “main original contributions”: learning subgoals, predictable abstract representations, multiple time scales…
31 replies · 179 reposts · 1K likes
Yesterday @nnaisense released EvoTorch, a state-of-the-art evolutionary algorithm library built on @PyTorch, with GPU acceleration and easy training on huge compute clusters using @raydistributed. (1/2)
8 replies · 198 reposts · 1K likes
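For context, the core loop such a library vectorizes is a population-based search like the following (a generic evolution-strategy sketch on a toy objective of my choosing; this is not the EvoTorch API):

```python
import numpy as np

def fitness(x):                            # toy objective: maximize -||x||^2
    return -np.sum(x ** 2, axis=-1)

rng = np.random.default_rng(0)
mu, sigma = np.full(10, 3.0), 0.3          # search mean, mutation scale
for gen in range(60):
    pop = mu + sigma * rng.normal(size=(64, 10))  # sample 64 mutants
    elite = pop[np.argsort(fitness(pop))[-16:]]   # keep the fittest quarter
    mu = elite.mean(axis=0)                       # recombine: elite mean
print("final fitness:", fitness(mu))              # close to the optimum, 0
```

Because every individual is evaluated independently, the population loop parallelizes trivially across GPUs or a Ray cluster, which is the point of such libraries.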
Best paper award for "Mindstorms in Natural Language-Based Societies of Mind" at #NeurIPS2023 WS Ro-FoMo. Up to 129 foundation models collectively solve practical problems by interviewing each other in monarchical or democratic societies
22 replies · 138 reposts · 893 likes
Unlike diffusion models, Bayesian Flow Networks operate on the parameters of data distributions, rather than on noisy versions of the data itself. I think this paper by Alex Graves et al. will be influential.
📣 BFNs: A new class of generative models that
- brings together the strengths of Bayesian inference and deep learning
- trains on continuous, discretized or discrete data with a simple end-to-end loss
- places no restrictions on the network architecture
11 replies · 133 reposts · 860 likes
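The "operate on parameters" idea can be made concrete with the conjugate Gaussian update at the heart of the continuous-data case. The sketch below (my own toy numbers and precision schedule; the neural network that a BFN wraps around this update is omitted) tracks a belief's mean and precision rather than noisy data:

```python
import numpy as np

x = 0.7                         # true datum
mu, rho = 0.0, 1.0              # belief parameters: mean and precision
rng = np.random.default_rng(0)
for step in range(10):
    alpha = 0.5 * (step + 1)    # sender accuracy schedule (hypothetical)
    y = x + rng.normal() / np.sqrt(alpha)        # noisy observation of x
    mu = (rho * mu + alpha * y) / (rho + alpha)  # conjugate Gaussian update
    rho += alpha
    print(f"step {step}: mean={mu:.3f}, precision={rho:.1f}")
# In a BFN, a network maps the belief parameters (mu, rho, step) to an
# output distribution, trained so that receiver and sender agree.
```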
Yet another award for plagiarism. Of all the papers that could have won the #NeurIPS2024 Test of Time Award, it had to be the #NeurIPS 2014 paper on "Generative Adversarial Networks" [GAN1]. This is the notorious paper that republished the 1990 principle of Artificial Curiosity …
31 replies · 188 reposts · 766 likes
@geoffreyhinton Hinton should be stripped of his awards for plagiarism and misattribution:
The #NobelPrize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original inventors.
77 replies · 42 reposts · 659 likes
In 2010, we used Jensen Huang's @nvidia GPUs to show that deep feedforward nets can be trained by plain backprop without any unsupervised pretraining. In 2011, our DanNet was the first superhuman CNN. Today, compute is 100+ times cheaper, and NVIDIA 100+ times more valuable.
5 replies · 50 reposts · 610 likes
What can we learn from history? The FACTS: a novel Structured State-Space Model with a factored thinking memory [1]. Great for forecasting, video modeling, and autonomous systems. At #ICLR2025. Fast, robust, parallelisable. [1] Li Nanbo, Firas Laakom, Yucheng Xu, Wenyi Wang, J. …
12 replies · 137 reposts · 609 likes
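For readers new to state-space models, the base recurrence underlying this family is just a linear scan (a generic sketch with toy sizes; the factored memory that FACTS [1] adds on top is not shown):

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    # Linear state-space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t
    h, ys = np.zeros(A.shape[0]), []
    for x in xs:
        h = A @ h + B @ x
        ys.append(C @ h)
    return np.stack(ys)

rng = np.random.default_rng(0)
d_state, d_in, d_out, T = 8, 4, 2, 16      # toy sizes
A = 0.9 * np.eye(d_state)                  # stable diagonal transition
B = 0.1 * rng.normal(size=(d_state, d_in))
C = 0.1 * rng.normal(size=(d_out, d_state))
print(ssm_scan(A, B, C, rng.normal(size=(T, d_in))).shape)  # (16, 2)
```

Because the recurrence is linear, it can be evaluated with a parallel scan rather than step by step, which is where the "fast, parallelisable" claim comes from.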
Our #GPTSwarm models Large Language Model Agents and swarms thereof as computational graphs reflecting the hierarchical nature of intelligence. Graph optimization automatically improves nodes and edges.
20 replies · 115 reposts · 599 likes
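A minimal rendering of the agents-as-a-computational-graph idea (plain string transforms stand in for LLM calls; node names, edges, and the run function are hypothetical, not the GPTSwarm API, and the real system also optimizes the graph):

```python
from graphlib import TopologicalSorter  # Python 3.9+ standard library

nodes = {                                  # hypothetical "agents"
    "draft":  lambda s: s + " -> draft",
    "critic": lambda s: s + " -> critique",
    "final":  lambda s: s + " -> revision",
}
edges = [("draft", "critic"), ("draft", "final"), ("critic", "final")]

def run_swarm(nodes, edges, prompt):
    deps = {n: set() for n in nodes}
    for src, dst in edges:
        deps[dst].add(src)
    out = {}
    for name in TopologicalSorter(deps).static_order():  # respects edges
        inputs = " | ".join(out[d] for d in sorted(deps[name])) or prompt
        out[name] = nodes[name](inputs)
    return out["final"]

print(run_swarm(nodes, edges, "solve task X"))
```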
Stop crediting the wrong people for inventions made by others. At least in science, the facts will always win in the end. As long as the facts have not yet won, it is not yet the end. No fancy award can ever change that. #selfcorrectingscience #plagiarism
19 replies · 155 reposts · 583 likes
2010 foundations of the recent $NVDA stock market frenzy: our simple but deep neural net on @nvidia GPUs broke the MNIST record. Things are changing fast. Just 7 months ago, I tweeted: compute is 100x cheaper, $NVDA 100x more valuable. Today, replace "100" by "250".
17 replies · 74 reposts · 521 likes
1/3: “On the binding problem in artificial neural networks” with Klaus Greff and @vansteenkiste_s. An important paper from my lab that is of great relevance to the ongoing debate on symbolic reasoning and compositional generalization in neural networks:
5 replies · 98 reposts · 501 likes
To be clear, I'm very impressed by #DeepSeek's achievement of bringing life to the dreams of the past. Their open source strategy has shown that the most powerful large-scale AI systems can be something for the masses and not just for the privileged few. It's a pleasure to see.
61 replies · 47 reposts · 502 likes
What if AI could write creative stories & insightful #DeepResearch reports like an expert? Our heterogeneous recursive planning [1] enables this via adaptive subgoals [2] & dynamic execution. Agents dynamically replan & weave retrieval, reasoning, & composition mid-flow. Explore …
15 replies · 150 reposts · 484 likes
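The recursive-planning pattern the tweet describes, in miniature (the task tree and interfaces below are my own toy stand-ins, not the paper's [1] system): a goal is either decomposed into subgoals, each planned in turn, or executed directly.

```python
def plan(goal, decompose, execute, depth=0):
    subgoals = decompose(goal)
    if not subgoals:                       # primitive goal: act on it
        return [f"{'  ' * depth}do: {execute(goal)}"]
    trace = [f"{'  ' * depth}plan: {goal}"]
    for g in subgoals:                     # dynamic: each subgoal replanned
        trace += plan(g, decompose, execute, depth + 1)
    return trace

# Toy decomposition table standing in for an LLM's adaptive subgoals.
tree = {"write report": ["retrieve sources", "outline", "compose"],
        "outline": ["list sections"]}
print("\n".join(plan("write report", lambda g: tree.get(g, []), lambda g: g)))
```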
Instead of trying to defend his paper on OpenReview (where he posted it), @ylecun made misleading statements about me in popular science venues. I am debunking his recent allegations in the new Addendum III of my critique
15 replies · 63 reposts · 454 likes
GANs are special cases of Artificial Curiosity (1990) and also closely related to Predictability Minimization (1991). Now published in Neural Networks 127:58-66, 2020. #selfcorrectingscience #plagiarism. Open Access: … Preprint: …
9 replies · 80 reposts · 448 likes
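The shared principle, reduced to a toy zero-sum game (a sketch of the idea only, not the 1990 or 2014 architectures): one model's prediction error is the other model's reward, so the predictor runs gradient descent on the very objective the generator runs gradient ascent on.

```python
g, p = 1.0, 0.0                       # generator output, predictor's guess
for t in range(8):
    err = (g - p) ** 2                # predictor's loss = generator's reward
    p += 0.3 * 2 * (g - p)            # predictor: gradient descent on err
    g += 0.1 * 2 * (g - p)            # generator: gradient *ascent* on err
    print(f"t={t}: generator={g:+.3f} predictor={p:+.3f} err={err:.4f}")
```

The predictor chases; the generator moves to stay surprising. In the 1990 curiosity setting the "generator" is a controller rewarded for surprising a world model; in GANs it is a net rewarded for fooling a classifier.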
@NobelPrize Sorry to rain on your parade. Sadly, the Nobel Prize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, …
14 replies · 37 reposts · 420 likes
With Kazuki Irie and @robert_csordas at #ICML2022: any linear layer trained by gradient descent is a key-value/attention memory storing its entire training experience. This dual form helps us visualize how neural nets use training patterns at test time
5 replies · 82 reposts · 389 likes
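The claim can be checked numerically in a few lines: unrolling SGD writes the trained weight matrix as its initialization plus a sum of rank-1 (value × key) updates, so the layer's output equals its initial output plus unnormalized attention over all stored training inputs. Shapes and learning rate below are toy choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, T, lr = 4, 3, 25, 0.05
W = W0 = 0.1 * rng.normal(size=(d_out, d_in))
keys, values = [], []
for _ in range(T):                         # online SGD on random pairs
    x, y = rng.normal(size=d_in), rng.normal(size=d_out)
    v = -lr * (W @ x - y)                  # scaled error signal (the "value")
    keys.append(x); values.append(v)       # x is the stored "key"
    W = W + np.outer(v, x)                 # rank-1 update = one SGD step

x_test = rng.normal(size=d_in)
primal = W @ x_test                        # ordinary forward pass
dual = W0 @ x_test + sum(v * (k @ x_test) for k, v in zip(keys, values))
assert np.allclose(primal, dual)           # identical: attention over history
```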
KAUST (17 full papers at #NeurIPS2021) and its environment are now offering huge resources to advance both fundamental and applied AI research. We are hiring outstanding professors, postdocs, and PhD students:
6 replies · 82 reposts · 370 likes
KAUST, the university with the highest impact per faculty, has 24 papers at #NeurIPS2022. Visit Booth #415 of the @AI_KAUST Initiative! We are hiring on all levels.
11 replies · 31 reposts · 371 likes
1/3-century anniversary of my thesis on #metalearning (1987). For its cover I drew a robot that bootstraps itself. 1992-: gradient descent-based neural metalearning. 1994-: meta-RL with self-modifying policies. 2003-: optimal Gödel Machine. 2020: new stuff!
2 replies · 58 reposts · 351 likes
Re: 2024 #NobelPrize Debacle. The President of the #NeurIPS Foundation (overseeing the ongoing #NeurIPS2024 conference) was a student of Hopfield, and a co-author of Hinton (1985) [BM]. He is also known for sending "amicus curiae" ("friend of the court") letters to award …
The #NobelPrize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original inventors.
14 replies · 144 reposts · 326 likes
2021: Directing the AI Initiative at #KAUST, the university with the highest impact per faculty. Keeping current affiliations. Hiring on all levels. Great research conditions. Photographed a dolphin on a snorkeling trip off the coast of KAUST.
12 replies · 51 reposts · 342 likes
30-year anniversary of #Planning & #ReinforcementLearning with recurrent #WorldModels and #ArtificialCuriosity (1990). Also: high-dimensional reward signals, deterministic policy gradients, #GAN principle, and even simple #Consciousness & #SelfAwareness
2 replies · 61 reposts · 340 likes
90th anniversary of Kurt Gödel's 1931 paper, which laid the foundations of theoretical computer science, identifying fundamental limitations of algorithmic theorem proving, computing, AI, logic, and math itself (just published in FAZ @faznet, 16/6/2021).
3 replies · 68 reposts · 325 likes
I am hiring postdocs at #KAUST to develop an Artificial Scientist for the discovery of novel chemical materials that capture carbon dioxide, to help save the climate. Join this project at the intersection of RL and Materials Science:
18 replies · 149 reposts · 297 likes
10-year anniversary: Deep Reinforcement Learning with Policy Gradients for LSTM. Applications: @DeepMind’s StarCraft player; @OpenAI's dexterous robot hand & Dota player. @BillGates called this a huge milestone in advancing AI. #deeplearning
4 replies · 56 reposts · 304 likes
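The policy-gradient ingredient in its smallest form (a REINFORCE sketch on a two-armed bandit with made-up payoffs; the systems in the tweet used LSTM policies in vastly richer environments):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                        # logits over two arms
payoff = np.array([0.2, 0.8])              # hypothetical reward probabilities
for _ in range(2000):
    p = np.exp(theta) / np.exp(theta).sum()       # softmax policy
    a = rng.choice(2, p=p)
    r = float(rng.random() < payoff[a])           # stochastic reward
    grad = -p; grad[a] += 1.0                     # grad of log pi(a)
    theta += 0.1 * r * grad                       # REINFORCE update
print("learned policy:", np.round(p, 3))          # should favor arm 1
```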
10-year anniversary of our deep multilayer perceptrons trained by plain gradient descent on GPU, outperforming all previous methods on a famous benchmark. This deep learning revolution quickly spread from Europe to North America and Asia. #deeplearning
3 replies · 73 reposts · 299 likes
15-year anniversary: first paper with "learn deep" in the title (2005). On deep #ReinforcementLearning & #NeuroEvolution solving problems of depth 1000 and more. 1st author: Faustino Gomez! #deeplearning #deepRL
0 replies · 51 reposts · 281 likes
Some people have lost their titles or jobs due to plagiarism, e.g., Harvard's former president. But after this #NobelPrizeinPhysics2024, how can advisors now continue to tell their students that they should avoid plagiarism at all costs? Of course, it is well known that …
10 replies · 24 reposts · 252 likes
@NobelPrize Sadly, the #NobelPrize in Physics 2024 for Hopfield & Hinton is a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original papers. Even in …
10 replies · 23 reposts · 196 likes
@RichardSSutton Some background to reinforcement learning in Sec. 17 of the "Annotated History of Modern AI and Deep Learning":
1 reply · 11 reposts · 131 likes
@hardmaru This was accepted at ICML 2022. Thanks to Kazuki Irie, Imanol Schlag, and Róbert Csordás!
3 replies · 1 repost · 103 likes
@goodfellow_ian As mentioned in Sec. B1 of reference [DLP]: The priority dispute above was picked up by the popular press, e.g., Bloomberg [AV1], after a particularly notable encounter between me and Bengio's student Dr. @goodfellow_ian at a N(eur)IPS conference. He gave a talk on GANs, …
2 replies · 2 reposts · 91 likes
@goodfellow_ian "Self-aggrandizement" says the researcher who claims he invented GANs :-) See references [PLAG1-7] in the original tweet, for example, [PLAG6]: "May it be accidental or intentional, plagiarism is still plagiarism." Unintentional plagiarists must correct their publications.
5 replies · 5 reposts · 83 likes
@NobelPrize At the risk of beating a dead horse: sadly, the #NobelPrize in Physics 2024 for Hopfield & Hinton is a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without …
0 replies · 4 reposts · 56 likes
@goodfellow_ian Again: ad hominem arguments against facts 🙂 See [DLP, Sec. 4] on ad hominem attacks [AH1-3] true to the motto: "If you cannot dispute a fact-based message, attack the messenger himself." "Unlike politics, however, science is immune to ad hominem attacks." "In the hard …
6 replies · 1 repost · 46 likes
@goodfellow_ian "Self-aggrandizement" says the researcher who claims he invented GANs :-) See references [PLAG1-7] in the original tweet, for example, [PLAG6]: "May it be accidental or intentional, plagiarism is still plagiarism." Unintentional plagiarists must correct their papers.
10 replies · 2 reposts · 38 likes
@goodfellow_ian I could spend more time answering in detail, but all the answers are actually in the original tweet and its references
Yet another award for plagiarism. Of all the papers that could have won the #NeurIPS2024 Test of Time Award, it had to be the #NeurIPS 2014 paper on "Generative Adversarial Networks" [GAN1]. This is the notorious paper that republished the 1990 principle of Artificial Curiosity …
1 reply · 0 reposts · 36 likes
@goodfellow_ian Again: ad hominem against facts. See [DLP, Sec. 4]: "… conducted ad hominem attacks [AH2-3] against me true to the motto: 'If you cannot dispute a fact-based message, attack the messenger himself'" … "unlike politics, however, science is immune to ad hominem attacks — at …
4 replies · 4 reposts · 27 likes