Jürgen Schmidhuber
@SchmidhuberAI
Followers
173K
Following
1K
Media
69
Statuses
151
Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Switzerland, KSA
Joined August 2019
It has been said that AI is the new oil, the new electricity, and the new internet. And the once nimble and highly profitable software companies (MSFT, GOOG, ...) became like utilities, investing in nuclear energy, among other things, to run AI data centres. Open Source and the
72
291
1K
2025 updates, with replies to some of the comments, and links to the original papers: Re: 1. Self-supervised learning - LeCun’s 2022 paper rehashes but doesn’t cite essential work of 1990-2015: https://t.co/wGnLB9EvRC
https://t.co/Mm4mtHq5CY Re: 2. Who invented deep residual
LeCun (@ylecun)’s 2022 paper on Autonomous Machine Intelligence rehashes but doesn’t cite essential work of 1990-2015. We’ve already published his “main original contributions:” learning subgoals, predictable abstract representations, multiple time scales… https://t.co/Mm4mtHq5CY
6
8
95
@ylecun 2025 update: who invented convolutional neural networks (CNNs)? Hint: it wasn't LeCun. Who first combined CNNs with backpropagation? Hint: it wasn't LeCun. Who first applied backprop-CNNs to character recognition in 1988? Hint: it wasn't LeCun. https://t.co/eickYlgubd
Who invented convolutional neural networks (CNNs)? 1969: Fukushima had CNN-relevant ReLUs [2]. 1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than
11
6
79
25 years later, @ylecun's 2015 slide rehashed the 1990 paper on a recurrent neural "world model" that predicts all sensory inputs including pixels and multi-dimensional reward signals & pain signals: J. Schmidhuber. Making the world differentiable: On using fully recurrent
18
34
366
My own experience with @ylecun is consistent with @GaryMarcus’s critique from yesterday: see the popular items 1-4 below (with receipts). 1. Who invented convolutional neural networks (CNNs)? Hint: it wasn't LeCun. Who first combined CNNs with backpropagation? Hint: it wasn't
73
143
2K
So it seems that as I grow older, I slowly have to choose one of these pre-defined paths:
87
56
1K
Update (Nov 2025): Who invented knowledge distillation with artificial neural networks? Technical Note IDSIA-12-25 (easy to find on the web [5]). In 2025, the DeepSeek “Sputnik” [7] shocked the world, wiping out a trillion $ from the stock market. DeepSeek distills knowledge from
2
7
58
In 2025, the DeepSeek “Sputnik” shocked the world, wiping out a trillion $ from the stock market. DeepSeek [7] distills knowledge from one neural network (NN) into another. Who invented this? https://t.co/mDbCOpQdPI NN distillation was published in 1991 by yours truly [0].
23
22
273
2025 update: who invented Transformer neural networks (the T in ChatGPT)? Timeline of Transformer evolution in Technical Note IDSIA-11-25 (easy to find on the web): ★ 1991. Original tech report on what's now called the unnormalized linear Transformer (ULTRA)[FWP0][ULTRA].
4
6
74
@karpathy 2025 update: who invented Transformer neural networks (the T in ChatGPT)? Timeline of Transformer evolution in Technical Note IDSIA-11-25 (easy to find on the web)
4
1
34
Who Invented Transformer Neural Networks (the T in ChatGPT)? Timeline of Transformer evolution https://t.co/Ok4L0nn9Uu ★ 1991. Original tech report on what's now called the unnormalized linear Transformer (ULTRA)[FWP0][ULTRA]. KEY/VALUE was called FROM/TO. ULTRA uses outer
26
89
591
The big breakthrough for convnets was the first GPU-accelerated CUDA implementation, which immediately started winning first place in image classification competitions. Remember when that happened? I do. That was Dan Ciresan in 2011
Who invented convolutional neural networks (CNNs)? 1969: Fukushima had CNN-relevant ReLUs [2]. 1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than
39
146
1K
Amazing that @SchmidhuberAI gave this talk back in 2012, months before the AlexNet paper was published. In 2012, people treated many of the things he discussed as funny, even a joke, but the same talk now would be at the center of AI debate and controversy. Full talk:
26
182
1K
Best paper award for "Mindstorms in Natural Language-Based Societies of Mind" at #NeurIPS2023 WS Ro-FoMo. Up to 129 foundation models collectively solve practical problems by interviewing each other in monarchical or democratic societies https://t.co/nQm627Zthv
26
157
1K
In 2016, at an AI conference in NYC, I explained artificial consciousness, world models, predictive coding, and science as data compression in less than 10 minutes. I happened to be in town, walked in without being announced, and ended up on their panel. It was great fun.
60
181
1K
Our Huxley-Gödel Machine learns to rewrite its own code, estimating its own long-term self-improvement potential. It generalizes on new tasks (SWE-Bench Lite), matching the best officially checked human-engineered agents. Arxiv 2510.21614 With @Wenyi_AI_Wang, @PiotrPiekosAI,
56
155
1K
In 1928, Lilienfeld also patented the metal oxide semiconductor FET (MOSFET) [LIL2]. Lilienfeld's designs worked as described and gave substantial gain [ARN98]. In 1934, German engineer Oskar Heil patented another FET variant [HEIL]. Two decades after Lilienfeld, researchers at
5
7
49
100 years ago, on 22 Oct 1925, the Austro-Hungarian (since 1919 Polish) physicist Julius Edgar Lilienfeld (professor in Germany 1905-1926) patented the transistor [LIL1]. A field-effect transistor (FET). Today, almost all of the transistors in our computers are FETs. Details
6
31
215