Jürgen Schmidhuber
@SchmidhuberAI
Followers
173K
Following
1K
Media
68
Statuses
149
Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Switzerland, KSA
Joined August 2019
It has been said that AI is the new oil, the new electricity, and the new internet. And the once nimble and highly profitable software companies (MSFT, GOOG, ...) became like utilities, investing in nuclear energy, among other things, to run AI data centres. Open Source and the
71
290
1K
25 years later, @ylecun's 2015 slide rehashed the 1990 paper on a recurrent neural "world model" that predicts all sensory inputs including pixels and multi-dimensional reward signals & pain signals: J. Schmidhuber. Making the world differentiable: On using fully recurrent
13
28
301
My own experience with @ylecun is consistent with @GaryMarcus’s critique from yesterday: see the popular items 1-4 below (with receipts). 1. Who invented convolutional neural networks (CNNs)? Hint: it wasn't LeCun. Who first combined CNNs with backpropagation? Hint: it wasn't
74
139
2K
So it seems that as I grow older, I slowly have to choose one of these pre-defined paths:
87
56
1K
Update (Nov 2025): Who invented knowledge distillation with artificial neural networks? Technical Note IDSIA-12-25 (easy to find on the web [5]). In 2025, the DeepSeek “Sputnik” [7] shocked the world, wiping out a trillion $ from the stock market. DeepSeek distills knowledge from
2
6
58
In 2025, the DeepSeek “Sputnik” shocked the world, wiping out a trillion $ from the stock market. DeepSeek [7] distills knowledge from one neural network (NN) into another. Who invented this? https://t.co/mDbCOpQdPI NN distillation was published in 1991 by yours truly [0].
23
21
274
2025 update: who invented Transformer neural networks (the T in ChatGPT)? Timeline of Transformer evolution in Technical Note IDSIA-11-25 (easy to find on the web): ★ 1991. Original tech report on what's now called the unnormalized linear Transformer (ULTRA)[FWP0][ULTRA].
4
5
74
@karpathy 2025 update: who invented Transformer neural networks (the T in ChatGPT)? Timeline of Transformer evolution in Technical Note IDSIA-11-25 (easy to find on the web)
4
0
34
Who Invented Transformer Neural Networks (the T in ChatGPT)? Timeline of Transformer evolution https://t.co/Ok4L0nn9Uu ★ 1991. Original tech report on what's now called the unnormalized linear Transformer (ULTRA)[FWP0][ULTRA]. KEY/VALUE was called FROM/TO. ULTRA uses outer
26
88
589
The big breakthrough for convnets was the first GPU-accelerated CUDA implementation, which immediately started winning first place in image classification competitions. Remember when that happened? I do. That was Dan Ciresan in 2011
Who invented convolutional neural networks (CNNs)? 1969: Fukushima had CNN-relevant ReLUs [2]. 1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than
38
145
1K
Amazing that @SchmidhuberAI gave this talk back in 2012, months before the AlexNet paper was published. In 2012, people treated many of the things he discussed as funny, even a joke, but the same talk today would be at the center of AI debate and controversy. Full talk:
26
181
1K
Best paper award for "Mindstorms in Natural Language-Based Societies of Mind" at #NeurIPS2023 WS Ro-FoMo. Up to 129 foundation models collectively solve practical problems by interviewing each other in monarchical or democratic societies https://t.co/nQm627Zthv
26
156
1K
In 2016, at an AI conference in NYC, I explained artificial consciousness, world models, predictive coding, and science as data compression in less than 10 minutes. I happened to be in town, walked in without being announced, and ended up on their panel. It was great fun.
60
180
1K
Our Huxley-Gödel Machine learns to rewrite its own code, estimating its own long-term self-improvement potential. It generalizes on new tasks (SWE-Bench Lite), matching the best officially checked human-engineered agents. Arxiv 2510.21614 With @Wenyi_AI_Wang, @PiotrPiekosAI,
56
154
1K
In 1928, Lilienfeld also patented the metal oxide semiconductor FET (MOSFET) [LIL2]. Lilienfeld's designs worked as described and gave substantial gain [ARN98]. In 1934, German engineer Oskar Heil patented another FET variant [HEIL]. Two decades after Lilienfeld, researchers at
5
6
49
100 years ago, on 22 Oct 1925, the Austro-Hungarian (since 1919 Polish) physicist Julius Edgar Lilienfeld (professor in Germany 1905-1926) patented the transistor [LIL1], a field-effect transistor (FET). Today, almost all of the transistors in our computers are FETs. Details
6
30
215
Thank you @SchmidhuberAI for speaking in front of a packed room at ZurichAI in the @ETH_AI_Center yesterday! It's the biggest event so far, by far. Thanks everyone for coming; we're sorry to anyone who didn't get a spot. More and bigger things are planned!
7
11
122
2025 update (easy to find on the web): A Nobel Prize for Plagiarism. Technical Report IDSIA-24-24, 2024, updated Oct 2025 (26 pages, 5 illustrations, 200+ references). Abstract: Sadly, the 2024 Nobel Prize in Physics awarded to Hopfield & Hinton is effectively a prize for
3
8
60