
Richard Sutton
@RichardSSutton
Followers
54K
Following
120
Media
33
Statuses
355
Student of mind and nature, libertarian, chess player, cancer survivor. @ Keen, UAlberta, Amii, https://t.co/u8za2Kod54, The Royal Society, Turing Award
Edmonton, Alberta, Canada
Joined October 2010
AI researchers seek to understand intelligence well enough to create beings of greater intelligence than current humans. Reaching this profound intellectual milestone will enrich our economies and challenge our societal institutions. It will be unprecedented and
60
139
887
More on LLMs, RL, and the bitter lesson, on the Derby Mill podcast.
Are LLMs Bitter Lesson pilled? @RichardSSutton says "no" @m_sendhil @suzannegildert @shulgan
https://t.co/HKN9BNYg24
5
18
229
@RichardSSutton @dwarkesh_sp No, you didn't misspeak there, Richard. I misquoted; the video and caption say "training" itself. Apologies.
0
1
27
This is a reasonable take on the podcast. One thing I would add is that people underestimate just how much babies learn as opposed to what they are born with. One of the big differences between us and other animals might just be that we rely much more on learning because we have
Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea
21
28
289
This will age poorly. I largely have an optimistic view of LLMs; I use multiple LLM tools daily, and I don't think the LLM tech stack is a bubble. It will create a lot of value. I disagree that the length of tasks that LLMs can do has been doubling every 7 months. There are tasks
As a researcher at a frontier lab I'm often surprised by how unaware of current AI progress public discussions are. I wrote a post to summarize studies of recent progress, and what we should expect in the next 1-2 years: https://t.co/B7438Z9lOF
54
70
829
This is a thoughtful writeup, as I expect from Rod Brooks. I think he is right on the importance of input representation, touch, and physical safety in deployment. I also think he underestimates the potential for representation and subgoal discovery with reinforcement learning.
I have just finished and just published some weekend reading for you. 9,600 words of not easy reading, on why today's humanoid robots won't learn to be dexterous.
3
5
24
Two things can be true at the same time: 1. Without additional advances, LLMs won't get us to general intelligence. 2. Even without additional advances, LLMs will radically transform the economy.
60
140
761
Dwarkesh and I had a frank exchange of views. I hope we moved the conversation forward. Dwarkesh is a true gentleman.
.@RichardSSutton, father of reinforcement learning, doesn't think LLMs are bitter-lesson-pilled. My steel man of Richard's position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training
80
232
4K
Mike is a powerful thinker and researcher. Very well deserved.
Amii Fellow and Canada @CIFAR_News AI Chair Michael Bowling was appointed to Canada's AI Strategy Task Force. We're incredibly proud to see Michael's expertise recognized at this level. Congratulations on a well-deserved appointment! Read: https://t.co/GZwYksNX2O
0
2
102
Is scale all you need? Or is there still a role for incorporating domain knowledge and inductive bias? While I was in Heidelberg, I took some time to write a short essay on this question called "The Bittersweet Lesson". https://t.co/DQEItqXomF
#HLF25
2
9
110
Thanks to @the_logic for including Amii in this deep-dive into Edmonton's thriving AI ecosystem. The article highlights our world-class work in RL and features Cam Linke, @RichardSSutton, and many other brilliant minds. Read:
thelogic.co
DeepMind's arrival in Alberta put it on the AI map. When it left, some feared it would be a major blow to the province's ambitions in the sector. It wasn't.
0
1
12
For those really into it, here are another 50 minutes of my views on planning and action selection in options-based AI agents (like in the Oak architecture). https://t.co/B2vqxKofDW
13
59
497
Have people seen this prescient 2001 post by @RichardSSutton on self-verification? "An AI system can create and maintain knowledge only to the extent that it can verify that knowledge itself". This sentiment underpins much LLM reasoning research today. https://t.co/gK3DOwqYm8
1
4
27
Designing a robot at Keen that is robust enough for online and continual reinforcement learning was fun. Robotics is so much easier when you take a scalable approach like learning instead of using human-designed sims or using data from human operation.
The audience for this is small, but we have an open source repository for the "Physical Atari" work we did at Keen. Working purely in the physical world is a huge burden compared to simulation, but it is important to have a reasonable grasp of the gap between the two. The
1
4
43
This is a big deal. It is the first large-scale demonstration of the advantage of real-time reinforcement learning. The recipe is scalable and requires no intervention in principle; the model can adapt forever as long as it is being used. There is no way to achieve similar
We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.
8
19
310
Evan Solomon is Canada's new first minister of AI (and digital innovation)
Today in Edmonton, we announced new federal support to strengthen Canada's AI compute infrastructure, giving researchers and innovators the tools they need to drive the next great discovery. We also announced new support to help Canadian workers gain the skills they need for the
2
2
51
Maximizing voluntary decision-making is the answer
freedom/human flourishing can't be found in a two party state. there is no singular perfect governance system - people deserve radical optionality, just like in everything else we should have thousands of political options to choose, join and exit, like we should have endless
5
8
131
The Pandemonium paper is seminal, but a little hard to find; here is a pdf: https://t.co/4wAm8jU0vt
@RichardSSutton I recently had the pleasure of having to read Selfridge's Pandemonium. What an amazing mind he had.
6
26
252
Dwarkesh Patel is 100% right on this: AI's utility is very strongly dependent on continual learning. https://t.co/YR54QlaqZK
50
135
2K