Clare Lyle

@clarelyle

3K Followers · 1K Following · 24 Media · 207 Statuses

Did RL back when it wasn’t cool. Currently at Google DeepMind, formerly @OATML_Oxford.

Oxford
Joined October 2012
@clarelyle
Clare Lyle
3 days
It also works well to close the generalization gap in warm-starting settings, again provided the learning rate increase is large enough to affect various measures of feature-learning in the network.
@clarelyle
Clare Lyle
3 days
We show that this trick can basically induce grokking on demand whenever you apply it while training on modular arithmetic datasets, provided you increase the learning rate enough to observe nontrivial feature-learning metrics.
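For readers unfamiliar with the setup, grokking experiments typically use a small modular-arithmetic dataset along the lines of the sketch below (hypothetical values; the exact task, modulus, and train/test split used in the paper may differ):

```python
import numpy as np

# Hypothetical modular-addition task of the kind used in grokking studies:
# inputs are pairs (a, b), labels are (a + b) mod p, split randomly into
# train and test sets. Networks typically fit the training pairs quickly
# and only later "grok", i.e. generalize to the held-out pairs.
p = 97                                     # assumed modulus, for illustration
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
labels = (pairs[:, 0] + pairs[:, 1]) % p

rng = np.random.default_rng(0)
perm = rng.permutation(len(pairs))
n_train = len(pairs) // 2                  # assumed 50/50 split
train_x, train_y = pairs[perm[:n_train]], labels[perm[:n_train]]
test_x, test_y = pairs[perm[n_train:]], labels[perm[n_train:]]
```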
@clarelyle
Clare Lyle
3 days
Since several works have already shown that setting a sufficiently high learning rate is critical for feature-learning, we investigated this question with a trivially simple approach: just re-warm the (effective) learning rate when you want to learn new features.
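A minimal sketch of what "re-warm the (effective) learning rate" might look like in a standard training loop (hypothetical PyTorch code; the paper's exact schedule, trigger, and notion of effective learning rate are not reproduced here):

```python
import torch

def rewarm(optimizer, peak_lr):
    """Bump every param group's learning rate back up to a high peak value."""
    for group in optimizer.param_groups:
        group["lr"] = peak_lr

model = torch.nn.Linear(32, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)

for step in range(2_000):
    if step == 1_000:
        # Distribution shift (new task / new data): re-warm the learning rate
        # and restart the decay schedule instead of continuing at the small
        # late-training value. The tweet refers to the *effective* learning
        # rate, which for adaptive optimizers also depends on optimizer state;
        # this sketch only resets the nominal value.
        rewarm(optimizer, peak_lr=1e-3)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)
    x = torch.randn(64, 32)                       # stand-in batch
    y = torch.randint(0, 10, (64,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```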
@clarelyle
Clare Lyle
3 days
By contrast, when a network groks it *succeeds* in overwriting bad randomly initialized features with ones that generalize well. This raises an obvious question: can we use grokking to test whether a method succeeds at feature-learning, then apply it to mitigate primacy bias?
@clarelyle
Clare Lyle
3 days
A common issue in continual/nonstationary learning problems is that features learned on early data can interfere with learning on later data, resulting in worse generalization as the network fails to overwrite the bad learned features.
@clarelyle
Clare Lyle
3 days
What do grokking and plasticity have in common? We show in our @CoLLAs_Conf paper that the same underlying mechanisms that facilitate grokking can help to mitigate primacy bias in nonstationary settings. In other words, if you can grok, you can continually learn!
@clarelyle
Clare Lyle
3 days
Messed up copy-pasting Twitter handles and forgot to mention @axlewandowski!
@clarelyle
Clare Lyle
4 days
🚨 New paper alert! 🚨 Rafał Surdej 🎤, @m_bortkiewicz, and @MatOstasze managed to get trainable activation functions to work in nonstationary learning problems, and they actually help 🤩
@MatOstasze
Mateusz Ostaszewski
4 days
🚀 Excited to announce our paper "Balancing Expressivity and Robustness: Constrained Rational Activations for RL" will be an *oral* at #CoLLAs2025! We study how trainable rational activations boost expressivity in RL but can also harm stability:
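For context, a generic trainable rational activation (without the constraints studied in the paper) looks roughly like the hypothetical PyTorch module below; the paper's constrained formulation is not reproduced here:

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Generic trainable rational activation f(x) = P(x) / Q(x).

    Numerator and denominator coefficients are learned. The denominator is
    written as 1 + |b_1 x + ... + b_n x^n| so it cannot reach zero (the common
    Pade-style safeguard). Illustrative sketch only, not the constrained
    variant from the paper.
    """

    def __init__(self, num_degree: int = 5, den_degree: int = 4):
        super().__init__()
        self.a = nn.Parameter(0.1 * torch.randn(num_degree + 1))  # numerator coefficients
        self.b = nn.Parameter(0.1 * torch.randn(den_degree))      # denominator coefficients

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        num_powers = torch.stack([x ** i for i in range(self.a.numel())], dim=-1)
        den_powers = torch.stack([x ** (i + 1) for i in range(self.b.numel())], dim=-1)
        numerator = (num_powers * self.a).sum(dim=-1)
        denominator = 1.0 + (den_powers * self.b).sum(dim=-1).abs()
        return numerator / denominator
```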
@clarelyle
Clare Lyle
6 days
In particular, it has a bunch of really insightful experiments and a clean way of doing resets that can be helpful for improving generalization and diagnosing whether plasticity is a bottleneck for your agents. Paper link:
openreview.net
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. For example, most RL algorithms collect new data throughout training, using a non-stationary behaviour...
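A rough illustration of the reset-style diagnostic being described (a generic sketch, not the specific procedure from Igl et al.): re-initialize the network partway through training and check whether the fresh network, trained on the same data stream, matches or beats the original.

```python
import torch

def reset_parameters(model: torch.nn.Module) -> None:
    """Re-initialize every submodule that exposes reset_parameters().

    If a freshly reset network (optionally distilled from the old one) ends up
    generalizing better than the network trained end-to-end, that is evidence
    that plasticity loss was a bottleneck. Illustrative sketch only.
    """
    for module in model.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()   # re-initialize weights in place
```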
@clarelyle
Clare Lyle
6 days
PSA: if you work on plasticity loss you should read "Transient Non-stationarity and Generalisation in Deep Reinforcement Learning" by Igl et al. It's super relevant but suffers from an unfortunate lack of SEO due to predating the "plasticity loss" nomenclature.
@clarelyle
Clare Lyle
3 months
RT @MarlosCMachado: 📢 I'm very excited to release AgarCL, a new evaluation platform for research in continual reinforcement learning‼️ Rep…
@clarelyle
Clare Lyle
5 months
Wow did not expect such an enthusiastic response! Unfortunately I've received way too many emails to reply to everyone individually but really appreciate everyone who's taken the time to send in an application.
@clarelyle
Clare Lyle
5 months
📣📣 My team at Google DeepMind is hiring a student researcher for summer/fall 2025 in Seattle! If you're a PhD student interested in getting deep RL to (finally) work reliably in interesting domains, apply at the link below and reach out to me via email so I know you applied 👇
@clarelyle
Clare Lyle
1 year
RT @khimya: Presented "Disentangling the Causes of Plasticity Loss in Neural Networks" today @CoLLAs_Conf work led by @clarelyle with Zeyu….
drive.google.com
@clarelyle
Clare Lyle
1 year
RT @_rockt: I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation….
@clarelyle
Clare Lyle
2 years
Violets are blue,
Roses are red,
Missed ICML?
Try CoLLAs instead :)
@CoLLAs_Conf
CoLLAs 2025
2 years
Roses are Red,
Violets are Blue;
Don’t Forget to Submit,
(Tomorrow) on OpenReview!
@clarelyle
Clare Lyle
2 years
RT @CoLLAs_Conf: Roses are Red, Violets are Blue; Don’t Forget to Submit, (Tomorrow) on OpenReview!
@clarelyle
Clare Lyle
2 years
RT @pcastr: 📢 Mixtures of Experts unlock parameter scaling for deep RL! Adding MoEs, and in particular Soft MoEs, to value-based deep RL ag…