Clare Lyle

@clarelyle

3K Followers · 1K Following · 24 Media · 207 Statuses

Did RL back when it wasn’t cool. Currently at Google DeepMind, formerly @OATML_Oxford.

Oxford
Joined October 2012
@clarelyle
Clare Lyle
3 days
It also works well to close the generalization gap in warm-starting settings, again provided the learning rate increase is large enough to affect various measures of feature-learning in the network.
@clarelyle
Clare Lyle
3 days
We show that this trick can basically induce grokking on demand whenever you apply it while training on modular arithmetic datasets, provided you increase the learning rate enough to observe nontrivial feature-learning metrics.
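For readers unfamiliar with the setup, grokking experiments typically use a small modular-arithmetic dataset along the lines of the sketch below (hypothetical values; the exact task, modulus, and train/test split used in the paper may differ):

```python
import numpy as np

# Hypothetical modular-addition task of the kind used in grokking studies:
# inputs are pairs (a, b), labels are (a + b) mod p, split randomly into
# train and test sets. Networks typically fit the training pairs quickly
# and only later "grok", i.e. generalize to the held-out pairs.
p = 97                                     # assumed modulus, for illustration
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
labels = (pairs[:, 0] + pairs[:, 1]) % p

rng = np.random.default_rng(0)
perm = rng.permutation(len(pairs))
n_train = len(pairs) // 2                  # assumed 50/50 split
train_x, train_y = pairs[perm[:n_train]], labels[perm[:n_train]]
test_x, test_y = pairs[perm[n_train:]], labels[perm[n_train:]]
```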
@clarelyle
Clare Lyle
3 days
Since several works have already shown that setting a sufficiently high learning rate is critical for feature-learning, we investigated this question with a trivially simple approach: just re-warm the (effective) learning rate when you want to learn new features.
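A minimal sketch of what "re-warm the (effective) learning rate" might look like in a standard training loop (hypothetical PyTorch code; the paper's exact schedule, trigger, and notion of effective learning rate are not reproduced here):

```python
import torch

def rewarm(optimizer, peak_lr):
    """Bump every param group's learning rate back up to a high peak value."""
    for group in optimizer.param_groups:
        group["lr"] = peak_lr

model = torch.nn.Linear(32, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)

for step in range(2_000):
    if step == 1_000:
        # Distribution shift (new task / new data): re-warm the learning rate
        # and restart the decay schedule instead of continuing at the small
        # late-training value. The tweet refers to the *effective* learning
        # rate, which for adaptive optimizers also depends on optimizer state;
        # this sketch only resets the nominal value.
        rewarm(optimizer, peak_lr=1e-3)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)
    x = torch.randn(64, 32)                       # stand-in batch
    y = torch.randint(0, 10, (64,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```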
@clarelyle
Clare Lyle
3 days
By contrast, when a network groks it *succeeds* in overwriting bad randomly initialized features with ones that generalize well. This raises an obvious question: can we use grokking to test whether a method succeeds at feature-learning, then apply it to mitigate primacy bias?
@clarelyle
Clare Lyle
3 days
A common issue in continual/nonstationary learning problems is that features learned on early data can interfere with learning on later data, resulting in worse generalization as the network fails to overwrite the bad learned features.
@clarelyle
Clare Lyle
3 days
What do grokking and plasticity have in common? We show in our @CoLLAs_Conf paper that the same underlying mechanisms that facilitate grokking can help to mitigate primacy bias in nonstationary settings. In other words, if you can grok, you can continually learn!
@clarelyle
Clare Lyle
3 days
Messed up copy-pasting Twitter handles and forgot to mention @axlewandowski!
@clarelyle
Clare Lyle
4 days
🚨 New paper alert! 🚨 Rafał Surdej 🎤, @m_bortkiewicz, and @MatOstasze managed to get trainable activation functions to work in nonstationary learning problems, and they actually help 🤩
@MatOstasze
Mateusz Ostaszewski
4 days
🚀 Excited to announce our paper "Balancing Expressivity and Robustness: Constrained Rational Activations for RL" will be an *oral* at #CoLLAs2025! We study how trainable rational activations boost expressivity in RL but can also harm stability:
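For context, a generic trainable rational activation (without the constraints studied in the paper) looks roughly like the hypothetical PyTorch module below; the paper's constrained formulation is not reproduced here:

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Generic trainable rational activation f(x) = P(x) / Q(x).

    Numerator and denominator coefficients are learned. The denominator is
    written as 1 + |b_1 x + ... + b_n x^n| so it cannot reach zero (the common
    Pade-style safeguard). Illustrative sketch only, not the constrained
    variant from the paper.
    """

    def __init__(self, num_degree: int = 5, den_degree: int = 4):
        super().__init__()
        self.a = nn.Parameter(0.1 * torch.randn(num_degree + 1))  # numerator coefficients
        self.b = nn.Parameter(0.1 * torch.randn(den_degree))      # denominator coefficients

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        num_powers = torch.stack([x ** i for i in range(self.a.numel())], dim=-1)
        den_powers = torch.stack([x ** (i + 1) for i in range(self.b.numel())], dim=-1)
        numerator = (num_powers * self.a).sum(dim=-1)
        denominator = 1.0 + (den_powers * self.b).sum(dim=-1).abs()
        return numerator / denominator
```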
@clarelyle
Clare Lyle
6 days
In particular, it has a bunch of really insightful experiments and a clean way of doing resets that can be helpful for improving generalization and diagnosing whether plasticity is a bottleneck for your agents. Paper link:
openreview.net
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. For example, most RL algorithms collect new data throughout training, using a non-stationary behaviour...
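A rough illustration of the reset-style diagnostic being described (a generic sketch, not the specific procedure from Igl et al.): re-initialize the network partway through training and check whether the fresh network, trained on the same data stream, matches or beats the original.

```python
import torch

def reset_parameters(model: torch.nn.Module) -> None:
    """Re-initialize every submodule that exposes reset_parameters().

    If a freshly reset network (optionally distilled from the old one) ends up
    generalizing better than the network trained end-to-end, that is evidence
    that plasticity loss was a bottleneck. Illustrative sketch only.
    """
    for module in model.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()   # re-initialize weights in place
```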
@clarelyle
Clare Lyle
6 days
PSA: if you work on plasticity loss you should read "Transient Non-stationarity and Generalisation in Deep Reinforcement Learning" by Igl et al. It's super relevant but suffers from an unfortunate lack of SEO due to predating the "plasticity loss" nomenclature.
@clarelyle
Clare Lyle
3 months
RT @MarlosCMachado: 📢 I'm very excited to release AgarCL, a new evaluation platform for research in continual reinforcement learning‼️ Rep…
@clarelyle
Clare Lyle
5 months
Wow did not expect such an enthusiastic response! Unfortunately I've received way too many emails to reply to everyone individually but really appreciate everyone who's taken the time to send in an application.
@clarelyle
Clare Lyle
5 months
📣📣 My team at Google DeepMind is hiring a student researcher for summer/fall 2025 in Seattle! If you're a PhD student interested in getting deep RL to (finally) work reliably in interesting domains, apply at the link below and reach out to me via email so I know you applied 👇
@clarelyle
Clare Lyle
1 year
RT @khimya: Presented "Disentangling the Causes of Plasticity Loss in Neural Networks" today @CoLLAs_Conf work led by @clarelyle with Zeyu….
drive.google.com
@clarelyle
Clare Lyle
1 year
RT @_rockt: I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation….
@clarelyle
Clare Lyle
2 years
Violets are blue,
Roses are red,
Missed ICML?
Try CoLLAs instead :)
@CoLLAs_Conf
CoLLAs 2025
2 years
Roses are Red,
Violets are Blue;
Don’t Forget to Submit,
(Tomorrow) on OpenReview!
@clarelyle
Clare Lyle
2 years
RT @CoLLAs_Conf: Roses are Red, Violets are Blue; Don’t Forget to Submit, (Tomorrow) on OpenReview!
@clarelyle
Clare Lyle
2 years
RT @pcastr: 📢 Mixtures of Experts unlock parameter scaling for deep RL! Adding MoEs, and in particular Soft MoEs, to value-based deep RL ag…