Théo Vincent @RLC

@Theo_Vincent_

Followers: 309 · Following: 604 · Media: 52 · Statuses: 148

PhD student at @DFKI & @ias_tudarmstadt, working on RL 🤖 Previously a master's student at MVA @ENS_ParisSaclay & ENPC 🎓

Darmstadt, Germany
Joined February 2024
@Theo_Vincent_
Théo Vincent @RLC
1 month
A big limitation of pruning methods is that you must choose the number of parameters to remove before training starts. But how can you know how much to prune?🤷 At @RL_Conference, I will present Eau De Q-Network, the first RL method designed to DISCOVER the final sparsity level🔎
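(For context: a minimal sketch of the conventional setup this tweet criticizes, i.e. standard magnitude pruning where the sparsity level must be fixed up front. Illustrative PyTorch only, not Eau De Q-Network; all names and sizes are placeholders.)

```python
# Illustrative sketch of the limitation above: one-shot magnitude pruning,
# where `sparsity` must be picked before training. Eau De Q-Network instead
# discovers this level during training (not shown here).
import torch
import torch.nn as nn

def magnitude_prune(net: nn.Module, sparsity: float) -> None:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    all_weights = torch.cat([p.detach().abs().flatten() for p in net.parameters()])
    threshold = torch.quantile(all_weights, sparsity)
    with torch.no_grad():
        for p in net.parameters():
            p.mul_((p.abs() > threshold).float())

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
magnitude_prune(q_net, sparsity=0.9)  # 0.9 is a guess made before training starts
```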
@Theo_Vincent_
Théo Vincent @RLC
11 days
More details here:
@Theo_Vincent_
Théo Vincent @RLC
16 days
To increase reward propagation in value-based RL algorithms, it is tempting to reduce the target update period🤔 But this makes training unstable💔. At @RL_Conference, I will present i-QN, a new method that allows faster reward propagation while keeping stability⚡️ 👉🧵
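(For readers unfamiliar with the knob in question: a minimal DQN-style sketch of the standard target update period, not i-QN itself. Illustrative PyTorch with random data in place of a replay buffer; all names are placeholders.)

```python
# Sketch of the standard target-network mechanism discussed above (not i-QN).
# A small `target_update_period` propagates rewards faster but destabilizes
# training; a large one is stable but slow.
import copy
import torch
import torch.nn as nn

gamma, target_update_period = 0.99, 1_000
online = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target = copy.deepcopy(online)
optimizer = torch.optim.Adam(online.parameters(), lr=1e-4)

for step in range(5_000):
    # Dummy transitions; a real agent would sample these from a replay buffer.
    s, a = torch.randn(32, 4), torch.randint(0, 2, (32, 1))
    r, s_next = torch.randn(32), torch.randn(32, 4)
    with torch.no_grad():
        td_target = r + gamma * target(s_next).max(dim=1).values
    q_sa = online(s).gather(1, a).squeeze(1)
    loss = nn.functional.mse_loss(q_sa, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % target_update_period == 0:
        # Rewards only propagate one Bellman step further at each sync.
        target.load_state_dict(online.state_dict())
```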
@Theo_Vincent_
Théo Vincent @RLC
11 days
Today, I will be presenting iterated Q-Network at @RL_Conference. Feel free to come by poster #10! My talk will be between 11:45 AM and 12:30 PM in CCIS 1-430 (Track 1: RL algorithms, Deep RL).
@Theo_Vincent_
Théo Vincent @RLC
11 days
RT @pcastr: Great #runconference @RL_Conference today (even with a little rain), join for the last one tomorrow morning, 6:30am, meet at G….
@Theo_Vincent_
Théo Vincent @RLC
12 days
RT @Ayushj240: Honored that our @RL_Conference paper won the Outstanding Paper Award on Empirical Reinforcement Learning Research! 📜Mitiga…
@Theo_Vincent_
Théo Vincent @RLC
12 days
RT @pcastr: Very honoured that our paper was granted an outstanding paper award for scientific understanding in RL during the @RL_Conferenc….
@Theo_Vincent_
Théo Vincent @RLC
12 days
RT @GlenBerseth: @RL_Conference will be in Montréal next year at @UMontreal. We are looking forward to welcoming you all! Bienvenue! https:….
@Theo_Vincent_
Théo Vincent @RLC
14 days
Details are over here:
@Theo_Vincent_
Théo Vincent @RLC
15 days
Should we use a target network in deep value-based RL?🤔 The answer has always been YES or NO, as there are pros and cons. At @RLFrameWorkshop, I will present iS-QN, a method that sits in between these two options, collecting the pros while reducing the cons🚀
@Theo_Vincent_
Théo Vincent @RLC
14 days
I will be presenting iS-QN at the poster session of @RLFrameWorkshop today. Feel free to come and chat at poster 39!
@Theo_Vincent_
Théo Vincent @RLC
15 days
RT @MarlosCMachado: Here's what our group will be presenting at RLC'25. *Invited Talks at Workshops:* Tue 10:00: The Causal RL Workshop…
@Theo_Vincent_
Théo Vincent @RLC
15 days
Looking forward to @RL_Conference! I will be presenting 4 posters. Feel free to come and chat with me during the conference, at @RLFrameWorkshop, or at @ibrlworkshop🙂
@Theo_Vincent_
Théo Vincent @RLC
15 days
RT @johanobandoc: 🧩 Curious about the foundations of RL? Join us at the Finding the Frame Workshop @RL_Conference! A full day of talks, pa…
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 9/9. Many thanks to my co-authors: @YogeshTrip7354, Tim Faust, Yaniv Oren, @Jan_R_Peters, and @CarloDeramo, and to the funding agencies: @ias_tudarmstadt, @TUDarmstadt, @DFKI, @Hessian_AI, @infsys_uniwue.
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 8/9. Does it work in other settings? YES, we also report results:
- with the IMPALA architecture🦓
- on offline experiments✈️
- on continuous control experiments with the Simba architecture (only on the poster)🤖
📄👉
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 7/9. By forcing the network to learn multiple Bellman backups in parallel, iS-DQN with K>1 constructs richer features💪
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 6/9. By adding extra heads that learn the subsequent Bellman backups (iS-DQN with K>1), iS-QN improves performance without significantly increasing the memory footprint🚀 Note: we added layer normalization to further increase stability.
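(A rough sketch of my reading of this design: a shared torso with layer normalization and K + 1 linear heads. Illustrative PyTorch only, not the authors' implementation; names and sizes are placeholders.)

```python
# Rough sketch of a shared torso with several Q-heads and layer normalization,
# as described above (illustrative only; not the authors' code).
import torch
import torch.nn as nn

class MultiHeadQNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, k: int, hidden: int = 64):
        super().__init__()
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.LayerNorm(hidden),  # the layer normalization mentioned in the tweet
        )
        # K + 1 heads: head i approximates the i-th Bellman iterate.
        self.heads = nn.ModuleList(nn.Linear(hidden, n_actions) for _ in range(k + 1))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        features = self.torso(obs)
        return torch.stack([head(features) for head in self.heads], dim=1)  # (B, K+1, A)

net = MultiHeadQNetwork(obs_dim=4, n_actions=2, k=3)
print(net(torch.randn(32, 4)).shape)  # torch.Size([32, 4, 2])
```

Each extra head is a single linear layer, which is why the memory footprint grows only marginally compared with duplicating the whole network.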
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 5/9. Interestingly, the idea of sharing the last features (iS-DQN K=1) already reduces the performance gap between target-free DQN (TF-DQN) and target-based DQN (TB-DQN) on 15 Atari games by a large margin.
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 4/9. Then, we can draw on the target-based literature to enhance training stability. We enrich the classical TD loss with iterated Q-learning, learning consecutive Bellman backups to increase the feedback on the shared layers. This leads to the iterated Shared Q-Network (iS-QN).
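(A hedged sketch of such a loss over consecutive Bellman backups, where head k + 1 regresses onto the backup computed from head k. This is my reading of the tweet in illustrative PyTorch, not the paper's exact loss.)

```python
# Sketch of a TD loss over consecutive Bellman backups (my reading of the
# tweet, not the paper's exact loss). `q_online` holds the K+1 heads' values
# for the current states, `q_target` their values for the next states,
# both shaped (batch, K+1, actions).
import torch
import torch.nn.functional as F

def iterated_td_loss(q_online: torch.Tensor, q_target: torch.Tensor,
                     actions: torch.Tensor, rewards: torch.Tensor,
                     gamma: float = 0.99) -> torch.Tensor:
    batch, k_plus_1, _ = q_online.shape
    idx = actions.view(batch, 1, 1).expand(batch, k_plus_1, 1)
    q_sa = q_online.gather(2, idx).squeeze(2)                                # (batch, K+1)
    with torch.no_grad():
        backups = rewards.unsqueeze(1) + gamma * q_target.max(dim=2).values  # (batch, K+1)
    # Head k+1 regresses onto the Bellman backup computed from head k, so each
    # gradient step sends K learning signals through the shared layers.
    return F.mse_loss(q_sa[:, 1:], backups[:, :-1])

# Dummy usage with random head outputs:
loss = iterated_td_loss(torch.randn(32, 4, 2), torch.randn(32, 4, 2),
                        torch.randint(0, 2, (32,)), torch.randn(32))
```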
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 3/9. Our main idea is to use a copy of the online network's last linear layer as the target network and to share the remaining features with the online network. This drastically reduces the memory footprint, because only the last linear layer is stored as a copy.
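(A minimal sketch of this idea in illustrative PyTorch: only the last linear layer is duplicated and frozen, and the target values reuse the online torso's features. Module names are mine, not the authors'.)

```python
# Minimal sketch of the shared-torso idea above (illustrative, not the
# authors' code): only the last linear layer is copied as the "target",
# while its input features come from the online torso.
import copy
import torch
import torch.nn as nn

hidden, n_actions = 64, 2
torso = nn.Sequential(nn.Linear(4, hidden), nn.ReLU())   # shared features
online_head = nn.Linear(hidden, n_actions)
target_head = copy.deepcopy(online_head)                  # only this layer is duplicated
for p in target_head.parameters():
    p.requires_grad_(False)

def td_target(rewards, next_obs, gamma=0.99):
    with torch.no_grad():
        # Target values reuse the *online* torso; only the head is an old copy.
        return rewards + gamma * target_head(torso(next_obs)).max(dim=1).values

print(td_target(torch.randn(32), torch.randn(32, 4)).shape)  # torch.Size([32])

online_params = sum(p.numel() for p in torso.parameters()) \
              + sum(p.numel() for p in online_head.parameters())
copy_params = sum(p.numel() for p in target_head.parameters())
print(f"extra memory for the target: {copy_params} parameters "
      f"vs {online_params} for the online network")
```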
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 2/9. Many recent works have shown that removing the target network leads to a performance decrease📉 Even methods that were initially introduced without a target network benefit from its reintegration📈
@Theo_Vincent_
Théo Vincent @RLC
15 days
@RLFrameWorkshop 1/9. With function approximation, bootstrapping without using a target network often leads to training instabilities. However, using a target network slows down reward propagation and doubles the memory footprint dedicated to Q-networks.
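(To make the trade-off concrete, a tiny illustrative PyTorch sketch of the two standard TD targets the thread contrasts; names are placeholders.)

```python
# The two standard TD targets contrasted above (illustrative sketch only).
import copy
import torch
import torch.nn as nn

gamma = 0.99
online_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = copy.deepcopy(online_net)  # the full copy that doubles the memory footprint

rewards, next_obs = torch.randn(32), torch.randn(32, 4)
with torch.no_grad():
    # Target-free: fast reward propagation, but bootstrapping on a moving target.
    target_free = rewards + gamma * online_net(next_obs).max(dim=1).values
    # Target-based: stable, but propagation waits for the next target sync.
    target_based = rewards + gamma * target_net(next_obs).max(dim=1).values
```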