Daniel Palenicek

@DPalenicek

Followers
268
Following
207
Media
11
Statuses
49

PhD Researcher in Robot #ReinforcementLearning 🤖🧠 at @ias_tudarmstadt and @Hessian_AI advised by @Jan_R_Peters. Former intern: @Bosch_AI and @Huawei R&D UK

Darmstadt, Germany
Joined July 2020
@DPalenicek
Daniel Palenicek
1 year
Super excited that CrossQ got accepted at @iclr_conf! 🎉 We show how to effectively use #BatchNorm in #RL, yielding SOTA sample efficiency while staying as computationally efficient as SAC! This is joint work with @aditya_bhatt 🧵 #ICLR2024
6
12
65
@DPalenicek
Daniel Palenicek
1 month
@JoeMWatson @Jan_R_Peters @Hessian_AI @ias_tudarmstadt @CS_TUDarmstadt @DFKI If you're working on RL stability, plasticity, or sample efficiency, this might be relevant for you. We'd love to hear your thoughts and feedback! Come talk to us at RLDM in June in Dublin.
0
1
3
@DPalenicek
Daniel Palenicek
1 month
1
1
2
@DPalenicek
Daniel Palenicek
1 month
📚 TL;DR: We combine BN + WN in CrossQ for stable high-UTD training and SOTA performance on challenging RL benchmarks. No need for network resets, no critic ensembles, no other tricks. Simple regularization, big gains.
1
0
0
@DPalenicek
Daniel Palenicek
1 month
0
1
2
@DPalenicek
Daniel Palenicek
1 month
👋 We'd love to hear your thoughts and feedback! Come talk to us at RLDM in June in Dublin. If you're working on RL stability, plasticity, or sample efficiency, this might be relevant for you.
1
0
1
@DPalenicek
Daniel Palenicek
1 month
⚖️ Simpler ≠ Weaker: Compared to SOTA baselines like BRO, our method:
✅ Needs 90% fewer parameters (~600K vs. 5M)
✅ Avoids parameter resets
✅ Scales stably with compute
We also compare favorably to the concurrent SIMBA algorithm. No tricks, just principled normalization. ✨
1
0
0
@DPalenicek
Daniel Palenicek
1 month
🔬 The Result: CrossQ + WN scales reliably with increasing UTD—no more resets, no critic ensembles, no other tricks. We match or outperform SOTA on 25 continuous control tasks from DeepMind Control Suite & MyoSuite, including dog 🐕 and humanoid 🧍‍♂️ tasks, across UTDs.
1
0
0
@DPalenicek
Daniel Palenicek
1 month
➡️ With growing weight norm, the effective learning rate decreases, and learning slows down or stops. 💡 Solution: After each gradient update, we rescale parameters to the unit sphere, preserving plasticity and keeping the effective learning rate stable.
1
0
0
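The rescaling step described in the tweet above can be sketched as a post-update projection. This is a minimal NumPy illustration under my own assumptions, not the authors' implementation; the function and parameter names are hypothetical:

```python
import numpy as np

def project_to_unit_sphere(params):
    """Rescale each weight matrix to unit Frobenius norm after a gradient step.

    For scale-invariant (BatchNorm-regularized) layers this keeps the
    effective learning rate stable, as described in the thread.
    """
    return {name: w / np.linalg.norm(w) for name, w in params.items()}

# toy example: two weight matrices whose norms have drifted
params = {"layer1": np.random.randn(4, 4) * 10.0,
          "layer2": np.random.randn(4, 2) * 0.1}
params = project_to_unit_sphere(params)
print([round(float(np.linalg.norm(w)), 6) for w in params.values()])
# each weight matrix now has unit norm
```

In a real training loop this projection would run once after every optimizer step, only on the scale-invariant layers.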
@DPalenicek
Daniel Palenicek
1 month
🧠 Key Idea: BN improves sample efficiency, but fails to scale reliably with complex tasks and high UTDs due to growing weight norms. However, BN-regularized networks are scale-invariant w.r.t. their weights, while the gradient scales inversely proportionally to the weight norm (Van Laarhoven, 2017).
1
0
0
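The scale-invariance property mentioned in the tweet above is easy to check numerically. A minimal NumPy sketch (my own illustration, not from the paper): rescaling the pre-BN weights leaves the normalized output unchanged, which is why only the effective learning rate, not the function, changes as weight norms grow.

```python
import numpy as np

def batchnorm(z, eps=1e-8):
    # normalize each feature over the batch dimension (no learned affine)
    return (z - z.mean(0)) / (z.std(0) + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))   # a batch of inputs
W = rng.normal(size=(8, 4))    # pre-BN weight matrix

out = batchnorm(x @ W)
out_scaled = batchnorm(x @ (5.0 * W))  # scaling W by 5 cancels in the normalization
print(np.allclose(out, out_scaled))    # True
```

Since the output is invariant to the weight scale, the gradient w.r.t. `W` shrinks as `W` grows, which is the effective-learning-rate decay the following tweet addresses.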
@DPalenicek
Daniel Palenicek
1 month
🔍 Background: Off-policy RL methods like CrossQ (Bhatt* & Palenicek* et al. 2024) are sample-efficient but struggle to scale to high update-to-data (UTD) ratios. We identify why scaling CrossQ fails—and fix it with a surprisingly effective tweak: Weight Normalization (WN). 🏋️
1
0
0
@DPalenicek
Daniel Palenicek
1 month
🚀 New preprint "Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization" 🤖 We propose CrossQ+WN, a simple yet powerful off-policy RL algorithm with greater sample efficiency and scalability to higher update-to-data ratios. 🧵 #RL @ias_tudarmstadt
1
8
51
@DPalenicek
Daniel Palenicek
1 month
RT @NicoBohlinger: ⚡️ Can one policy control 1000 different robots? 🤖. We explore Embodiment Scaling Laws: Training on more diverse robot e….
0
7
0
@DPalenicek
Daniel Palenicek
2 months
RT @onclk_: I am happy to share that our work ‘DIME: Diffusion-Based Maximum Entropy Reinforcement Learning’ has been accepted to ICML 2025….
0
6
0
@DPalenicek
Daniel Palenicek
4 months
Check out our latest work, where we train an omnidirectional locomotion policy directly on a real quadruped robot in just a few minutes, based on our CrossQ RL algorithm 🚀. Shoutout to @NicoBohlinger, Jonathan Kinzel and @MabRobotics. @ias_tudarmstadt @CS_TUDarmstadt @Hessian_AI
@NicoBohlinger
Nico Bohlinger
4 months
⚡️ Do you think training robot locomotion needs large-scale simulation? Think again! Our new paper shows how to train an omnidirectional locomotion policy directly on a real quadruped robot in just a few minutes 🚀. Top speeds of 0.85 m/s, two different control approaches, indoor
0
2
10
@DPalenicek
Daniel Palenicek
4 months
RT @aditya_bhatt: So I made an ultra low-cost (~$50) exosuit for humanoid teleoperation. ⏱️ Low latency streaming. 🦾 Low gear ratio r….
0
44
0
@DPalenicek
Daniel Palenicek
8 months
RT @theo_gruner: We present our preliminary results on “Analysing the Interplay of Vision and Touch for Dexterous Insertion Tasks” tomorrow….
0
6
0
@DPalenicek
Daniel Palenicek
1 year
If you want to learn about #CrossQ, come to our poster session 298 at #ICLR2024, happening right now. Here with @aditya_bhatt and @_bbelousov. @Hessian_AI @ias_tudarmstadt @CS_TUDarmstadt @iclr_conf
0
5
27
@DPalenicek
Daniel Palenicek
1 year
RT @aditya_bhatt: Introducing CrossQ, just published at #ICLR2024! 🎉 CrossQ achieves: 🔥 Very fast off-policy Deep RL 📈 with SOTA sample-ef….
0
16
0
@DPalenicek
Daniel Palenicek
1 year
RT @Hessian_AI: 🚀 Meet the future of AI: @DPalenicek and @theo_grune82772, PhD students at @hessian_ai & @TUDarmstadt, shaping breakthrough….
0
3
0
@DPalenicek
Daniel Palenicek
1 year
Thanks for having us, this was a lot of fun! ☺️
@hbouammar
Haitham Bou Ammar
1 year
Today in our Mate with a fantastic researcher, we had @DPalenicek and Aditya Bhatt teaching us all about their excellent new #ICLR2024 spotlight paper! CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity! Guess what? You
0
0
4