Claas Voelcker

@c_voelcker

Followers
1K
Following
2K
Media
57
Statuses
2K

"All models are wrong, but some are useful" "Do not disfigure the soul" - PostDoc UT Austin, PhD UofT, RL researcher unfocused on many things, he/him, 🏳️‍🌈

Toronto, Canada
Joined October 2018
@c_voelcker
Claas Voelcker
5 days
@JMarakiii And Glenda Crisp, the CEO of @VectorInst of course! (Somebody did offer her a chair, we were not being mean 😅).
0
0
2
@c_voelcker
Claas Voelcker
5 days
With the minister and the incredible hat designed by @JMarakiii
1
0
5
@c_voelcker
Claas Voelcker
5 days
A truly unexpected surprise and honor to be congratulated on my PhD by @FP_Champagne during his visit to the @VectorInst today! Canadian science is booming, we just need the funding to contribute even more innovation in the future!
1
0
18
@c_voelcker
Claas Voelcker
6 days
RT @marcel_hussing: We posted about this during ICML and some people may have missed it so here it goes again. I started my PhD fine-tuning…
0
2
0
@c_voelcker
Claas Voelcker
9 days
1
0
4
@c_voelcker
Claas Voelcker
9 days
Thesis is already up online for curious folks (will get a nicer home soon) and then it's time to say goodbye 😭 to @UofT and @VectorInst and move on to @UTAustin where I will join @PeterStone_TX and @yayitsamyzhang for more fun with RL, robots, and more 🎉!
github.com
Contribute to cvoelcker/phd-thesis development by creating an account on GitHub.
2
1
22
@c_voelcker
Claas Voelcker
9 days
Huge shout-out to @SoloGen and @igilitschenski for putting up with me, my relentless skepticism, and hand-wavy ideas for so many years! Thanks to @WilCunningham, @florian_shkurti, and Philip Thomas for letting me get away with my thesis 😁
1
0
6
@c_voelcker
Claas Voelcker
9 days
But now, finally, it is done! You may now all call me Dr. Claas (and then immediately laugh at me for being pretentious enough to use the title). I am super happy/relieved/exhausted to announce that I passed my thesis defense yesterday! #PhDone #mltwitter
@c_voelcker
Claas Voelcker
8 months
@QueerinAI Not yet a Dr., don’t jinx it 😁
21
1
81
@c_voelcker
Claas Voelcker
16 days
RT @TabulaRobot: Announcing our EXAIT@ICML workshop paper: CURATE! Have a difficult target task distribution with sparse rewards that you…
0
6
0
@c_voelcker
Claas Voelcker
17 days
Find all this and more in our preprint, and please give us feedback! We want to make this algorithm as useful for all of you as possible!
arxiv.org
Score-function policy gradients have delivered strong results in game-playing, robotics and language-model fine-tuning. Yet its high-variance often undermines training stability. On the other...
0
0
9
@c_voelcker
Claas Voelcker
17 days
Big shoutout to the great team that made this work: Axel Brunnbauer, @marcel_hussing, @mic_nau, with supervisors @pabbeel, Radu Grosu, @EricREaton, @SoloGen, @igilitschenski 🎉 All of them have thought deeply about how to make value learning work, and it shows!
1
0
7
@c_voelcker
Claas Voelcker
17 days
The result is REPPO, which trains as fast as PPO, without replay buffers, and with minimal hyperparameter tuning. If you don't believe us, take our code and test it! We provide implementations in both jax and torch (but jax is faster 😜):
1
0
8
@c_voelcker
Claas Voelcker
17 days
By combining a few crucial building blocks, such as maximum entropy RL, trust regions, and modern critic architectures, we can leverage the massive stability improvements that come with first-order policy gradients without suffering from suboptimal asymptotic performance.
1
0
5
@c_voelcker
Claas Voelcker
17 days
🔥🚨 Preprint alert: Relative Entropy Pathwise Policy Optimization #REPPO 🚨🔥 What if you could have on-policy training without the instability and parameter tuning that plague #PPO? What if training with deterministic policy gradients just worked? With our new method it does!
6
10
80
@c_voelcker
Claas Voelcker
17 days
RT @VectorInst: Beyond today's Vector Bytes spotlights, our researchers are presenting additional groundbreaking work at #ICML2025 across t…
0
4
0
@c_voelcker
Claas Voelcker
17 days
RT @SoloGen: Shoutout to my current and former students (@c_voelcker @avery__ma @tylerkastnr @rom72aba Yangchen Pan), student collaborators…
0
4
0
@c_voelcker
Claas Voelcker
22 days
Throwing final things together for Vancouver, polishing a very drafty poster, and finalizing a very promising paper draft (stay tuned 😎). Don't know if I should be super excited or scream 😂. Let me know if you are at @icmlconf and want to grab a coffee and chat about RL!
0
2
27
@c_voelcker
Claas Voelcker
1 month
Come join us for our @icmlconf social! Hang out with us in beautiful Vancouver in just under 2 weeks' time!
@QueerinAI
QueerInAI
1 month
1/ 💻 Queer in AI is hosting a social at #ICML2025 in Vancouver on 📅 July 16, and you’re invited! Let’s network, enjoy food and drinks, and celebrate our community. Details below…
0
0
9
@c_voelcker
Claas Voelcker
2 months
RT @avery__ma: 🚀Our paper on LLM jailbreaking has been accepted as a spotlight poster at ICML2025! 🐼PANDAS: Improving Many-shot Jailbreak…
0
2
0
@c_voelcker
Claas Voelcker
2 months
This is a paper that has been cooking for a long time! The first draft was written in 2023, with a very different team, but it would not have been possible to complete without @AnastasiiaPedan finally making an actual probabilistic VAML/MuZero architecture work!
0
0
0