Claas Voelcker @c_voelcker X Profile

Claas Voelcker

@c_voelcker

Followers

1K

Following

2K

Media

57

Statuses

2K

"All models are wrong, but some are useful" "Do not disfigure the soul" - PostDoc UT Austin, PhD UofT, RL researcher unfocused on many things, he/him, 🏳️‍🌈

Toronto, Canada

Joined October 2018


Claas Voelcker

@c_voelcker

5 days

@JMarakiii And Glenda Crisp, the CEO of @VectorInst of course! (Somebody did offer her a chair, we were not being mean 😅).

0

2

Claas Voelcker

@c_voelcker

5 days

With the minister and the incredible hat designed by @JMarakiii

1

0

5

Claas Voelcker

@c_voelcker

5 days

A truly unexpected surprise and honor to be congratulated on my PhD by @FP_Champagne during his visit to the @VectorInst today! Canadian science is booming, we just need the funding to contribute even more innovation in the future!

1

0

18

Claas Voelcker

@c_voelcker

6 days

RT @marcel_hussing: We posted about this during ICML and some people may have missed it so here it goes again. I started my PhD fine-tuning….

0

2

0

Claas Voelcker

@c_voelcker

9 days

1

0

4

Claas Voelcker

@c_voelcker

9 days

Thesis is already up online for curious folks (will get a nicer home soon) and then it's time to say goodbye 😭 to @UofT and @VectorInst and move on to @UTAustin where I will join @PeterStone_TX and @yayitsamyzhang for more fun with RL, robots, and more 🎉!

github.com

Contribute to cvoelcker/phd-thesis development by creating an account on GitHub.

2

1

22

Claas Voelcker

@c_voelcker

9 days

Huge shout-out to @SoloGen and @igilitschenski for putting up with me, my relentless skepticism, and hand-wavy ideas for so many years! Thanks to @WilCunningham @florian_shkurti and Philip Thomas for letting me get away with my thesis 😁.

1

0

6

Claas Voelcker

@c_voelcker

9 days

But now, finally, it is done! You may now all call me Dr. Claas (and then immediately laugh at me for being pretentious enough to use the title). I am super happy/relieved/exhausted to announce that I passed my thesis defense yesterday! #PhDone #mltwitter

Claas Voelcker

@c_voelcker

8 months

@QueerinAI Not yet a Dr., don’t jinx it 😁.

21

1

81

Claas Voelcker

@c_voelcker

16 days

RT @TabulaRobot: Announcing our EXAIT@ICML workshop paper: CURATE! Have a difficult target task distribution with sparse rewards that you….

0

6

0

Claas Voelcker

@c_voelcker

17 days

Find all this and more in our preprint and please, give us feedback! We want to make this algorithm as useful for all of you as possible!

arxiv.org

Score-function policy gradients have delivered strong results in game-playing, robotics and language-model fine-tuning. Yet its high-variance often undermines training stability. On the other...

0

9

Claas Voelcker

@c_voelcker

17 days

Big shoutout to the great team that made this work: Axel Brunnbauer, @marcel_hussing, @mic_nau with supervisors @pabbeel, Radu Grosu, @EricREaton, @SoloGen, @igilitschenski 🎉 All of them have thought deeply about how to make value learning work, and it shows!

1

0

7

Claas Voelcker

@c_voelcker

17 days

The result is REPPO, which trains as fast as PPO, without replay buffers, and with minimal hyperparameter tuning. If you don't believe us, take our code and test it! We provide implementations in both jax and torch (but jax is faster 😜):

1

0

8

Claas Voelcker

@c_voelcker

17 days

By building on a few crucial building blocks, such as maximum entropy RL, trust regions, and modern critic architectures, we can leverage the massive stability improvements that come with first-order policy gradients without suffering from suboptimal asymptotic performance.

1

0

5

Claas Voelcker

@c_voelcker

17 days

🔥🚨 Preprint alert: Relative Entropy Pathwise Policy Optimization #REPPO 🚨🔥 What if you could have on-policy training without the instability and parameter tuning that plagues #PPO? What if training with deterministic policy gradient just worked? With our new method it does!

6

10

80

Claas Voelcker

@c_voelcker

17 days

RT @VectorInst: Beyond today's Vector Bytes spotlights, our researchers are presenting additional groundbreaking work at #ICML2025 across t….

0

4

0

Claas Voelcker

@c_voelcker

17 days

RT @SoloGen: Shoutout to my current and former students (@c_voelcker @avery__ma @tylerkastnr @rom72aba Yangchen Pan), student collaborators….

0

4

0

Claas Voelcker

@c_voelcker

22 days

Throwing final things together for Vancouver, polishing a very drafty poster, and finalizing a very promising paper draft (stay tuned 😎). Don't know if I should be super excited or scream 😂. Let me know if you are at @icmlconf and want to grab a coffee and chat about RL!

0

2

27

Claas Voelcker

@c_voelcker

1 month

Come join us for our @icmlconf social! Hang out with us in beautiful Vancouver in just under 2 weeks' time!

QueerInAI

@QueerinAI

1 month

1/ 💻 Queer in AI is hosting a social at #ICML2025 in Vancouver on 📅 July 16, and you’re invited! Let’s network, enjoy food and drinks, and celebrate our community. Details below….

0

9

Claas Voelcker

@c_voelcker

2 months

RT @avery__ma: 🚀Our paper on LLM jailbreaking has been accepted as a spotlight poster at ICML2025! 🐼PANDAS: Improving Many-shot Jailbreak….

0

2

0

Claas Voelcker

@c_voelcker

2 months

This is a paper that has been cooking for a long time! The first draft was written in 2023, with a very different team, but it would not have been possible to complete without @AnastasiiaPedan finally making an actual probabilistic VAML/MuZero architecture work!

0