Explore tweets tagged as #OverOptimization
@harshit_sikchi
Harshit Sikchi (at RLC 25)
1 year
Direct alignment algorithms (DAAs) are fast and have easy-to-tune hyperparameters, but they still suffer from a form of reward overoptimization*. We study this in detail 👇
Tweet media one
1
4
24
@MarioJooss
Mario Joos
7 months
Here's the truth about overoptimization on YouTube.
Tweet media one
9
14
158
@rm_rafailov
Rafael Rafailov @ NeurIPS
1 year
After the LLaMa 3.1 release and ICML, I want to highlight our paper "Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms". TL;DR: we explore the dynamics of over-optimization in DPO/IPO/SLiC and find similar "reward hacking" issues as in online RLHF.👇
Tweet media one
2
47
251
@otrasenda_AC
Oliver Lopez-Corona (Tecozcacuauhtli)
1 year
Overoptimization. Is there a hard limit?
Tweet media one
0
0
3
@arnavbathla20
Arnav Bathla
10 months
Overoptimization is just optimization. Micromanaging is just managing. Overreacting is just reacting.
Tweet media one
1
0
7
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
2 years
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF. abs: Proposes that the root cause of reward overfitting and overoptimization in RLHF is the inadequacy of the cross-entropy loss for long-tailed preference datasets. This
Tweet media one
0
26
170
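As a rough sense of what the "iterative data smoothing" in the title above could look like in code, here is a minimal, hedged sketch in Python, assuming the idea is to soften hard preference labels toward the reward model's own predicted preference probabilities between epochs. The toy linear model, the alpha mixing weight, and the update rule are illustrative assumptions, not the paper's exact algorithm.

import torch

# Toy setup (assumed): a linear reward model over fixed feature vectors.
torch.manual_seed(0)
features_chosen = torch.randn(64, 16)
features_rejected = torch.randn(64, 16)
reward_model = torch.nn.Linear(16, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

# Start from hard labels: "chosen" is preferred with probability 1.
soft_labels = torch.ones(64)
alpha = 0.1  # assumed mixing weight for the smoothing update

for epoch in range(5):
    # Bradley-Terry probability that "chosen" beats "rejected".
    p_chosen = torch.sigmoid(reward_model(features_chosen).squeeze(-1)
                             - reward_model(features_rejected).squeeze(-1))
    # Cross-entropy against the current (soft) labels instead of hard 0/1 labels.
    loss = -(soft_labels * torch.log(p_chosen + 1e-8)
             + (1 - soft_labels) * torch.log(1 - p_chosen + 1e-8)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Smooth the labels toward the model's own predictions (assumed update rule).
    with torch.no_grad():
        soft_labels = (1 - alpha) * soft_labels + alpha * p_chosen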
@nntaleb
Nassim Nicholas Taleb
7 months
The antifragility of a system comes from the mortality of its components; immortality blocks evolution. Work for the immortality of the collective. [On top of my disgust for non-stoical neurotic overoptimization]. h/t @Gregoresate
Tweet media one
@nntaleb
Nassim Nicholas Taleb
7 months
@bryan_johnson Looks like you didn't understand much from Skin in the Game. It states that we are not supposed to be immortal; only our genes. This is aside from, in my general work, the contempt, perhaps even disgust I have for your brand of non-stoical neurotic overoptimization.
112
211
2K
@papers_anon
PapersAnon
1 year
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-squared Preference Optimization. One-line change to DPO to implement the principle of pessimism to alleviate overoptimization. No models tested. Potential paper there. Links below
Tweet media one
1
1
11
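A hedged reading of the "one-line change" mentioned above, as a Python sketch: my recollection is that the change swaps the log-ratio link inside the DPO objective for phi(z) = z + log z, mixing chi-squared and KL regularization. Treat that exact form, and the names below, as assumptions rather than the paper's definitive recipe.

import torch
import torch.nn.functional as F

def dpo_style_loss(logratio_chosen, logratio_rejected, beta=0.1, chi2=False):
    """logratio_* = log pi(y|x) - log pi_ref(y|x), summed over response tokens.
    chi2=False gives the standard DPO loss; chi2=True applies the assumed
    'one-line change': replace the log-ratio link log(z) with z + log(z)."""
    if chi2:
        link_chosen = torch.exp(logratio_chosen) + logratio_chosen
        link_rejected = torch.exp(logratio_rejected) + logratio_rejected
    else:
        link_chosen, link_rejected = logratio_chosen, logratio_rejected
    return -F.logsigmoid(beta * (link_chosen - link_rejected)).mean()

# Same inputs, two losses: only the link function changes.
lr_c, lr_r = torch.tensor([0.3, -0.1]), torch.tensor([-0.2, -0.4])
print(dpo_style_loss(lr_c, lr_r), dpo_style_loss(lr_c, lr_r, chi2=True))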
@fly51fly
fly51fly
1 year
[LG] Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms. R Rafailov, Y Chittepu, R Park, H Sikchi. [Stanford University & UMass Amherst] (2024). - Direct Alignment Algorithms (DAAs) like Direct Preference Optimization have
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
10
34
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
2 years
Confronting Reward Model Overoptimization with Constrained RLHF. abs: Studies reward overoptimization for composite reward models and evaluates various constrained RLHF approaches that maximize reward scores until they reach "proxy points"
Tweet media one
2
11
73
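A toy Python illustration of the "proxy point" idea in the tweet above, under one reading: with a composite reward built from several component models, stop crediting a component once it passes the point beyond which the proxy is no longer trusted. The clamping-based combination and the names below are assumptions for illustration; the paper itself studies constrained-RLHF formulations rather than this simple cap.

def composite_reward(component_rewards, proxy_points, weights=None):
    """Combine component reward scores, capping each at its proxy point so the
    policy is not pushed past the region where the proxy stays trustworthy.
    component_rewards / proxy_points: dicts keyed by component name (assumed)."""
    weights = weights or {name: 1.0 for name in component_rewards}
    total = 0.0
    for name, r in component_rewards.items():
        capped = min(r, proxy_points[name])  # no credit beyond the proxy point
        total += weights[name] * capped
    return total

# Example: the "helpfulness" component has already hit its proxy point,
# so further gains on it no longer increase the training reward.
print(composite_reward({"helpfulness": 3.2, "safety": 0.4},
                       {"helpfulness": 2.5, "safety": 1.0}))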
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
2 years
Reward Model Ensembles Help Mitigate Overoptimization. abs: RLHF can struggle with overoptimization, where the policy gets better according to the learned reward model but its true reward is actually worse. Building off Gao et al. 2023, here it is
Tweet media one
Tweet media two
0
13
71
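One common way an ensemble is used for this, sketched in Python under assumptions: train several reward models and optimize a conservative aggregate, such as the ensemble minimum or the mean penalized by disagreement, so the policy cannot exploit one model's idiosyncratic errors. The aggregation modes and names below are illustrative, not necessarily the paper's exact objectives.

import torch

def conservative_ensemble_reward(rewards, mode="mean_minus_std", penalty=1.0):
    """rewards: tensor of shape (n_models, batch) with each member's score.
    Returns a per-example reward that is pessimistic where members disagree."""
    if mode == "min":
        return rewards.min(dim=0).values  # worst case over the ensemble
    if mode == "mean_minus_std":
        return rewards.mean(dim=0) - penalty * rewards.std(dim=0)
    raise ValueError(f"unknown mode: {mode}")

# Example: two members agree on the first sample and disagree on the second,
# so the second sample's reward is discounted.
scores = torch.tensor([[1.0, 2.0],
                       [1.1, 0.2]])
print(conservative_ensemble_reward(scores))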
@_akhaliq
AK
1 year
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms. Reinforcement Learning from Human Feedback (RLHF) has been crucial to the recent success of Large Language Models (LLMs); however, it is often a complex and brittle process.
Tweet media one
1
24
120
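For context on the functional forms these "scaling laws" titles echo, Gao et al. (2023) model the gold reward of best-of-n (bon) and RL-optimized policies as a function of the square-root KL distance from the initial policy, roughly (as recalled here, so treat as approximate):

d = \sqrt{ D_{\mathrm{KL}}\left( \pi \,\|\, \pi_{\mathrm{init}} \right) }, \qquad
R_{\mathrm{bon}}(d) = d \left( \alpha_{\mathrm{bon}} - \beta_{\mathrm{bon}}\, d \right), \qquad
R_{\mathrm{RL}}(d) = d \left( \alpha_{\mathrm{RL}} - \beta_{\mathrm{RL}} \log d \right),

with the alpha and beta coefficients fit per proxy reward model and policy size; the DAA work above asks how analogous curves behave for DPO-style objectives.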
@fly51fly
fly51fly
2 years
[LG] Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF. B Zhu, M I. Jordan, J Jiao [UC Berkeley] (2024). - The paper investigates issues of reward overfitting and overoptimization in reinforcement learning from human
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
5
13
@Abbasshaikh42
Abbas
7 months
“Looks like you didn't understand much from Skin in the Game. It states that we are not supposed to be immortal; only our genes. This is aside from, in my general work, the contempt, perhaps even disgust I have for your brand of non-stoical neurotic overoptimization.”
Tweet media one
0
0
5
@farairesearch
FAR.AI
8 months
Come chat with us about our AI Safety papers at #NeurIPS2024!
12/11: 💥 Catastrophic Goodhart: overoptimization in RLHF
12/12: ⚙️ Analysing the Generalisation and Reliability of Steering Vectors
12/12: 🌀 Hypothesis Testing the Circuit Hypothesis in LLMs
12/13: 🔬 InterpBench
🧵👇
1
1
5
@kalomaze
kalomaze
3 months
the overoptimization issues with RLHF/classifier-based RL in my case (for GRPO) seem completely mitigated by hard capping the reward once the preferred-response probability exceeds 50%. (I also multiply the values by 2, so 1.0 reward = "at least 50% or greater preference", 0.5 = "25% preference")
Tweet media one
3
1
66
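A tiny Python sketch of the capping scheme described above, as read here: take the classifier's probability that the response is preferred, hard-cap it at 0.5, then double it so 1.0 means "at least 50% preference" and 0.5 means "25% preference". Function and variable names are mine, not kalomaze's.

def capped_preference_reward(p_preferred: float) -> float:
    """Map a preference probability in [0, 1] to a reward in [0, 1] that
    saturates once the response is preferred at least 50% of the time."""
    return min(p_preferred, 0.5) * 2.0

# 0.25 -> 0.5, 0.5 -> 1.0, 0.9 -> 1.0 (no extra credit past 50% preference)
print(capped_preference_reward(0.25), capped_preference_reward(0.9))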
@mreliwjones
Mr. Eli W. Jones
7 months
there's a sort of instinctive revulsion some of us have to "neurotic overoptimization". it comes from past experiences of being swept up in groups bent on particular forms of overoptimization, and it having disastrous consequences to the group
Tweet media one
1
0
0
@thefrederikbaun
Frederik Baun
1 year
Here’s one thing Andrew Tate and Naval Ravikant have in common – and both get wrong. (And yes, it’s connected to this freshly baked bread.) They’re obsessed with “overoptimization.” Bropreneurs and internet gurus alike preach we should cut out mundane tasks, such as cooking
Tweet media one
1
0
1
@harshit_sikchi
Harshit Sikchi (at RLC 25)
11 months
Our cross-university(s) collaborative work on "Scaling laws for Reward Model Overoptimization in Direct Alignment Algorithms" is accepted at @NeurIPSConf!
@rm_rafailov
Rafael Rafailov @ NeurIPS
1 year
After the LLaMa 3.1 release and ICML, I want to highlight our paper "Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms". TL;DR: we explore the dynamics of over-optimization in DPO/IPO/SLiC and find similar "reward hacking" issues as in online RLHF.👇
Tweet media one
0
4
20