reference
The paper 'Scaling laws for reward model overoptimization' was published in the Proceedings of the International Conference on Machine Learning, pp. 10835–10866.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)