reference
The paper "Mitigating the alignment tax of RLHF" was published in the Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), pp. 580–606.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- RLHF concept