reference
The paper 'Mitigating the alignment tax of RLHF' was published in the Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 580–606.
