reference
The paper 'Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint' was published in the International Conference on Machine Learning, pages 54715–54754.
