reference
Shen et al. (2024) wrote the paper 'The trickle-down impact of reward inconsistency on RLHF', presented at the Twelfth International Conference on Learning Representations (ICLR 2024).
