reference
Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei published 'Deep reinforcement learning from human preferences' in Advances in Neural Information Processing Systems in 2017.
