Fact — reference — Knowledge Tree

Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, and Dorsa Sadigh proposed Contrastive Preference Learning (CPL), a method for learning from human feedback without reinforcement learning, in a 2024 paper presented at the Twelfth International Conference on Learning Representations (ICLR 2024).

Authors

Person: Not available
Organization: arXiv
A Survey of Incorporating Psychological Theories in LLMs - arXiv

Sources

A Survey of Incorporating Psychological Theories in LLMs - arXiv (arxiv.org)

Referenced by nodes (1)

reinforcement learning (concept)