Fact — reference — Knowledge Tree

Yuchun Miao, Sen Zhang, Liang Ding, Rong Bao, Lefei Zhang, and Dacheng Tao introduced InfoRM, an information-theoretic reward modeling approach designed to mitigate reward hacking in reinforcement learning from human feedback (RLHF). The paper was presented in 2024 at the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).

Authors

Person: Not available
Organization: arXiv
A Survey of Incorporating Psychological Theories in LLMs - arXiv

Sources

A Survey of Incorporating Psychological Theories in LLMs - arXiv (arxiv.org)

Referenced by nodes (1)

Reinforcement learning from human feedback (RLHF) (concept)