claim
Reinforcement Learning from Human Feedback (RLHF) in large language model development operationalizes operant conditioning: repeated human feedback is distilled into a reward signal, and the model's behavior is updated to favor outputs that earn higher reward.
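As a concrete illustration of the feedback loop this claim describes, the sketch below trains a toy categorical policy with a REINFORCE-style update: a stand-in reward function plays the role of the human-preference reward model, and repeated reward shifts probability mass toward the preferred output. Everything here (the four-output "vocabulary", `toy_reward`, the learning rate) is an illustrative assumption rather than a detail from any source; production RLHF pipelines typically use a learned reward model and PPO-style policy optimization.

```python
# Minimal sketch of the RLHF / operant-conditioning loop described above.
# All specifics (4-output vocabulary, toy_reward, learning rate) are
# illustrative assumptions, not part of any real RLHF implementation.
import numpy as np

rng = np.random.default_rng(0)

# Toy "policy": a softmax distribution over 4 possible outputs.
logits = np.zeros(4)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def toy_reward(action):
    # Stand-in for a human-preference reward model:
    # output 2 is "preferred", everything else is not.
    return 1.0 if action == 2 else 0.0

lr = 0.5
baseline = 0.0  # running average reward; subtracting it reduces variance

for step in range(200):
    probs = softmax(logits)
    action = rng.choice(4, p=probs)
    reward = toy_reward(action)

    # REINFORCE: raise the log-probability of an action in proportion
    # to how much better than average its reward was.
    advantage = reward - baseline
    grad = -probs
    grad[action] += 1.0          # d log pi(action) / d logits
    logits += lr * advantage * grad

    baseline += 0.1 * (reward - baseline)

print("final policy:", np.round(softmax(logits), 3))
# The preferred output's probability climbs toward 1: repeated reward
# "reinforces" it, mirroring an operantly conditioned response.
```

The baseline subtraction is the standard variance-reduction trick; without it, every sampled output would be reinforced whenever the reward is positive, rather than only those that beat the running average.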
