reference
Foundational frameworks by Christiano et al. (2017), Sutton & Barto (2018), and Stiennon et al. (2022) established methods for explicitly translating human judgments into reward signals, operationalizing the central insight of Operant Conditioning: behavior is shaped by its consequences.
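As a rough illustration of what "translating human judgments into reward signals" can look like, the sketch below fits a scalar reward function to pairwise human comparisons using a Bradley-Terry style logistic loss, in the spirit of preference-based reward modeling. It is not the implementation from any of the cited works: the linear reward, the feature vectors, and the names `preference_prob`, `preference_loss`, `fit_linear_reward`, and `reward` are all assumptions made for this example.

```python
import math


def preference_prob(r_a: float, r_b: float) -> float:
    """Bradley-Terry probability that item A is preferred over item B,
    given their scalar rewards."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))


def preference_loss(r_preferred: float, r_rejected: float) -> float:
    """Negative log-likelihood of the observed human choice."""
    return -math.log(preference_prob(r_preferred, r_rejected))


def reward(w, x):
    """Hypothetical linear reward: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_linear_reward(pairs, dim, lr=0.5, epochs=200):
    """Fit weights so that reward(preferred) > reward(rejected).

    pairs: list of (preferred_features, rejected_features) tuples,
           each a list of `dim` floats (illustrative data, not real labels).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for good, bad in pairs:
            diff = [g - b for g, b in zip(good, bad)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of -log(sigmoid(margin)) w.r.t. w is
            # -(1 - sigmoid(margin)) * diff, so we step along +diff.
            step = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w


if __name__ == "__main__":
    # Two toy comparisons: the first feature vector in each pair was "preferred".
    toy_pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.9, 0.2], [0.1, 0.8])]
    weights = fit_linear_reward(toy_pairs, dim=2)
    print(reward(weights, [1.0, 0.0]) > reward(weights, [0.0, 1.0]))
```

Under these assumptions, the fitted reward plays the role of the reinforcer: outputs that humans preferred receive a higher scalar value, which a downstream reinforcement learning step could then use to make those outputs more likely.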
Sources
- A Survey of Incorporating Psychological Theories in LLMs (arXiv, arxiv.org)
Referenced by nodes (1)
- Operant Conditioning concept