Fact — claim — Knowledge Tree

Adding reward variability in reinforcement learning may reduce premature convergence and improve alignment with human intent.
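
As a rough illustration of one possible reading of "reward variability", the sketch below perturbs a base reward with zero-mean Gaussian noise. The source does not specify a mechanism, so the helper name `noisy_reward`, the Gaussian form, and the `sigma` value are all assumptions made here for illustration, not the method described in the survey.

```python
import random


def noisy_reward(base_reward: float, sigma: float, rng: random.Random) -> float:
    """Hypothetical helper: return base_reward plus zero-mean Gaussian noise.

    sigma controls how much the reward varies; sigma = 0 leaves it unchanged.
    """
    return base_reward + rng.gauss(0.0, sigma)


# Draw many perturbed rewards around an assumed base reward of 1.0.
rng = random.Random(0)
samples = [noisy_reward(1.0, 0.5, rng) for _ in range(10_000)]

# Because the noise is zero-mean, the average stays close to the base reward
# while individual rewards differ from step to step.
mean_reward = sum(samples) / len(samples)
```

Zero-mean noise is chosen so that the expected reward is unchanged and only its spread varies. Whether this reduces premature convergence in a given setting is the claim under discussion, not something this sketch demonstrates.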

Authors

Person: Not available
Organization: arXiv

Sources

A Survey of Incorporating Psychological Theories in LLMs (arXiv, arxiv.org)

Referenced by nodes (2)

reinforcement learning (concept)
AI alignment (concept)