reference
Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, and Ilija Bogunovic authored 'Group Robust Preference Optimization in Reward-free RLHF', published in Advances in Neural Information Processing Systems (NeurIPS) in 2024.
Sources
- A Survey of Incorporating Psychological Theories in LLMs (arXiv, arxiv.org)