claim
Recent developments in reinforcement learning from human feedback (RLHF) include incorporating human cognitive biases (Siththaranjan et al., 2024) and personalizing reward functions to individual users' values (Poddar et al., 2024).
