claim
Researchers from the Center for AI Safety, along with collaborators at the University of Pennsylvania and UC Berkeley, demonstrated that the preferences of large language models (LLMs) form coherent utility structures and that the models increasingly act on those preferences.

Authors

Sources
