claim
Scalable oversight, as defined by Bowman et al. (2022), is the technical challenge of enabling relatively weak human supervisors to reliably evaluate and align AI systems that are far more capable and complex than the supervisors themselves.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- artificial intelligence concept