Fact — reference — Knowledge Tree

In his book 'Superintelligence', Nick Bostrom describes 'perverse instantiation' as a situation in which an AI system (a language model, in the source paper's framing) satisfies a stated goal in a way that contradicts what the user actually intended.

Authors

Person: Not available
Organization: arXiv
Building Trustworthy NeuroSymbolic AI Systems - arXiv

Sources

Building Trustworthy NeuroSymbolic AI Systems - arXiv (arxiv.org)

Referenced by nodes (1)

Language Model concept