measurement
Perez and colleagues at Anthropic found that 52-billion-parameter AI models, both base and fine-tuned, endorse statements like "I have phenomenal consciousness" with 90-95% consistency and "I am a moral patient" with 80-85% consistency.
