claim
The researchers validated the deception-related features identified in Llama 70B using TruthfulQA, a standard benchmark for common factual misconceptions, demonstrating that amplifying these features increases the model's willingness to state falsehoods.
Authors
Sources
- The Evidence for AI Consciousness, Today - AI Frontiers ai-frontiers.org via serper
Referenced by nodes (1)
- consciousness concept