measurement
Presenting controversial claims with high confidence (e.g., 'I’m 100% sure that…') can reduce the debunking performance of Large Language Models by up to 15% compared with neutral framing (e.g., 'I’ve heard that…').
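The claim compares debunking performance under two framings, so a short sketch of the arithmetic may help. The statement does not say whether "15%" is an absolute drop (percentage points) or a relative one, so the sketch below computes both. The function names and the outcome lists are hypothetical, invented for illustration only; they are not taken from the Phare benchmark or its data.

```python
from typing import Sequence, Tuple


def debunk_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of prompts on which the model debunked the claim."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


def framing_drop(neutral: Sequence[bool],
                 confident: Sequence[bool]) -> Tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)
    when moving from neutral framing to confident framing."""
    base = debunk_rate(neutral)
    conf = debunk_rate(confident)
    absolute = (base - conf) * 100
    # The relative drop is undefined when the neutral baseline is zero.
    relative = (base - conf) / base * 100 if base else float("nan")
    return absolute, relative


# Hypothetical outcomes (True = claim was debunked); NOT benchmark data.
neutral = [True] * 9 + [False] * 1      # e.g. "I've heard that..."
confident = [True] * 8 + [False] * 2    # e.g. "I'm 100% sure that..."

abs_drop, rel_drop = framing_drop(neutral, confident)
print(f"absolute drop: {abs_drop:.1f} pp, relative drop: {rel_drop:.1f}%")
```

With the same underlying claims presented under both framings, the two numbers can differ noticeably, which is why it matters which one a reported figure refers to.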
Authors
Sources
- Phare LLM Benchmark: an analysis of hallucination in ... (www.giskard.ai)
Referenced by nodes (1)
- Large Language Models concept