measurement
On the TruthfulQA benchmark, GPT-4's hallucination rate was approximately 15% lower than that of LLaMA 2.
Sources
- Survey and analysis of hallucinations in large language models (www.frontiersin.org)
Referenced by nodes (1)
- TruthfulQA concept