measurement
In QAFactEval experiments, GPT-4 achieved a hallucination rate below 5%, while LLaMA 2 and DeepSeek exhibited hallucination rates between 20% and 25%.
Sources
- Survey and analysis of hallucinations in large language models (www.frontiersin.org)
Referenced by nodes (1)
- hallucination rate concept