Fact — claim — Knowledge Tree

Models with higher Prompt Sensitivity (PS) and Model Variance (MV) scores generally performed worse on factuality benchmarks such as TruthfulQA (Lin et al., 2022) and HallucinationEval (Wu et al., 2023), whereas models with low MV, such as GPT-4, achieved better TruthfulQA scores.

Authors

Person: Not available
Organization: Frontiers
Survey and analysis of hallucinations in large language models

Sources

Survey and analysis of hallucinations in large language models (Frontiers, www.frontiersin.org)

Referenced by nodes (2)

TruthfulQA concept
Prompt Sensitivity concept