claim
Many hallucination detection methods use ROUGE as their primary correctness metric, often through a threshold-based heuristic: a response whose ROUGE overlap with the reference answer falls below a cutoff is labeled as hallucinated.
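A minimal sketch of such a threshold heuristic, assuming ROUGE-L F1 over lowercased whitespace tokens and a cutoff of 0.3. The function names, the tokenization, and the cutoff value are illustrative assumptions, not taken from the cited source or from any particular detection method.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(response, reference):
    """ROUGE-L F1 between two strings (simplified: lowercase, split on whitespace)."""
    resp_tokens = response.lower().split()
    ref_tokens = reference.lower().split()
    if not resp_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(resp_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(resp_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def label_hallucinated(response, reference, threshold=0.3):
    """Hypothetical heuristic: low overlap with the reference means 'hallucinated'.

    The default threshold of 0.3 is an assumed value chosen for illustration.
    """
    return rouge_l_f1(response, reference) < threshold


if __name__ == "__main__":
    reference = "Paris"
    for response in ["Paris", "The capital is Berlin"]:
        score = rouge_l_f1(response, reference)
        print(response, round(score, 3), label_hallucinated(response, reference))
```

In this sketch the label depends only on token overlap with the reference, so the outcome is sensitive to both the chosen cutoff and the length of the response.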
Authors
Sources
- Re-evaluating Hallucination Detection in LLMs (arXiv, arxiv.org)
Referenced by nodes (2)
- hallucination detection concept
- ROUGE concept