Relations (1)

related 2.32 — strongly supporting 3 facts

Hallucination detection is a critical evaluation task within Question Answering, as evidenced by benchmarks such as RAGTruth [1] and by analyses of metric performance in Question Answering contexts [2], [3], and [4].

Facts (3)

Sources
Re-evaluating Hallucination Detection in LLMs (arxiv.org · arXiv) · 1 fact
Claim: The authors of the paper 'Re-evaluating Hallucination Detection in LLMs' demonstrate that prevailing overlap-based metrics systematically overestimate hallucination detection performance in Question Answering tasks, leading to illusory progress in the field.
EdinburghNLP/awesome-hallucination-detection (github.com · GitHub) · 1 fact
Claim: ROUGE-based evaluation systematically overestimates hallucination detection performance in Question Answering tasks.
Detecting hallucinations with LLM-as-a-judge: Prompt ... (datadoghq.com · Aritra Biswas, Noé Vernier · Datadog) · 1 fact
Reference: RAGTruth is a human-labeled benchmark for hallucination detection that covers three tasks: question answering, summarization, and data-to-text writing.