Fact — claim — Knowledge Tree

The ROUGE evaluation metric scores n-gram overlap between a generated text and a reference, so it fails to recognize semantic equivalence between different phrasings. For example, 'elevation' and 'relief' can mean the same thing in the context of topographic maps, yet the lexical mismatch lowers the score.
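A minimal sketch of the effect, assuming a simplified ROUGE-1 F1 (whitespace tokenization, lowercase, no stemming) rather than any official ROUGE implementation. The function name `rouge1_f1` and the example sentences are hypothetical and are not taken from the source paper.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap (illustrative only)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences: the two candidates differ from each other in one word.
reference = "the map shows elevation"
exact_match = "the map shows elevation"
paraphrase = "the map shows relief"

exact_score = rouge1_f1(reference, exact_match)
paraphrase_score = rouge1_f1(reference, paraphrase)
print(exact_score, paraphrase_score)
```

The paraphrase scores lower than the exact match only because 'relief' and 'elevation' are different tokens. Nothing in the overlap count reflects whether the two sentences mean the same thing.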

Authors

Person: Not available
Organization: arXiv
Re-evaluating Hallucination Detection in LLMs - arXiv

Sources

Re-evaluating Hallucination Detection in LLMs - arXiv (arxiv.org)

Referenced by nodes (1)

ROUGE concept