Fact — reference — Knowledge Tree

The Q² metric evaluates factual consistency in knowledge-grounded dialogue. It is compared against the following baselines: token-level F1 overlap, precision and recall, Q² without NLI (Q² w/o NLI), end-to-end NLI (E2E NLI), Overlap, BERTScore, and BLEU. The comparison uses the Wizard of Wikipedia (WoW), Topical-Chat, and Dialogue NLI datasets.
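To make the token-level overlap baselines concrete, here is a minimal sketch of how precision, recall, and F1 could be computed between a dialogue response and its grounding knowledge. The function name `token_prf` and the lowercase whitespace tokenization are assumptions made for illustration; the source does not specify the tokenization or normalization used in the actual comparison.

```python
from collections import Counter
from typing import Tuple


def token_prf(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over overlapping tokens.

    Hypothetical helper: tokens are lowercased and split on whitespace,
    which is an assumption and not necessarily the evaluated setup.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0, 0.0, 0.0

    # Multiset intersection counts each shared token at most as often
    # as it appears in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0

    precision = overlap / len(cand_tokens)  # share of candidate tokens found in reference
    recall = overlap / len(ref_tokens)      # share of reference tokens found in candidate
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    response = "the cat sat"
    knowledge = "the cat sat on the mat"
    print(token_prf(response, knowledge))
```

Surface-overlap scores of this kind reward shared wording rather than agreement in meaning, which is the usual reason they serve as baselines when a factual consistency metric is evaluated.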

Authors

Person: Not available
Organization: GitHub
EdinburghNLP/awesome-hallucination-detection - GitHub

Sources

EdinburghNLP/awesome-hallucination-detection - GitHub (github.com)

Referenced by nodes (6)

BERTScore concept
Precision concept
Recall concept
BLEU concept
factual consistency evaluation concept
F1 concept