Fact — measurement — Knowledge Tree

List-based questions on Wikidata and Wiki-Category List are evaluated with test precision and the average number of positive and negative hallucination entities; MultiSpanQA is evaluated with F1, precision, and recall; and long-form generation of biographies is evaluated with FactScore.
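As a rough illustration of the precision, recall, and F1 metrics named above, the sketch below scores a predicted list of answers against a gold list. It assumes exact string matching over de-duplicated answers, which is a simplification and not necessarily the matching rule used by the cited benchmarks; the function name `span_prf` and the example answers are hypothetical.

```python
def span_prf(predicted, gold):
    """Return (precision, recall, f1) for predicted vs. gold answers.

    Assumption: answers match only when the strings are identical,
    and duplicates are ignored by converting both lists to sets.
    """
    pred_set = set(predicted)
    gold_set = set(gold)
    true_positives = len(pred_set & gold_set)

    # Precision: fraction of predicted answers that are correct.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: fraction of gold answers that were predicted.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    # F1: harmonic mean of precision and recall (0.0 when both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: 2 of 3 predictions are correct, 2 of 4 gold answers found.
p, r, f1 = span_prf(["a", "b", "c"], ["a", "b", "d", "e"])
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Under the same assumption, the precision value alone corresponds to the share of listed entities that are correct, while the entities outside the gold set are the ones that would be counted as hallucinated.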

Authors

Person: Not available
Organization: GitHub
EdinburghNLP/awesome-hallucination-detection - GitHub

Sources

EdinburghNLP/awesome-hallucination-detection - GitHub (github.com)

Referenced by nodes (4)

Precision concept
recall concept
Wikidata entity
F1 concept