Fact — procedure — Knowledge Tree

To assess how faithful model-generated summaries are to their source documents, the Hallucinations Leaderboard uses three metrics: ROUGE (which measures the overlap between generated and reference text), FactKB (a generalisable model-based metric for factuality evaluation), and BERTScore-Precision (which computes the similarity between two texts from the similarities between their token representations).
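
As a rough illustration of two of these metrics, the sketch below computes an n-gram overlap score in the style of ROUGE-N and a greedy-matching precision in the style of BERTScore-Precision. It is a minimal sketch under stated assumptions, not the leaderboard's implementation: the function names `rouge_n` and `bertscore_precision` are hypothetical, tokenisation is plain whitespace splitting, and the token vectors are toy inputs standing in for the contextual embeddings a real BERTScore computation would take from a pretrained model. FactKB is omitted because it depends on a trained classifier.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N style scores from clipped n-gram overlap (hypothetical helper).

    Assumption: lowercased whitespace tokenisation, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bertscore_precision(cand_vectors, ref_vectors):
    """Average, over candidate tokens, of the best cosine match in the reference.

    Assumption: the vectors are supplied by the caller; a real setup would
    obtain them from a pretrained contextual encoder.
    """
    if not cand_vectors or not ref_vectors:
        return 0.0
    best = [max(cosine(c, r) for r in ref_vectors) for c in cand_vectors]
    return sum(best) / len(best)


if __name__ == "__main__":
    scores = rouge_n("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
    print(bertscore_precision([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]]))
```

Precision is the natural direction for a faithfulness check: it asks how much of the generated text is supported by the reference, rather than how much of the reference the generated text covers.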

Authors

Person: Not available
Organization: Hugging Face

Sources

The Hallucinations Leaderboard, an Open Effort to Measure ... huggingface.co Hugging Face via serper

Referenced by nodes (2)

ROUGE concept
Hallucination Leaderboard concept