reference
Evaluation metrics reported for search-and-retrieval, meeting summarisation, and automated clinical report generation datasets (MS MARCO, QMSum, and ACI-Bench, respectively) include ROUGE-L, BERTScore, BS-Fact, FactCC, DAE, and QuestEval.
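Of the metrics listed, ROUGE-L is the simplest to illustrate: it scores a candidate text against a reference by the length of their longest common subsequence (LCS) of tokens. The following is a minimal sketch, not the official ROUGE implementation. It assumes lowercased whitespace tokenisation and the balanced F1 variant; the function names `lcs_length` and `rouge_l_f1` are illustrative only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    # Rolling-row dynamic programme: prev[j] holds the LCS length of the
    # items of `a` processed so far against the first j items of `b`.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 under the assumptions stated above (not the official scorer)."""
    # Assumption: lowercase and split on whitespace; real toolkits may
    # also stem tokens or strip punctuation.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_l_f1(
        "the patient reports chest pain",
        "patient reports mild chest pain",
    )
    print(f"ROUGE-L F1: {score:.2f}")
```

Because the score depends only on token overlap, it cannot tell whether a summary is factually consistent with its source, which is presumably why it is listed alongside factuality-oriented metrics such as FactCC, DAE, and QuestEval.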
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (1)
- BERTScore concept