reference
Evaluation metrics for AI systems include counts of correct and wrong answers, as well as failure counts categorized by comprehension, factualness, specificity, and inference.
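As a rough illustration, the tally described above could be computed as in the sketch below. The input format (a list of `(is_correct, failure_type)` pairs), the function name `tally_results`, and the lowercase category labels are assumptions made for this example; they are not taken from the cited source.

```python
# Failure categories named in the text; the exact label strings are assumed.
FAILURE_TYPES = ("comprehension", "factualness", "specificity", "inference")


def tally_results(results):
    """Count correct/wrong answers and wrong answers per failure type.

    `results` is a hypothetical iterable of (is_correct, failure_type) pairs,
    where failure_type is None for correct answers.
    """
    counts = {"correct": 0, "wrong": 0}
    failures = {name: 0 for name in FAILURE_TYPES}
    for is_correct, failure_type in results:
        if is_correct:
            counts["correct"] += 1
            continue
        counts["wrong"] += 1
        if failure_type not in failures:
            raise ValueError(f"unknown failure type: {failure_type!r}")
        failures[failure_type] += 1
    return counts, failures


if __name__ == "__main__":
    sample = [
        (True, None),
        (False, "factualness"),
        (False, "inference"),
        (False, "factualness"),
    ]
    counts, failures = tally_results(sample)
    print(counts)
    print(failures)
```

In this sketch every wrong answer is assigned exactly one failure type, so the per-type counts sum to the wrong-answer count. Whether the original metric allows multiple labels per answer is not stated in the text.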
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (2)
- inference concept
- factuality concept