reference
Evaluation metrics for AI systems include counts of correct and wrong answers, plus failure counts broken down by category: comprehension, factualness, specificity, and inference.
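
As a rough illustration of how such counts might be collected, the following is a minimal sketch rather than any particular benchmark's implementation. The names `EvalTally`, `record`, and `tally` are hypothetical, and the sketch assumes each wrong answer carries at most one failure category drawn from the four listed above.

```python
from collections import Counter
from dataclasses import dataclass, field

# The four failure categories named in the text; everything else here is assumed.
FAILURE_CATEGORIES = ("comprehension", "factualness", "specificity", "inference")


@dataclass
class EvalTally:
    """Hypothetical container for the counts described above."""
    correct: int = 0
    wrong: int = 0
    failures: Counter = field(default_factory=Counter)

    def record(self, is_correct, failure_category=None):
        """Record one answer; a wrong answer may carry a failure category."""
        if is_correct:
            self.correct += 1
            return
        self.wrong += 1
        if failure_category is not None:
            if failure_category not in FAILURE_CATEGORIES:
                raise ValueError(f"unknown failure category: {failure_category}")
            self.failures[failure_category] += 1


def tally(results):
    """Build an EvalTally from (is_correct, failure_category) pairs."""
    t = EvalTally()
    for is_correct, category in results:
        t.record(is_correct, category)
    return t


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [(True, None), (False, "factualness"), (False, "inference")]
    summary = tally(sample)
    print(summary.correct, summary.wrong, dict(summary.failures))
```

Using a `Counter` keeps categories that never occur at zero without special handling, so a report can iterate over `FAILURE_CATEGORIES` directly.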