reference
Evaluation metrics for AI systems that generate text with citations include fluency (measured by MAUVE), correctness (measured by EM recall on ASQA, recall-5 on QAMPARI, and claim recall on ELI5), and citation quality (measured by citation recall and citation precision).
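The two string-matching correctness metrics named above can be illustrated with a minimal sketch. It assumes that EM recall is the fraction of gold short answers found as normalized substrings of the model output, and that recall-5 counts as 1.0 once at least five gold answers are found. The function names `normalize`, `em_recall`, and `recall_5` are hypothetical, and the substring matching is a simplification rather than the benchmark's official scoring code.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def count_hits(output: str, gold_answers: list[str]) -> int:
    """Count gold answers whose normalized form is a substring of the output."""
    norm_output = normalize(output)
    hits = 0
    for answer in gold_answers:
        norm_answer = normalize(answer)
        # Skip answers that normalize to the empty string (e.g. "the"),
        # since "" would trivially match any output.
        if norm_answer and norm_answer in norm_output:
            hits += 1
    return hits


def em_recall(output: str, gold_answers: list[str]) -> float:
    """Assumed EM recall: share of gold answers matched in the output."""
    if not gold_answers:
        return 0.0
    return count_hits(output, gold_answers) / len(gold_answers)


def recall_5(output: str, gold_answers: list[str]) -> float:
    """Assumed recall-5: recall that saturates at 1.0 after five matches."""
    if not gold_answers:
        return 0.0
    target = min(5, len(gold_answers))
    return min(1.0, count_hits(output, gold_answers) / target)


if __name__ == "__main__":
    print(em_recall("The capital is Paris.", ["Paris", "London"]))
    cities = ["rome", "oslo", "lima", "cairo", "quito", "doha", "riga"]
    print(recall_5("Rome, Oslo, Lima, Cairo and Quito.", cities))
```

Claim recall, citation recall, and citation precision are not sketched here because they depend on an entailment judgment (typically from a model) rather than on string matching.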
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (1)
- ELI concept