measurement
The evaluation metrics are exact match (EM) on 'All', 'Has answer', and 'IDK', reported on the MNLI, SQuAD 2.0, and ACE-whQA datasets.
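As a rough illustration of how these three figures could be computed, the sketch below scores predictions with SQuAD-style exact match and reports it per subset. It rests on assumptions not stated in the source: 'Has answer' is read as the answerable questions, 'IDK' as the unanswerable ones, an empty gold list marks a question as unanswerable, and a model abstains by returning an empty string. The function names `normalize`, `exact_match`, and `em_by_subset` are hypothetical.

```python
import re
import string


def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """Return 1.0 if the prediction matches any gold answer after normalization, else 0.0."""
    if not gold_answers:
        # Assumption: an empty gold list means "unanswerable", and the
        # correct behaviour is to abstain with an empty prediction.
        return float(normalize(prediction) == "")
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))


def em_by_subset(examples):
    """examples: list of (prediction, gold_answers) pairs. Returns EM in percent per subset."""
    buckets = {"All": [], "Has answer": [], "IDK": []}
    for prediction, gold_answers in examples:
        score = exact_match(prediction, gold_answers)
        buckets["All"].append(score)
        # Assumption: answerable questions go to 'Has answer', the rest to 'IDK'.
        buckets["Has answer" if gold_answers else "IDK"].append(score)
    return {
        name: (100.0 * sum(scores) / len(scores) if scores else None)
        for name, scores in buckets.items()
    }
```

Calling `em_by_subset` on a list of `(prediction, gold_answers)` pairs returns a dictionary keyed by 'All', 'Has answer', and 'IDK'; a subset with no examples is reported as `None` rather than 0.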
Sources
- EdinburghNLP/awesome-hallucination-detection - GitHub (github.com)
Referenced by nodes (1)
- SQuAD concept