measurement
Faithfulness of predicted responses to the ground-truth knowledge is evaluated with four metrics: Critic, Q², BERT F1, and F1. The datasets used include Wizard-of-Wikipedia (WoW), the DSTC9 and DSTC11 extensions of MultiWoZ 2.1, and FaithDial.
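As an illustration of the simplest of these metrics, the sketch below computes a token-overlap F1 between a response and a knowledge snippet. It is a minimal example under stated assumptions, not the implementation used by the source: the function name `token_f1`, the lowercasing, and the whitespace tokenization are all hypothetical choices, and published evaluations may normalize text differently (for example by stripping punctuation or articles).

```python
from collections import Counter


def token_f1(response: str, knowledge: str) -> float:
    """Token-overlap F1 between a predicted response and ground-truth knowledge.

    Assumption: lowercase + whitespace tokenization (illustrative only).
    """
    resp_tokens = response.lower().split()
    know_tokens = knowledge.lower().split()
    if not resp_tokens or not know_tokens:
        return 0.0

    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(resp_tokens) & Counter(know_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(resp_tokens)  # share of response tokens found in the knowledge
    recall = overlap / len(know_tokens)     # share of knowledge tokens covered by the response
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("the cat", "the cat sat on mat"))
```

A response that copies the knowledge verbatim scores 1.0 and one that shares no tokens scores 0.0. Because the measure is purely lexical, it is typically reported alongside model-based metrics such as Critic, Q², and BERT F1 rather than on its own.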
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub)
Referenced by nodes (2)
- F1 concept
- faithfulness concept