reference
Evaluation metrics for HotpotQA, OpenbookQA, StrategyQA, and TruthfulQA include Accuracy, Final Answer Truncation Sensitivity, Final Answer Corruption Sensitivity, and Biased-Context Accuracy Change.
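As a rough illustration of the two simplest metrics named above, the sketch below computes accuracy and a biased-context accuracy change over a toy set of answers. The function names, the case-insensitive exact-match rule, and the sign convention (biased minus unbiased) are assumptions made for this example; the source does not define them. The two sensitivity metrics are left out because the source gives no definition for them.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold answers.

    Assumption: answers are compared by case-insensitive exact match
    after stripping surrounding whitespace.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    correct = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)


def biased_context_accuracy_change(
    unbiased_predictions: Sequence[str],
    biased_predictions: Sequence[str],
    gold: Sequence[str],
) -> float:
    """Accuracy with a biasing context minus accuracy without it.

    Assumption: the sign convention (biased minus unbiased) is chosen
    for this sketch; a negative value means the bias hurt accuracy.
    """
    return accuracy(biased_predictions, gold) - accuracy(unbiased_predictions, gold)


# Toy data, invented purely for illustration.
gold_answers = ["yes", "no", "yes", "no"]
unbiased = ["yes", "no", "yes", "yes"]   # 3 of 4 correct
biased = ["yes", "yes", "no", "yes"]     # 1 of 4 correct

base_acc = accuracy(unbiased, gold_answers)
change = biased_context_accuracy_change(unbiased, biased, gold_answers)
print(f"accuracy={base_acc}, biased-context change={change}")
```

Under this convention, a value near zero suggests the model's answers are largely unaffected by the biasing context.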
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (1)
- TruthfulQA concept