measurement
The 'Monitoring Decoding' framework utilizes Exact Match (TriviaQA, NQ-Open), Truth/Info/Truth×Info scores (TruthfulQA), Accuracy (GSM8K), Latency (ms/token), and Throughput (token/s) as evaluation metrics.
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection - GitHub github.com via serper
Referenced by nodes (6)
- accuracy concept
- TruthfulQA concept
- TriviaQA concept
- Exact Match concept
- latency concept
- NQ-Open concept