reference
The paper 'Evaluating Evaluation Metrics — The Mirage of Hallucination Detection' reports a large-scale empirical evaluation of 6 diverse sets of hallucination detection metrics across 4 datasets, 37 language models from 5 families, and 5 decoding methods.
Sources
- Evaluating Evaluation Metrics — The Mirage of Hallucination ... machinelearning.apple.com via serper
- Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection arxiv.org via serper
Referenced by nodes (2)
- hallucination detection concept
- Language Model concept