reference
In the paper 'Evaluating Evaluation Metrics — The Mirage of Hallucination Detection', the authors conducted a large-scale empirical evaluation of 6 diverse sets of hallucination detection metrics across 4 datasets, 37 language models from 5 families, and 5 decoding methods.
