claim
Most existing benchmarks for evaluating hallucination detection models focus on response-level evaluation.