claim
KGHaluBench statistically estimates the difficulty of each question, aggregates these estimates across the assessment, and scales accuracy accordingly so that evaluation remains reliable.
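The claim above does not spell out the formulas involved, so the following is only a minimal sketch of what difficulty-scaled accuracy could look like. Everything in it is an assumption made for illustration and not KGHaluBench's actual procedure: the difficulty estimator (the fraction of reference responses that got a question wrong), the weighting scheme, the `floor` parameter, the function names, and the toy data.

```python
from statistics import mean


def estimate_difficulty(outcomes_by_question):
    """Hypothetical estimator: difficulty = fraction of reference responses that were wrong.

    outcomes_by_question maps a question id to a list of 0/1 correctness flags.
    """
    return {q: 1.0 - mean(flags) for q, flags in outcomes_by_question.items()}


def difficulty_scaled_accuracy(model_correct, difficulty, floor=0.1):
    """Difficulty-weighted accuracy (assumed scheme): harder questions carry more weight.

    `floor` keeps very easy questions from having zero weight.
    """
    weights = {q: max(difficulty[q], floor) for q in model_correct}
    total = sum(weights.values())
    earned = sum(weights[q] for q, ok in model_correct.items() if ok)
    return earned / total


# Toy data, invented for this example only.
reference_outcomes = {
    "q1": [1, 1, 1, 0],  # mostly answered correctly, so treated as easy
    "q2": [1, 0, 0, 0],  # mostly missed, so treated as hard
    "q3": [0, 0, 0, 0],  # always missed, so treated as hardest
}
difficulty = estimate_difficulty(reference_outcomes)

# Aggregate difficulty over the whole assessment.
mean_difficulty = mean(difficulty.values())

# The evaluated model answers q1 and q2 correctly and misses q3.
model_correct = {"q1": True, "q2": True, "q3": False}
raw_accuracy = mean(1.0 if ok else 0.0 for ok in model_correct.values())
scaled_accuracy = difficulty_scaled_accuracy(model_correct, difficulty)

print(f"mean difficulty: {mean_difficulty:.3f}")
print(f"raw accuracy:    {raw_accuracy:.3f}")
print(f"scaled accuracy: {scaled_accuracy:.3f}")
```

In this toy setup the scaled score falls below the raw score because the one missed question is also the hardest one, which is the kind of adjustment the claim describes in general terms.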
Authors
Sources
- A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... (arxiv.org)
Referenced by nodes (1)
- KGHaluBench concept