measurement
Proprietary models evaluated in KGHaluBench were more factual than open-source models, with an average Weighted Accuracy of 55.94% versus 48.32%, a gap of 7.62 percentage points.
Sources
- A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... (arxiv.org)
Referenced by nodes (1)
- KGHaluBench concept