Fact — measurement — Knowledge Tree

At a threshold of 0.700, the KGHaluBench entity-level filter achieved 5.65% higher alignment with human judgment and 48.78% higher recall than an automated judge based on GPT-3.5-Turbo.
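
As a rough illustration of how figures of this kind can be computed, the sketch below thresholds per-response scores at 0.700 and compares the resulting decisions with human labels. It is not the benchmark's implementation: the function names, the assumption that scores lie in [0, 1] and pass when they reach the threshold, the use of simple agreement rate as "alignment", and the toy data are all hypothetical.

```python
from typing import List, Sequence

THRESHOLD = 0.700  # threshold value quoted in the fact above


def filter_decisions(scores: Sequence[float], threshold: float = THRESHOLD) -> List[bool]:
    """Hypothetical filter: a response passes when its score reaches the threshold."""
    return [score >= threshold for score in scores]


def alignment(decisions: Sequence[bool], human: Sequence[bool]) -> float:
    """Assumed definition: fraction of items where the filter agrees with the human label."""
    return sum(d == h for d, h in zip(decisions, human)) / len(human)


def recall(decisions: Sequence[bool], human: Sequence[bool]) -> float:
    """Fraction of human-positive items that the filter also marks positive."""
    true_positives = sum(d and h for d, h in zip(decisions, human))
    positives = sum(human)
    return true_positives / positives if positives else 0.0


# Toy data, invented for illustration only.
scores = [0.91, 0.42, 0.70, 0.65, 0.88]
human_labels = [True, False, True, True, False]

decisions = filter_decisions(scores)
print(alignment(decisions, human_labels), recall(decisions, human_labels))
```

Running the same two metric functions on an automated judge's decisions over the same items would give the baseline against which the filter is compared.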

Authors

Person: Not available
Organization: arXiv
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ...

Sources

A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... (arXiv, arxiv.org)

Referenced by nodes (2)

KGHaluBench concept
entity-level filtering concept