measurement
The KGHaluBench tri-stage fact verification pipeline achieved 87.74% alignment with human judgment in the validation study, 8.56 percentage points higher than the 79.18% achieved by the automated judge using GPT-3.5-Turbo.
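
The 8.56 figure is an absolute difference between the two alignment rates, not a relative gain. The short sketch below reproduces that arithmetic from the two reported percentages and also shows the corresponding relative gain for comparison. The function name `alignment_gap` is hypothetical and used only for illustration; the relative figure is derived here and is not reported in the source.

```python
def alignment_gap(pipeline_pct: float, baseline_pct: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain in percent)."""
    absolute = pipeline_pct - baseline_pct      # difference of the two rates
    relative = absolute / baseline_pct * 100.0  # gain as a share of the baseline
    return absolute, relative


# Reported alignment rates: tri-stage pipeline vs. GPT-3.5-Turbo judge.
absolute, relative = alignment_gap(87.74, 79.18)
print(f"absolute gap: {absolute:.2f} percentage points")
print(f"relative gain: {relative:.2f}%")
```

Under these inputs the absolute gap rounds to 8.56 percentage points, matching the reported value, while the relative gain over the baseline is about 10.81%.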
