Fact — measurement — Knowledge Tree

The GLM-4.5 model achieves a performance score of 54.35%, outperforming proprietary models such as Claude-4-Opus and Gemini-2.5-Flash.

Authors

Person: Not available
Organization: arXiv
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ...

Sources

A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... (arXiv, arxiv.org)

Referenced by nodes (1)

Gemini-1.5-Flash concept