reference
The MultiHal benchmark is a factual language modeling benchmark that extends previous benchmarks such as Shroom2024, HaluEval, HaluBench, TruthfulQA, Felm, Defan, and SimpleQA by mining relevant knowledge graph paths from Wikidata.
