claim
The simple questions used in the benchmarks of Bordes et al. (2015) and Joshi et al. (2017) are typically short, open-ended queries with a single, verifiable answer. They require large language models to draw on internalized representations, but they fail to capture multiple elements of deeper knowledge.
Authors
Sources
- A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... arxiv.org via serper
Referenced by nodes (1)
- Large Language Models concept