claim
Benchmarks for large language models that test only high-frequency factual questions fail to reveal tail-entity hallucination, and benchmarks that test only short responses fail to reveal the accumulation of exposure bias over longer generations.
Authors
Sources
- Hallucination Causes: Why Language Models Fabricate Facts (mbrenndoerfer.com)
Referenced by nodes (3)
- Large Language Models concept
- exposure bias concept
- benchmarks concept