measurement
Widely documented phenomena, such as major historical events, famous figures, popular programming languages, and capital cities, appear in billions of training tokens across diverse contexts, whereas obscure entities like small companies, local politicians, minor historical figures, and niche scientific subfields appear in only tens or hundreds of tokens.
Sources
- Hallucination Causes: Why Language Models Fabricate Facts (mbrenndoerfer.com)
Referenced by nodes (1)
- Large Language Models concept