claim
The weighted factual accuracy of web training data is driven down by the high volume of low-accuracy source types, such as SEO content and social media, despite the presence of high-accuracy curated sources like Wikipedia and academic papers.
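The arithmetic behind this claim can be sketched as a volume-weighted average. The sketch below is illustrative only: the source names, corpus shares, and accuracy figures in `SOURCE_MIX` are hypothetical assumptions chosen to show the effect, not measurements from the cited source, and `weighted_accuracy` and `unweighted_accuracy` are helper names introduced here.

```python
# Hypothetical source mix: name -> (share of corpus volume, assumed factual accuracy).
# All numbers are illustrative assumptions, not measured values.
SOURCE_MIX = {
    "wikipedia": (0.05, 0.95),
    "academic_papers": (0.05, 0.90),
    "seo_content": (0.50, 0.60),
    "social_media": (0.40, 0.50),
}


def weighted_accuracy(mix):
    """Average accuracy where each source counts in proportion to its volume."""
    total_share = sum(share for share, _ in mix.values())
    return sum(share * acc for share, acc in mix.values()) / total_share


def unweighted_accuracy(mix):
    """Plain average over source types, ignoring how much text each contributes."""
    return sum(acc for _, acc in mix.values()) / len(mix)


if __name__ == "__main__":
    print(f"volume-weighted: {weighted_accuracy(SOURCE_MIX):.4f}")
    print(f"unweighted:      {unweighted_accuracy(SOURCE_MIX):.4f}")
```

Under these assumed numbers the volume-weighted figure comes out near 0.59 while the plain average over source types is near 0.74: the high-volume, low-accuracy sources dominate the weighted result even though the curated sources score well individually.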
Sources
- "Hallucination Causes: Why Language Models Fabricate Facts", mbrenndoerfer.com
Referenced by nodes (2)
- Wikipedia entity
- social media concept