claim
Training-data issues cause hallucinations: web corpora contain factual errors, misinformation, and knowledge imbalances, and the next-token prediction objective cannot distinguish any of these from accurate content. As a result, the model learns errors with the same confidence as truths.
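
The mechanism in this claim can be illustrated with a minimal sketch. It is not taken from any source behind this claim. It assumes a toy count-based model that predicts only the final token of each sentence, and a hypothetical four-sentence corpus in which a false statement outnumbers the true one. The function name `train_next_token_model` and the corpus are illustrative assumptions. The training step only counts what follows each context, so nothing in it can tell the true continuation from the false one.

```python
from collections import Counter, defaultdict


def train_next_token_model(corpus):
    """Maximum-likelihood estimate of P(last token | preceding tokens).

    This is a deliberately simplified stand-in for next-token prediction:
    it only counts continuations and has no notion of factual accuracy.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        context, target = " ".join(tokens[:-1]), tokens[-1]
        counts[context][target] += 1
    # Normalize counts into conditional probabilities per context.
    return {
        context: {tok: n / sum(c.values()) for tok, n in c.items()}
        for context, c in counts.items()
    }


# Hypothetical toy corpus (an assumption for illustration): the false
# statement appears three times, the true statement once.
corpus = (
    ["the capital of australia is sydney"] * 3
    + ["the capital of australia is canberra"] * 1
)

model = train_next_token_model(corpus)
probs = model["the capital of australia is"]
prediction = max(probs, key=probs.get)

print(probs)
print(prediction)
```

Under these assumptions the model prefers the more frequent continuation, even though it is wrong, because frequency in the corpus is the only signal the objective sees.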
