Relations (1)
related (2.81) — strongly supporting — 6 facts
Exposure bias is identified as a primary driver of hallucinations in large language models: teacher forcing creates a training-inference mismatch in which early generation errors cascade into subsequent tokens {fact:1, fact:6}. The effect worsens with longer sequence lengths [1], is counted among the four core causes of model hallucination [2], and contributes to an irreducible error floor (roughly 3%) that persists even at high training frequencies [3].
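The training-inference mismatch described above can be illustrated with a toy autoregressive model — here a hand-written bigram table, a hypothetical stand-in for a trained network, not any production API. Under teacher forcing the model always conditions on the ground-truth prefix, so its errors cannot propagate; at inference it conditions on its own outputs, so a single early sampling error shifts the context for every later token:

```python
# Toy illustration of exposure bias: a hypothetical bigram "model"
# (hand-written transition table, not a trained network).
bigram = {
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "dog": "barked",   # a different, self-reinforcing continuation path
    "barked": "loudly",
}

reference = ["the", "cat", "sat", "down"]

def teacher_forced(reference):
    """Training-style decoding: always condition on the gold prefix."""
    return [bigram[tok] for tok in reference[:-1]]

def free_running(first, steps, corrupt_step=None):
    """Inference-style decoding: condition on the model's own outputs.
    corrupt_step simulates one early sampling error."""
    out = [first]
    for i in range(steps):
        nxt = bigram.get(out[-1], "<unk>")
        if i == corrupt_step:
            nxt = "dog"  # a single early mistake...
        out.append(nxt)
    return out

print(teacher_forced(reference))               # per-step errors cannot cascade
print(free_running("the", 3))                  # follows the reference path
print(free_running("the", 3, corrupt_step=0))  # ...derails every later token
```

The corrupted run never recovers: after the first wrong token, every subsequent prediction is conditioned on a context the gold data never produced, which is the cascade the claim describes.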
Facts (6)
Sources
Hallucination Causes: Why Language Models Fabricate Facts — mbrenndoerfer.com (6 facts)
claim: Exposure bias causes hallucinations because teacher forcing creates a training-inference mismatch: the model is never trained to handle its own errors, so early mistakes in generation cascade across subsequent tokens.
claim: Exposure bias is a cause of hallucination in large language models that arises from a mismatch between training efficiency and inference realism.
claim: Exposure bias does not require the model to lack the correct answer; hallucinations arise because an error changes the input distribution, activating incorrect associations even when the model possesses reliable knowledge.
claim: The max_new_tokens parameter caps generated sequence length in large language models, and longer generations face higher cumulative exposure-bias divergence, so hallucination risk grows with sequence length.
claim: Large language model hallucinations are driven by the interaction of four causes: training data issues (noisy web data), knowledge gaps (questions about tail entities), completion pressure (generating confident-sounding responses), and exposure bias (early errors compounding in long-form answers).
claim: Large language models exhibit a 3% floor of irreducible hallucination even at high training frequencies, caused by exposure bias, completion pressure, and conflicting signals in training data.
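The length dependence claimed for max_new_tokens above can be sketched numerically: if each generated token independently stays on-distribution with probability p, the chance of an entirely error-free generation decays as p**n, so longer generations accumulate more exposure-bias risk. The per-token probability here is an illustrative assumption, not a measured value:

```python
# Compounding-risk sketch: assume each generated token stays
# on-distribution with (hypothetical) probability p, independently.
def error_free_probability(p_per_token, max_new_tokens):
    """Probability that no token in the generation goes off-distribution."""
    return p_per_token ** max_new_tokens

p = 0.99  # illustrative per-token reliability, not a measured value
for n in (16, 128, 1024):
    print(n, error_free_probability(p, n))
```

Even with 99% per-token reliability the probability of a fully on-distribution sequence falls sharply as max_new_tokens grows, which is the "cumulative divergence" the claim refers to; the independence assumption is a simplification, since in practice one error makes further errors more likely.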