procedure
During training, large language models use teacher forcing: at each position, the model conditions the probability of the next token on the ground-truth preceding tokens from the training data rather than on its own earlier predictions.
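The difference between the two kinds of conditioning can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than part of the source: the toy bigram table `BIGRAM`, its made-up probabilities, and the helper names `next_token_probs`, `teacher_forced_nll`, and `free_running`. A real language model conditions on the whole prefix with a neural network; the lookup table here only stands in for that.

```python
import math

# Hypothetical toy "model": P(next token | last token).
# The probabilities are invented purely for illustration.
BIGRAM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.6, "dog": 0.4},
    "a":   {"cat": 0.3, "dog": 0.7},
    "cat": {"sat": 0.9, "</s>": 0.1},
    "dog": {"sat": 0.2, "</s>": 0.8},
    "sat": {"</s>": 1.0},
}

def next_token_probs(prefix):
    """Return the toy model's next-token distribution given a prefix."""
    return BIGRAM[prefix[-1]]

def teacher_forced_nll(target):
    """Sum of -log p(target[t] | target[:t]).

    The prefix fed to the model is always the ground-truth prefix,
    never the model's own earlier predictions.
    """
    nll = 0.0
    for t in range(1, len(target)):
        probs = next_token_probs(target[:t])  # ground-truth prefix
        nll -= math.log(probs[target[t]])
    return nll

def free_running(start, max_steps):
    """Greedy generation: each step conditions on the model's own outputs."""
    seq = [start]
    for _ in range(max_steps):
        probs = next_token_probs(seq)          # model-generated prefix
        seq.append(max(probs, key=probs.get))  # most likely next token
        if seq[-1] == "</s>":
            break
    return seq

target = ["<s>", "a", "dog", "</s>"]
loss = teacher_forced_nll(target)
generated = free_running("<s>", max_steps=5)
print(f"teacher-forced NLL: {loss:.4f}")
print("free-running output:", generated)
```

In this sketch the training loss for `target` is computed with the prefix `["<s>", "a"]` at the second step, even though the model's own greedy first choice would have been `"the"`. That is the sense in which the conditioning uses ground-truth tokens rather than predictions.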
Sources
- Hallucination Causes: Why Language Models Fabricate Facts (mbrenndoerfer.com)
Referenced by nodes (1)
- Large Language Models concept