Fact — formula — Knowledge Tree

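When a large language model is trained with teacher forcing, it computes at each position t the probability of the ground-truth token conditioned on the ground-truth prefix, denoted P(y*_t | y*_<t). Here y*_t is the ground-truth token at position t and y*_<t represents the ground-truth tokens at all prior positions. The model is conditioned on the reference prefix rather than on its own earlier predictions.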
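As a rough illustration of the quantity described above, the following minimal Python sketch computes the probability of each ground-truth token given the ground-truth prefix and averages the negative log-probabilities into a loss. It is not taken from the source: `toy_model`, its three-token vocabulary, and the helper names are hypothetical stand-ins for a real language model, and the averaged negative log-likelihood is the standard cross-entropy objective, assumed here rather than stated in the fact.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(prefix):
    """Hypothetical stand-in for an LLM over a 3-token vocabulary.

    It favours token (last token + 1) mod 3, or token 0 for an empty prefix.
    """
    vocab_size = 3
    logits = [0.0] * vocab_size
    favoured = (prefix[-1] + 1) % vocab_size if prefix else 0
    logits[favoured] = 2.0
    return logits


def teacher_forced_probs(model, target):
    """Return the probability of each ground-truth token given the ground-truth prefix."""
    probs = []
    for t, gold in enumerate(target):
        prefix = target[:t]  # always the reference prefix, never the model's own samples
        dist = softmax(model(prefix))
        probs.append(dist[gold])
    return probs


target = [0, 1, 2, 0]  # assumed ground-truth token ids
probs = teacher_forced_probs(toy_model, target)

# Average negative log-likelihood over positions (standard cross-entropy, assumed).
loss = -sum(math.log(p) for p in probs) / len(probs)
print(probs, loss)
```

The point of the sketch is the line `prefix = target[:t]`: at every position the model sees the reference tokens, so an error it would have made at an earlier step never feeds into a later one during training.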

Authors

Person: M. Brenndoerfer
Organization: mbrenndoerfer.com
Hallucination Causes: Why Language Models Fabricate Facts

Sources

Hallucination Causes: Why Language Models Fabricate Facts, M. Brenndoerfer, mbrenndoerfer.com (via serper)

Referenced by nodes (1)

Large Language Models concept