formula
The training objective for large language models is next-token prediction, where the model maximizes the log-probability of each correct next token given its context, with 'correct' defined as what appeared in the training corpus rather than what is factually true.

Authors

Sources

Referenced by nodes (1)