Fact — claim — Knowledge Tree

The design and selection of deep learning model architectures are influenced by both the latent characteristics of the training data and the training paradigm, such as next-token prediction (NTP) or masked language modeling (MLM).
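
As a rough illustration of how the two paradigms named in the claim differ in the supervision they provide, the following Python sketch builds input/target pairs for each. The function names `ntp_pairs` and `mlm_pairs`, the `[MASK]` placeholder string, and the 15% default masking rate are assumptions chosen for illustration, not details taken from the survey, and the masking is deliberately simplified (every selected token is replaced by `[MASK]`).

```python
import random

MASK = "[MASK]"   # hypothetical placeholder token, chosen for illustration
IGNORE = None     # marks positions that would contribute no loss


def ntp_pairs(tokens):
    """Next-token prediction: position i is trained to predict token i + 1.

    Example: ["the", "cat", "sat"] gives
    inputs ["the", "cat"] and targets ["cat", "sat"].
    """
    return tokens[:-1], tokens[1:]


def mlm_pairs(tokens, mask_prob=0.15, seed=0):
    """Masked language modeling (simplified): hide a random subset of tokens.

    Masked positions keep the original token as the target; all other
    positions get IGNORE, meaning no prediction is scored there.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(IGNORE)
    return inputs, targets
```

In this sketch the NTP targets depend only on tokens to their left, while the MLM targets can draw on context from both sides of a masked position. That difference is one commonly cited reason the former is usually paired with causal (decoder-style) architectures and the latter with bidirectional (encoder-style) ones.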

Authors

Person: Not available
Organization: arXiv
A Survey on the Theory and Mechanism of Large Language Models

Sources

A Survey on the Theory and Mechanism of Large Language Models, arXiv (arxiv.org)

Referenced by nodes (2)

deep learning (concept)
training data (concept)