claim
Zhai et al. (2025) theorize that pre-training learns the "contexture," defined as the top singular functions of the association between inputs and their contexts, and that a representation capturing this contexture is optimal for compatible downstream tasks.
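The idea can be illustrated with a toy sketch: form an association matrix between inputs and contexts, and take its top singular directions as the learned representation. The matrix names and the marginal normalization below are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

# Toy joint distribution P[x, c] over (input, context) pairs.
# Illustrative assumption: the association is the marginal-normalized joint.
rng = np.random.default_rng(0)
P = rng.random((6, 8))
P /= P.sum()  # normalize to a joint probability matrix

px = P.sum(axis=1, keepdims=True)  # input marginal
pc = P.sum(axis=0, keepdims=True)  # context marginal
A = P / np.sqrt(px) / np.sqrt(pc)  # normalized association matrix

# Top-k singular vectors of A give a k-dimensional input representation,
# playing the role of the "contexture" in this sketch.
k = 3
U, S, Vt = np.linalg.svd(A, full_matrices=False)
representation = U[:, :k] * S[:k]  # one k-dim embedding per input
print(representation.shape)        # (6, 3)
```

Downstream tasks that are "compatible" in the paper's sense would then be solvable linearly from such a representation.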
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (1)
- Pre-training concept