reference
Garg et al. (2022) found that Transformers trained from scratch on linear regression tasks can learn unseen linear functions from in-context examples, with predictive performance comparable to the least squares estimator.
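To make the least squares baseline in this comparison concrete, here is a minimal sketch of how it could be evaluated on a single prompt of (x, y) examples. It is not the authors' code. The Gaussian sampling, the noiseless labels, and the names `sample_prompt`, `least_squares_predict`, and `solve_linear_system` are assumptions made for illustration, and the solver is written in plain Python so the block has no dependencies.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b]; copying keeps the inputs unmodified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares_predict(xs, ys, x_query):
    """Fit w via the normal equations (X^T X) w = X^T y, then return w . x_query."""
    d = len(x_query)
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    w_hat = solve_linear_system(xtx, xty)
    return sum(w * q for w, q in zip(w_hat, x_query))


def sample_prompt(d, k, rng):
    """Hypothetical task sampler: one random linear function, k labeled examples, one query."""
    w = [rng.gauss(0, 1) for _ in range(d)]
    xs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    ys = [sum(wi * xi for wi, xi in zip(w, x)) for x in xs]
    x_query = [rng.gauss(0, 1) for _ in range(d)]
    y_query = sum(wi * qi for wi, qi in zip(w, x_query))
    return xs, ys, x_query, y_query


rng = random.Random(0)
xs, ys, x_query, y_query = sample_prompt(d=3, k=10, rng=rng)
squared_error = (least_squares_predict(xs, ys, x_query) - y_query) ** 2
print(f"least squares squared error on the query: {squared_error:.3e}")
```

With noiseless labels and more examples than dimensions, the baseline recovers the underlying linear function up to floating-point error. The reported finding is that a trained Transformer, given the same examples as a prompt, makes predictions of comparable quality.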
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- Transformers concept