reference
Gatmiry et al. (2024) studied whether looped Transformers can implement multi-step gradient descent in an in-context learning setting.
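The core idea, that applying the same block repeatedly ("looping") can emulate multiple gradient-descent steps on an in-context regression problem, can be sketched numerically. This is a minimal illustration, not the paper's Transformer construction: `gd_step`, the learning rate, and the least-squares setup are all assumptions made for the example.

```python
import numpy as np

# Hypothetical sketch: looping one gradient step T times on an
# in-context least-squares problem, analogous to how a looped
# Transformer block can emulate multi-step gradient descent.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))   # in-context input examples
w_true = rng.normal(size=4)
y = X @ w_true                 # in-context labels

def gd_step(w, X, y, lr=0.05):
    """One least-squares gradient step: the body of the 'loop'."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

w = np.zeros(4)
for _ in range(500):           # reusing the same block = multi-step GD
    w = gd_step(w, X, y)
```

After enough loop iterations `w` approaches `w_true`, mirroring the multi-step-descent behavior the paper studies for looped architectures.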
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- In-Context Learning concept