claim
Dai et al. (2022) assert that Transformers implicitly perform fine-tuning during in-context learning inference, building on the dual form that connects linear layers optimized by gradient descent to linear attention (Aizerman et al., 1964; Irie et al., 2022).
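The dual form behind this claim can be checked numerically: a linear layer updated by one outer-product gradient step per stored example produces the same output as the initial layer plus unnormalized linear attention over those examples. The sketch below is illustrative only; all names and shapes are assumptions, not from the cited papers' code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 4, 3, 5

W0 = rng.normal(size=(d_out, d_in))   # initial linear-layer weights
X = rng.normal(size=(n, d_in))        # stored "training" inputs
E = rng.normal(size=(n, d_out))       # per-example error signals (scaled negative gradients)
x_test = rng.normal(size=d_in)        # query input

# Gradient-descent view: accumulate one rank-1 update per stored example
dW = sum(np.outer(E[i], X[i]) for i in range(n))
y_gd = (W0 + dW) @ x_test

# Dual (attention) view: unnormalized linear attention with
# keys X, values E, query x_test:  W0 x + sum_i e_i (x_i . x_test)
y_attn = W0 @ x_test + E.T @ (X @ x_test)

assert np.allclose(y_gd, y_attn)
```

The identity holds exactly (up to floating point) because the rank-1 weight updates, applied to a query, reduce to inner products between the query and the stored inputs, which is the linear-attention computation.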
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, via serper)
Referenced by nodes (2)
- Transformers concept
- attention mechanism concept