reference
The paper 'Transformers learn in-context by gradient descent' was published in the proceedings of the International Conference on Machine Learning, pp. 35151–35174.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (4)
- Transformers (concept)
- In-Context Learning (concept)
- International Conference on Machine Learning (entity)
- gradient descent (concept)