reference
The paper 'Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer' was published in Advances in Neural Information Processing Systems.
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (2)
- Advances in Neural Information Processing Systems entity
- Transformer concept