claim
Transformers can learn different attention patterns that all generate the same bounded outputs, so interpretability via local ("myopic") analysis of attention can be provably misleading, according to Wen et al. (2023).
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, via serper)
Referenced by nodes (2)
- Transformers concept
- interpretability concept
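The claim above can be illustrated with a toy sketch (my own illustration, not the construction from Wen et al. 2023): when the value vectors at attended positions coincide, every attention distribution yields the same attention output, so inspecting the attention pattern alone tells you nothing about the computation.

```python
import numpy as np

# Hypothetical setup: 5 positions whose value vectors are all identical.
rng = np.random.default_rng(0)
v = rng.standard_normal(8)      # one shared value vector
V = np.tile(v, (5, 1))          # value matrix with identical rows

def attend(scores, V):
    """Softmax over positions, then weighted sum of value vectors."""
    w = np.exp(scores - scores.max())
    w = w / w.sum()
    return w @ V

# Two very different attention patterns...
out_uniform = attend(np.zeros(5), V)                     # uniform attention
out_peaked = attend(np.array([10.0, 0, 0, 0, 0]), V)     # near one-hot attention

# ...produce the same output, because any convex combination of
# identical rows equals that row.
assert np.allclose(out_uniform, out_peaked)
```

This is only a degenerate special case; the paper's point is stronger, showing that distinct, non-degenerate attention patterns can implement the same correct solution, which undermines myopic attention-based interpretability.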