Claim
Oymak et al. (2023) theoretically establish that softmax prompt-attention is more expressive than softmax self-attention or linear prompt-attention under a contextual mixture model (a rough sketch of these attention variants follows below the card).
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- self-attention mechanism concept
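
To make the terms in the claim concrete, the sketch below contrasts the three attention variants. The formulation is an assumption made for illustration only, not the exact construction of Oymak et al. (2023): "prompt-attention" is taken to mean a single trainable prompt vector `p` that queries the input tokens `X` through a weight matrix `W`, while "self-attention" lets the tokens query each other; the linear variant simply drops the softmax.

```python
# Minimal sketch, assuming a single-layer setup with one prompt vector.
# Names (X, W, p) and the pooling choice are illustrative assumptions.
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def softmax_prompt_attention(X, W, p):
    """X: (n, d) input tokens; W: (d, d) weights; p: (d,) trainable prompt."""
    scores = X @ W @ p        # relevance of each token to the prompt
    return softmax(scores) @ X  # convex combination of the input tokens

def linear_prompt_attention(X, W, p):
    """Same scoring as above, but without the softmax nonlinearity."""
    return (X @ W @ p) @ X

def self_attention_pooled(X, W):
    """Standard single-head self-attention, mean-pooled to one vector."""
    A = np.apply_along_axis(softmax, 1, X @ W @ X.T)  # (n, n) row-softmax
    return (A @ X).mean(axis=0)

# Toy usage with random data.
rng = np.random.default_rng(0)
n, d = 5, 4
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, d))
p = rng.normal(size=d)
print(softmax_prompt_attention(X, W, p))
print(linear_prompt_attention(X, W, p))
print(self_attention_pooled(X, W))
```

The expressiveness comparison in the claim is about what functions of the input these maps can realize; the sketch only shows where the softmax and the prompt enter in each variant.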