claim
Zhou et al. (2022) analyze the emergent visual grouping phenomenon in Vision Transformers through the lens of the information bottleneck, showing that the iterative solution to the information bottleneck objective can be expressed as a self-attention operation.
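For context, a minimal sketch of the correspondence being claimed, assuming the standard self-consistent iterative solution of the IB objective (Tishby et al.); the notation below is illustrative and not taken from Zhou et al. (2022):

```latex
% Iterative IB: soft assignment of input x to bottleneck variable t
% (one self-consistent update step; \beta trades compression vs. relevance)
p(t \mid x) \;\propto\; p(t)\,
  \exp\!\left(-\beta\, D_{\mathrm{KL}}\!\left[\,p(y \mid x)\,\middle\|\,p(y \mid t)\,\right]\right)

% Self-attention weight of query i on key j:
A_{ij} \;=\; \operatorname{softmax}_j\!\left(q_i^{\top} k_j / \sqrt{d}\right)

% Both are softmax-normalized similarity scores; the IB representation
% update \mu_t = \sum_x p(x \mid t)\, x mirrors the value aggregation
% \sum_j A_{ij}\, v_j in attention.
```

Under this reading, tokens that share a bottleneck variable are softly grouped together, which is how the IB view connects to visual grouping.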
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- self-attention mechanism concept