claim
Meyer et al. (2025) demonstrate that, for a single-layer Transformer, prompt tuning can only steer outputs within a fixed hyperplane of the output space, an expressivity limitation relative to weight tuning.
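The intuition behind such results is that softmax attention outputs are convex combinations of value vectors, and prompt tuning can only add new value vectors through the frozen value projection, so outputs stay inside the subspace that projection can reach; weight tuning can leave it. A toy numpy sketch of this intuition (not the construction from Meyer et al.; the rank-2 value matrix and dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
# Rank-2 value projection: every value vector lies in a 2-D subspace of R^4.
W_v = rng.standard_normal((d, 2)) @ rng.standard_normal((2, d))
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))

X = rng.standard_normal((3, d))  # content tokens (rows)

def attn_out(tokens, query):
    """Single-head softmax attention output for one query vector."""
    K = tokens @ W_k.T
    V = tokens @ W_v.T          # each row lies in col(W_v)
    scores = (W_q @ query) @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V                # convex combination of rows of V

# Orthonormal basis for the orthogonal complement of col(W_v).
U, _, _ = np.linalg.svd(W_v)
comp = U[:, 2:]                 # directions unreachable through W_v

q = X[0]
# Prompt tuning: prepend arbitrary soft tokens -> output never leaves col(W_v).
prompt_leak = max(
    np.linalg.norm(comp.T @ attn_out(np.vstack([rng.standard_normal((2, d)), X]), q))
    for _ in range(100)
)
print("leakage under prompt tuning:", prompt_leak)   # numerically ~0

# Weight tuning: perturbing W_v itself escapes the fixed subspace.
W_v += 0.1 * rng.standard_normal((d, d))
weight_leak = np.linalg.norm(comp.T @ attn_out(X, q))
print("leakage after weight tuning:", weight_leak)
```

However many soft prompt tokens are tried, the output's component outside the frozen value subspace stays at floating-point noise, while a small change to the value weights immediately produces a nonzero component.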
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- Transformer concept