claim
Meyer et al. (2025) formally prove that the amount of information a Transformer can memorize via prompt tuning grows at most linearly with the prompt length, establishing a capacity bottleneck.
