claim
Linear models face two inherent difficulties. First, a constant-size state cannot grow with sequence length, so information is lost on long inputs. Second, the compressed representation can fail when future patterns deviate from the prior encoded in the compression rule.
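The first difficulty can be illustrated with a minimal sketch. It assumes a linear-attention-style associative memory in which key-value pairs are summed into one fixed-size matrix. This setup, the function name `recall_error`, and the parameters `num_pairs`, `dim`, and `seed` are hypothetical choices made for illustration and are not taken from the cited survey. The sketch covers only the state-size limit, not the prior-mismatch difficulty.

```python
import numpy as np


def recall_error(num_pairs, dim=64, seed=0):
    """Mean relative error when reading values back from a fixed-size state.

    Illustrative assumption: the state is the sum of outer products
    v_i k_i^T, so its shape is (dim, dim) however many pairs are written.
    """
    rng = np.random.default_rng(seed)

    # Random unit-norm keys and random values, one pair per "token".
    keys = rng.standard_normal((num_pairs, dim))
    keys /= np.linalg.norm(keys, axis=1, keepdims=True)
    values = rng.standard_normal((num_pairs, dim))

    # Write every pair into the same (dim, dim) state.
    state = values.T @ keys

    # Read each value back by querying the state with its own key.
    # Row j equals state @ k_j = v_j + crosstalk from all other pairs.
    recalled = keys @ state.T

    errors = np.linalg.norm(recalled - values, axis=1)
    scales = np.linalg.norm(values, axis=1)
    return float(np.mean(errors / scales))


if __name__ == "__main__":
    for n in (1, 8, 64, 512):
        print(f"pairs={n:4d}  relative recall error={recall_error(n):.3f}")
```

Under these assumptions, a single stored pair is read back exactly, while the error grows as more pairs share the same state, because random keys in a `dim`-dimensional space cannot all stay orthogonal.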
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- linear models concept