Claim
Li and Flanigan (2024) found that a model's superior performance in zero- or few-shot settings may stem from exposure to task-related examples during pre-training rather than from genuine generalization.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (3)
- pre-training concept
- generalization concept
- few-shot learning concept