claim
Wei et al. (2023) observed that smaller language models rely primarily on semantic priors acquired during pretraining when performing in-context learning (ICL), largely ignoring flipped labels presented in the context, whereas larger models can override these priors and follow the flipped labels.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (3)
- Large Language Models concept
- In-Context Learning concept
- Pre-training concept
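To make the flipped-label setup concrete, here is a minimal sketch of the kind of probe the claim describes: in-context exemplars are shown with deliberately inverted sentiment labels, and the model's answer to a clear-cut query reveals whether it follows the context or its semantic prior. This is an illustrative reconstruction under assumed details, not Wei et al.'s actual code; the exemplars, label set, and prompt format are assumptions made for the example.

```python
# Sketch of a flipped-label in-context learning probe, in the spirit of
# Wei et al. (2023). Exemplar labels are deliberately inverted: a model
# that follows the context should answer "negative" for a clearly
# positive query, while a model leaning on semantic priors from
# pretraining should still answer "positive".

FLIP = {"positive": "negative", "negative": "positive"}

# (text, true_label) pairs used as in-context exemplars (illustrative).
EXEMPLARS = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute of it.", "negative"),
    ("An absolute masterpiece.", "positive"),
    ("Dull and poorly acted.", "negative"),
]

def build_prompt(query: str, flip_labels: bool) -> str:
    """Assemble an ICL prompt, optionally inverting the exemplar labels."""
    lines = []
    for text, label in EXEMPLARS:
        shown = FLIP[label] if flip_labels else label
        lines.append(f"Review: {text}\nSentiment: {shown}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    # With flip_labels=True, a completion of "negative" here would mean
    # the model overrode its semantic prior and followed the in-context
    # labels; "positive" would mean it fell back on the prior.
    print(build_prompt("A truly delightful film.", flip_labels=True))
```

Feeding this prompt to models of different sizes and comparing their completions is the behavioral contrast the claim summarizes: smaller models tend to answer with the prior-consistent label, larger models with the flipped one.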