Relations (1)
related (score 4.52), strongly supported by 22 facts
Large Language Models are the primary architecture that supports in-context learning, an emergent capability that allows these models to perform tasks using prompt-provided examples without parameter updates [1], [2], [3]. Research indicates that this process functions through mechanisms like Bayesian Model Averaging [4], [5] and the activation of specific induction heads [6], while also serving as a key method for evaluating model performance on benchmarks [7], [8].
Facts (22)
Sources
A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, 11 facts)
reference: Wang et al. (2023) identified that, in the input-label pairs used for in-context learning (ICL), label tokens act as anchors: semantic information from the context aggregates at these tokens in the shallower layers of large language models, and final predictions reference this aggregated information.
claim: Olsson et al. (2022) identified induction heads, specific attention heads whose learned algorithm underlies a large fraction of in-context learning in Large Language Models.
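The copying behavior attributed to induction heads can be sketched in a few lines. This is an illustrative toy (a pure prefix-matching rule, an assumption on my part, not Olsson et al.'s trained models), showing the "[A][B] ... [A] -> [B]" pattern:

```python
# Toy sketch of the induction-head algorithm: when the current token has
# occurred earlier in the sequence, predict the token that followed that
# earlier occurrence. Real induction heads implement this via attention,
# not an explicit backwards scan.
def induction_head_predict(tokens):
    current = tokens[-1]
    # Scan backwards for an earlier occurrence of the current token.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # copy the token that followed it
    return None  # no earlier match, nothing to copy

print(induction_head_predict(["the", "cat", "sat", "on", "the"]))  # prints: cat
```

The earlier "the" at position 0 was followed by "cat", so the head predicts "cat" as the continuation.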
reference: The paper 'Large language models are latent variable models: explaining and finding good demonstrations for in-context learning' posits that large language models function as latent variable models.
perspective: The 'Representation Camp' perspective posits that Large Language Models (LLMs) store memories about various topics during pretraining, and that in-context learning retrieves contextually relevant topics during inference based on the demonstrations.
claim: Current literature on Large Language Models identifies several unpredictable behaviors at scale, including In-Context Learning (Brown et al., 2020), complex hallucinations (Xu et al., 2024b), and 'aha moments' observed during training (Guo et al., 2025).
claim: Wei et al. (2023) observed that smaller language models primarily rely on semantic priors from pretraining during in-context learning (ICL) and often disregard label flips in the context, whereas larger models can override these priors when faced with flipped labels.
claim: Large Language Models exhibit emergent phenomena not found in smaller models, including hallucination, in-context learning (ICL), scaling laws, and sudden 'aha moments' during training.
claim: Wei et al. (2023) found that sufficiently large language models can perform linear classification even when the in-context learning (ICL) setting uses semantically unrelated labels.
claim: The accuracy of in-context learning (ICL) in large language models depends on the independent specification of the input and label spaces, the distribution of the input text, and the format of the input-output pairs.
reference: The paper 'Larger language models do in-context learning differently' (arXiv:2303.03846) compares in-context learning behaviors across model sizes.
claim: Large language models do not learn new tasks during in-context learning (ICL); instead, they use the demonstrations to locate tasks or topics, while the ability to perform those tasks is acquired during pretraining.
Track: Poster Session 3 - AISTATS 2026 (virtual.aistats.org, 4 facts)
claim: In-Context Learning (ICL) allows Large Language Models (LLMs) to complete tasks using examples provided in the prompt, without tuning model parameters.
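A minimal sketch of the setup this fact describes (a generic "Input/Output" format I am assuming for illustration, not the paper's exact protocol): the demonstrations live entirely in the prompt string, and no weights change.

```python
# Build a few-shot ICL prompt: labeled demonstrations plus an unanswered query.
# The model "learns" the task only from this text; no parameters are updated.
def build_icl_prompt(demos, query):
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

demos = [("great movie!", "positive"), ("terrible plot.", "negative")]
prompt = build_icl_prompt(demos, "loved the soundtrack.")
print(prompt)
```

Feeding this string to an LLM and reading its completion after the final "Output:" is the zero-parameter-update mechanism the fact refers to.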
formula: The In-Context Learning (ICL) average error of pretrained Large Language Models (LLMs) decomposes as O(T^-1) plus the pretraining error, where T is the number of examples in the prompt.
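Written out, the stated decomposition reads as follows (assuming, from the surrounding facts about examples in the prompt, that $T$ counts the demonstrations; the symbol for the pretraining error is mine):

```latex
\mathrm{err}_{\mathrm{ICL}} \;=\; O\!\left(T^{-1}\right) \;+\; \varepsilon_{\mathrm{pretrain}},
\qquad T = \text{number of examples in the prompt}
```

The first term vanishes as more demonstrations are added, while the second is a fixed floor inherited from imperfect pretraining.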
claim: Perfectly pretrained Large Language Models (LLMs) perform Bayesian Model Averaging (BMA) for In-Context Learning (ICL) under a dynamic model of the examples in the prompt.
claim: Attention structures in Large Language Models (LLMs) facilitate the implementation of Bayesian Model Averaging (BMA); with sufficiently many examples in the prompt, attention performs BMA under the Gaussian linear In-Context Learning (ICL) model.
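For the Gaussian linear ICL model mentioned here, BMA has a closed form that a short sketch can illustrate. This is an illustrative reconstruction under assumptions of mine (unit-Gaussian prior over the weights, known noise variance), not the paper's construction:

```python
import numpy as np

# Bayesian model averaging for a Gaussian linear ICL model: prompt examples
# (x_t, y_t) with y = x @ w + noise and prior w ~ N(0, I). The posterior-
# predictive mean averages every linear model weighted by its posterior,
# which collapses to the closed (ridge-style) form below.
def bma_predict(X, y, x_query, noise_var=0.1):
    d = X.shape[1]
    # Posterior over w is N(mu, Sigma) with Sigma = (X^T X / s^2 + I)^{-1}.
    Sigma = np.linalg.inv(X.T @ X / noise_var + np.eye(d))
    mu = Sigma @ X.T @ y / noise_var
    return x_query @ mu  # posterior-predictive mean = BMA prediction

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(32, 2))
y = X @ w_true + 0.1 * rng.normal(size=32)
print(bma_predict(X, y, np.array([1.0, 1.0])))  # close to w_true @ [1, 1] = -1.0
```

With 32 in-context examples the posterior concentrates near the true weights, matching the claim that attention with sufficiently many prompt examples recovers the BMA prediction.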
The Synergy of Symbolic and Connectionist AI in LLM-Empowered ... (arxiv.org, 3 facts)
reference: Sewon Min et al. investigated the mechanisms behind in-context learning and the role of demonstrations in large language models.
reference: Sang Michael Xie et al. proposed an explanation of in-context learning in large language models as a form of implicit Bayesian inference.
claim: Once trained, large language models can be fine-tuned with additional data at lower cost and effort than updating Knowledge Graphs, and they support in-context learning without requiring fine-tuning.
The Hallucinations Leaderboard, an Open Effort to Measure ... (huggingface.co, 2 facts)
claim: The Hallucinations Leaderboard is a platform for evaluating large language models on benchmarks specifically designed to assess hallucination-related issues via in-context learning.
procedure: The Hallucinations Leaderboard uses the EleutherAI Language Model Evaluation Harness to perform zero-shot and few-shot evaluations of large language models via in-context learning.
Combining Knowledge Graphs and Large Language Models (arxiv.org, 1 fact)
reference: Lee et al. demonstrated that LLMs can learn patterns from historical data in Temporal Knowledge Graphs using in-context learning (ICL), without requiring special architectures or modules.
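The kind of prompt this setting implies can be sketched as follows. The quadruple format and the example data are hypothetical illustrations of mine, not Lee et al.'s actual prompt template:

```python
# Turn temporal KG quadruples (subject, relation, object, timestamp) into an
# ICL prompt: the history is listed in time order, and the query line is left
# incomplete so the LLM continues the historical pattern.
def tkg_icl_prompt(history, query_subject, query_relation, query_time):
    lines = [f"{t}: [{s}, {r}, {o}]"
             for (s, r, o, t) in sorted(history, key=lambda q: q[3])]
    lines.append(f"{query_time}: [{query_subject}, {query_relation},")
    return "\n".join(lines)

history = [
    ("Germany", "negotiates_with", "France", 2019),
    ("Germany", "negotiates_with", "Italy", 2020),
]
prompt = tkg_icl_prompt(history, "Germany", "negotiates_with", 2021)
print(prompt)
```

The model's completion of the final bracket serves as the forecast, using only the in-prompt history rather than a dedicated temporal module.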
Combining large language models with enterprise knowledge graphs (frontiersin.org, 1 fact)
claim: In-context learning offers greater flexibility for adapting to the rapidly evolving field of Large Language Models (LLMs), though prompt engineering is time-consuming and relies on methods that do not transfer universally across models, as reported by Zhao et al. (2024).