claim
Attention mechanisms in Large Language Models (LLMs) can implement Bayesian Model Averaging (BMA): given sufficiently many examples in the prompt, attention performs BMA under a Gaussian linear In-Context Learning (ICL) model.
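To make the BMA side of the claim concrete, the following is a minimal sketch of Bayesian Model Averaging in a Gaussian linear regression setting: candidate models are scored by their marginal likelihood, and predictions are averaged under the posterior model weights. The candidate models, priors, and variances here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, tau2 = 40, 0.25, 1.0          # noise / prior variances (assumed)
X = rng.normal(size=(n, 2))
y = X @ np.array([1.5, 0.0]) + rng.normal(scale=np.sqrt(sigma2), size=n)

def log_evidence(Phi, y):
    # log p(y | model): marginally, y ~ N(0, sigma2 I + tau2 Phi Phi^T)
    C = sigma2 * np.eye(len(y)) + tau2 * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def posterior_mean_pred(Phi, y, phi_new):
    # posterior predictive mean under the prior w ~ N(0, tau2 I)
    A = Phi.T @ Phi / sigma2 + np.eye(Phi.shape[1]) / tau2
    return phi_new @ np.linalg.solve(A, Phi.T @ y / sigma2)

# Candidate models = which features enter the regression (an assumption).
models = {"x0 only": [0], "x1 only": [1], "x0 + x1": [0, 1]}
logZ = np.array([log_evidence(X[:, c], y) for c in models.values()])
w = np.exp(logZ - logZ.max())
w /= w.sum()                              # posterior model weights

# BMA prediction = posterior-weighted average of per-model predictions.
x_new = np.array([1.0, -1.0])
bma_pred = sum(wi * posterior_mean_pred(X[:, c], y, x_new[c])
               for wi, c in zip(w, models.values()))
```

The paper's claim is that, in the Gaussian linear ICL setting, attention over in-context examples computes a prediction of this posterior-weighted form as the number of prompt examples grows.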
Authors
Sources
- Track: Poster Session 3, AISTATS 2026 (virtual.aistats.org via serper)
Referenced by nodes (2)
- Large Language Models concept
- In-Context Learning concept