Reference
Siyan Zhao, Daniel Israel, Guy Van den Broeck, and Aditya Grover define prefilling in transformer-based large language models as the computation of the key-value (KV) cache for input tokens in the prompt prior to autoregressive generation.
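The idea can be illustrated with a minimal sketch. This is a toy single-head attention in pure Python (the projection functions, dimensions, and names are illustrative assumptions, not the authors' implementation): prefilling computes and caches K/V for every prompt token in one pass, after which each decode step only projects the single new token and attends over the cache.

```python
# Toy sketch of prefill vs. decode with a KV cache.
# Assumption: a single attention head with made-up linear "projections";
# this is not any real model's code, only an illustration of the idea.
import math

D = 4  # toy head dimension

# Toy projection maps standing in for learned weight matrices.
def proj_k(x): return [0.5 * v for v in x]
def proj_v(x): return [v + 1.0 for v in x]
def proj_q(x): return [2.0 * v for v in x]

def dot(a, b): return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(q, keys, values):
    # Scaled dot-product attention of one query over cached keys/values.
    weights = softmax([dot(q, k) / math.sqrt(D) for k in keys])
    out = [0.0] * D
    for w, v in zip(weights, values):
        out = [o + w * vi for o, vi in zip(out, v)]
    return out

def prefill(prompt):
    # Prefill: build the KV cache for all prompt tokens before generation.
    return [proj_k(x) for x in prompt], [proj_v(x) for x in prompt]

def decode_step(x, k_cache, v_cache):
    # Decode: project only the new token, extend the cache, then attend.
    k_cache.append(proj_k(x))
    v_cache.append(proj_v(x))
    return attend(proj_q(x), k_cache, v_cache)

prompt = [[0.1 * (i + j) for j in range(D)] for i in range(3)]
new_tok = [0.3, -0.2, 0.5, 0.1]

k_cache, v_cache = prefill(prompt)
cached_out = decode_step(new_tok, k_cache, v_cache)

# Reference: recompute K/V for the full sequence from scratch.
full = prompt + [new_tok]
ref_out = attend(proj_q(new_tok),
                 [proj_k(x) for x in full],
                 [proj_v(x) for x in full])

# The cached incremental result matches the full recomputation.
assert all(abs(a - b) < 1e-9 for a, b in zip(cached_out, ref_out))
```

The point of the sketch: after prefilling, the per-token work during autoregressive generation no longer grows with re-projecting the whole prompt, only with attending over the cached keys and values.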
Authors
- Siyan Zhao
- Daniel Israel
- Guy Van den Broeck
- Aditya Grover
Sources
- Track: Poster Session 3 - AISTATS 2026 (virtual.aistats.org)
Referenced by nodes (2)
- Large Language Models concept
- Transformer concept