claim
Oymak et al. (2023) characterize how gradient descent in prompt-tuning steers the learned prompt to attend to sparse, task-relevant tokens.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (2)
- gradient descent concept
- prompts concept