claim
Researchers are investigating whether input prompts act as latent variables that locate a specific task within a pre-trained distribution, or if the model architecture implicitly executes meta-optimization algorithms like gradient descent to adapt to provided examples.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (1)
- gradient descent concept