claim
Oymak et al. (2023) characterize how gradient descent steers learned prompts to attend to sparse, task-relevant tokens.
