reference
Gatmiry et al. (2024) studied whether looped Transformers can implement multi-step gradient descent in an in-context learning setting.
