reference
Garg et al. (2022) found that Transformers trained on well-defined linear regression tasks can learn them in context, achieving predictive performance comparable to the least squares estimator.
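
As a rough illustration of the least squares baseline mentioned in this claim, the sketch below fits a linear function to a handful of example pairs and predicts the output for a new query point. The dimensions, the noiseless Gaussian data, and the function name `least_squares_predict` are assumptions chosen for illustration; they are not taken from the reference.

```python
import numpy as np


def least_squares_predict(xs, ys, x_query):
    """Fit weights by least squares on the example pairs, then predict at x_query.

    Hypothetical helper for illustration only.
    """
    # np.linalg.lstsq returns the minimum-norm solution of xs @ w = ys
    w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
    return x_query @ w_hat


# Assumed toy setup: d-dimensional inputs, k example pairs, no label noise
rng = np.random.default_rng(0)
d, k = 5, 20
w_true = rng.standard_normal(d)      # hidden linear function
xs = rng.standard_normal((k, d))     # example inputs
ys = xs @ w_true                     # example outputs
x_query = rng.standard_normal(d)     # new input to predict

pred = least_squares_predict(xs, ys, x_query)
target = x_query @ w_true
squared_error = (pred - target) ** 2
```

In this noiseless setting with more examples than dimensions, least squares recovers the hidden weights up to numerical precision, so the squared error is close to zero. A comparison of the kind the claim describes would measure a trained Transformer's prediction error on the same prompts against this baseline.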
