reference
Jiang and Li (2024) derived Jackson-type approximation bounds for Transformers by introducing new complexity measures to construct approximation spaces, showing that Transformers approximate efficiently when the temporal dependencies of the target function exhibit a low-rank structure.
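
For orientation, a Jackson-type estimate generically bounds the best achievable approximation error by a complexity measure of the target divided by a power of the model budget. The sketch below shows only that generic shape. The symbols H (target), \mathcal{H}_m (hypothesis class with budget m), C(H) (complexity measure), and \alpha (rate) are illustrative placeholders assumed here, not the notation or the specific bound of Jiang and Li (2024).

```latex
% Generic shape of a Jackson-type bound (schematic, placeholder symbols):
% the error of the best approximant with budget m decays at rate m^{-\alpha},
% scaled by a complexity measure C(H) of the target H.
\[
  \inf_{\hat{H} \in \mathcal{H}_m} \bigl\| H - \hat{H} \bigr\|
  \;\le\; \frac{C(H)}{m^{\alpha}},
  \qquad C(H) < \infty .
\]
```

Under this reading, the targets with finite C(H) make up the approximation space, and the low-rank structure of temporal dependencies mentioned above would be one condition under which the complexity measure stays small.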
