reference
Giannou et al. (2023b) proposed treating Transformers as programmable computational units, where a fixed layer is repeatedly applied to execute instructions encoded in the input sequence.
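
To make the idea of "one fixed layer, applied in a loop, executing instructions stored in the input" concrete, here is a toy sketch in plain Python. It is not the construction of Giannou et al.: no attention or Transformer weights are involved, and the instruction set (SUBLEQ, a well-known one-instruction computer), the memory layout, and the names `step`, `run`, and `program` are all assumptions chosen for illustration. What it shares with the description above is the structure: the step function never changes, and all program-specific behaviour comes from the sequence it is applied to.

```python
def step(state):
    """One application of the fixed 'layer' (hypothetical stand-in).

    state = (memory, pc). Memory holds both instructions and data, so the
    program lives in the input rather than in this function.
    """
    mem, pc = state
    if pc < 0 or pc + 2 >= len(mem):
        return state  # halted: the state is a fixed point of the step
    a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
    new_mem = list(mem)
    new_mem[b] = mem[b] - mem[a]           # SUBLEQ: mem[b] -= mem[a]
    new_pc = c if new_mem[b] <= 0 else pc + 3  # branch if result <= 0
    return (tuple(new_mem), new_pc)


def run(mem, max_loops=100):
    """Apply the same step repeatedly until the state stops changing."""
    state = (tuple(mem), 0)
    for _ in range(max_loops):
        nxt = step(state)
        if nxt == state:
            break
        state = nxt
    return state


# Illustrative program: add the value at address 9 into address 10,
# using address 11 as a zero-initialised scratch cell.
program = [
    9, 11, 3,    # scratch -= x        (scratch becomes -x)
    11, 10, 6,   # y -= scratch        (y becomes y + x)
    11, 11, -1,  # scratch -= scratch  (reset to 0), then jump to -1 = halt
    7, 5, 0,     # data: x = 7, y = 5, scratch = 0
]
final_mem, final_pc = run(program)
print(final_mem[10])  # 12
```

Swapping in a different `program` changes what is computed without touching `step`, which is the sense in which a looped fixed layer can be regarded as programmable.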
