measurement
Sanford et al. (2023) introduced the 'sparse averaging' task and showed that Transformers can solve it with complexity that grows only logarithmically in the input size, whereas RNNs and feed-forward networks require complexity that grows polynomially in the input size.
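
As a rough illustration of what the task computes, the sketch below assumes the commonly described form of sparse averaging: each of N positions holds a vector and a small set of q indices, and the target output at that position is the mean of the vectors at those indices. The function name `sparse_average` and the toy data are hypothetical and are not taken from the paper.

```python
def sparse_average(vectors, subsets):
    """Reference computation for a sparse-averaging-style task (illustrative only).

    vectors: list of N equal-length lists of floats, one per position.
    subsets: list of N index lists; subsets[i] names the q positions
             whose vectors are averaged to form output i.
    """
    dim = len(vectors[0])
    outputs = []
    for subset in subsets:
        q = len(subset)
        # Coordinate-wise mean over the q selected positions.
        outputs.append(
            [sum(vectors[j][k] for j in subset) / q for k in range(dim)]
        )
    return outputs


# Toy instance: N = 4 positions, q = 2 indices per position.
vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
subsets = [[0, 1], [1, 2], [2, 3], [0, 3]]
result = sparse_average(vectors, subsets)
```

Each output depends on only q of the N inputs, but which q inputs matter is itself part of the input. That is the property the claimed separation between architectures rests on.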
