claim
Wu et al. (2025a) proposed the Parallel Loop Transformer (PLT), an architecture designed to improve the computational efficiency of recurrence (the repeated application of shared weights) in language models.
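
For context, the sketch below shows the kind of recurrence the claim refers to: a looped (weight-tied) transformer that applies one shared block several times, executed sequentially. It is illustrative only and is not Wu et al.'s PLT; the paper's parallel scheduling of the loops is not reproduced here, and all names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    """Weight-tied ("looped") transformer: one shared block applied n_loops times.

    Illustrative sketch of recurrence in a language model backbone, run
    sequentially. NOT Wu et al.'s (2025a) PLT, whose contribution is how
    the loops are parallelized. All names and sizes are hypothetical.
    """
    def __init__(self, d_model=256, n_heads=4, n_loops=4):
        super().__init__()
        # A single block whose parameters are reused on every pass.
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.n_loops = n_loops

    def forward(self, x):
        # Sequential loops: effective depth grows with n_loops while the
        # parameter count stays constant. The cost is latency: pass n
        # cannot start until pass n-1 finishes, which is the kind of
        # bottleneck a parallel-loop design targets.
        for _ in range(self.n_loops):
            x = self.block(x)
        return x

x = torch.randn(2, 16, 256)   # (batch, sequence, d_model)
y = LoopedTransformer()(x)
print(y.shape)                # torch.Size([2, 16, 256])
```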