claim
Incorporating the delta rule into Transformers has been explored as a way to increase the models' expressive power.
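For orientation, here is a minimal sketch of what a delta-rule update can look like. The claim does not say which formulation is meant, so this assumes the fast-weight form often used in linear-attention work, S <- S + beta * (v - S k) k^T. The names `delta_rule_step`, `read`, and `beta` are illustrative and do not come from the source.

```python
def read(S, q):
    """Read from the memory: return S @ q for a d_v x d_k matrix S (list of lists)."""
    return [sum(row[j] * q[j] for j in range(len(q))) for row in S]


def delta_rule_step(S, k, v, beta):
    """One delta-rule write (assumed form): S + beta * (v - S k) k^T."""
    # What the memory currently returns for key k.
    pred = read(S, k)
    # Error between the target value and the current prediction.
    err = [v[i] - pred[i] for i in range(len(v))]
    # Rank-one correction that moves the stored value toward v.
    return [
        [S[i][j] + beta * err[i] * k[j] for j in range(len(k))]
        for i in range(len(S))
    ]


# Write a value under a unit-norm key, then write a new value under the same key.
S0 = [[0.0, 0.0], [0.0, 0.0]]
key = [1.0, 0.0]
S1 = delta_rule_step(S0, key, [3.0, 4.0], beta=1.0)
S2 = delta_rule_step(S1, key, [5.0, 6.0], beta=1.0)
print(read(S2, key))
```

With a unit-norm key and `beta=1.0`, the second write replaces the stored value rather than adding to it. A purely additive update (S + v k^T) would instead accumulate both values under the same key, which is one intuition given for why an error-correcting update might help.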
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- Transformers concept