reference
The paper 'Learning and transferring sparse contextual bigrams with linear transformers' was published in Advances in Neural Information Processing Systems and is cited in sections 4.2.2 and 5.3.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
