reference
The paper 'Shampoo: Preconditioned Stochastic Tensor Optimization' (published at the International Conference on Machine Learning) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' in its discussion of optimization.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)