reference
The paper 'Adam can converge without any modification on update rules' was published in Advances in Neural Information Processing Systems 35 (pp. 28386–28399) and is cited in Section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
