claim
While researchers often prefer Adam over SGD in adversarial neural networks and reinforcement learning due to faster practical convergence, there is no definitive theoretical proof establishing that Adam is superior to SGD.
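To make the comparison concrete, the following is a minimal sketch of one update step for each optimizer on the toy objective f(w) = w^2. The function names, the learning rate of 0.1, and the toy objective are illustrative assumptions, not taken from the cited survey; the Adam hyperparameters are the commonly used defaults. The sketch only shows how the two update rules differ and says nothing about which optimizer is superior in general.

```python
import math

def sgd_step(w, grad, lr=0.1):
    """Plain SGD: the step size scales directly with the gradient."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step with bias-corrected first and second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

def grad_f(w):
    """Gradient of the hypothetical toy objective f(w) = w^2."""
    return 2.0 * w

w0 = 5.0
w_sgd = sgd_step(w0, grad_f(w0))
w_adam, m1, v1 = adam_step(w0, grad_f(w0), m=0.0, v=0.0, t=1)
print(w_sgd, w_adam)
```

On the first step, SGD moves by the learning rate times the gradient, while Adam's normalized update moves by roughly the learning rate itself, whatever the gradient's scale. This scale invariance is one commonly cited reason for Adam's practical appeal, but it is an empirical convenience rather than a proof of superiority.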
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (1)
- reinforcement learning concept