Claim
Numerical experiments by Yan Yang, Bin Gao, and Ya-xiang Yuan demonstrate that the hyper-gradient integrates exploitation and exploration in reinforcement learning.
Authors
Sources
- Track: Poster Session 3 - AISTATS 2026 (virtual.aistats.org), via serper
Referenced by nodes (1)
- reinforcement learning concept