claim
Zhao et al. (2025b) characterize reinforcement learning as an “echo chamber”: training converges to a single dominant output format already present in the pre-training data, which suppresses output diversity while still enabling positive transfer from simple to complex tasks.
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- reinforcement learning concept