reference
The research paper 'ProRL: prolonged reinforcement learning expands reasoning boundaries in large language models' was published in the International Conference on Machine Learning, pp. 4051–4060, and is cited in section 7.2.2 of the survey 'A Survey on the Theory and Mechanism of Large Language Models'.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (3)
- Large Language Models (concept)
- reinforcement learning (concept)
- International Conference on Machine Learning (entity)