reference
The paper 'SFT memorizes, RL generalizes: a comparative study of foundation model post-training' was published in the Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), PMLR Vol. 267, pp. 10818–10838.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, via serper)
Referenced by nodes (3)
- reinforcement learning (concept)
- foundation models (concept)
- International Conference on Machine Learning (event)