reference
The paper 'Do NOT think that much for 2+3=? On the overthinking of long reasoning models' was published in the Proceedings of the 42nd International Conference on Machine Learning, volume 267, pages 9487–9499, edited by A. Singh, M. Fazel, D. Hsu, S. Lacoste-Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu.
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)