claim
Large language models gain performance not only from scaling data and model size during training, but also from increased test-time computation, for example by letting the model perform recurrent or iterative reasoning.
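
As a toy analogy only, not a language model and not taken from the sources of this claim, the sketch below applies one fixed update rule repeatedly and shows the answer improving as more iterations are spent at inference time. The function names `refine`, `answer`, and `error` are hypothetical, and Newton's square-root update stands in for a single "reasoning step" under the assumption that the step itself is frozen after training.

```python
def refine(estimate: float, target: float) -> float:
    """One fixed step: a Newton update that moves estimate toward sqrt(target)."""
    return 0.5 * (estimate + target / estimate)


def answer(target: float, steps: int) -> float:
    """Apply the same step `steps` times; the step itself never changes."""
    estimate = 1.0  # deliberately crude initial guess
    for _ in range(steps):
        estimate = refine(estimate, target)
    return estimate


def error(target: float, steps: int) -> float:
    """Absolute distance between the iterated answer and the true square root."""
    return abs(answer(target, steps) - target ** 0.5)


if __name__ == "__main__":
    # More iterations means more test-time computation with an unchanged rule.
    for steps in (0, 1, 2, 4):
        print(steps, error(2.0, steps))
```

The update rule plays the role of fixed model weights, and `steps` plays the role of the test-time compute budget: in this toy setting, raising the budget alone reduces the error.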

Authors

Sources
