claim
Xu and Sato (2025) argue that latent thoughts support efficient parallel computation, whereas discrete Chain-of-Thought (CoT) remains superior on tasks that rely on stochastic decoding to approximate complex solutions.
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- chain-of-thought concept