formula
Li et al. (2024b) demonstrated that constant-depth Transformers without Chain-of-Thought (CoT) are restricted to parallelizable circuit classes such as AC0 or NC1, while adding intermediate reasoning steps enables the model to solve any problem in the complexity class P.
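A hedged formal sketch of the claim (the notation below is illustrative, not taken from the source; class names follow the sentence above):

```latex
% Without CoT: constant-depth Transformers stay inside small parallel circuit classes.
\mathrm{TF}_{\text{const-depth}}^{\text{no CoT}} \subseteq \mathrm{NC}^1
\quad (\text{with } \mathrm{AC}^0 \subseteq \mathrm{NC}^1),
% With polynomially many CoT steps: every polynomial-time decidable problem becomes solvable.
\qquad
\mathrm{P} \subseteq \mathrm{TF}_{\text{const-depth}}^{\,\mathrm{poly}(n)\text{ CoT steps}}.
```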
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- chain-of-thought concept