Fact — reference — Knowledge Tree

The paper 'Training nonlinear transformers for chain-of-thought inference: a theoretical generalization analysis' provides a theoretical analysis of how nonlinear transformers generalize when trained for chain-of-thought inference.

Authors

Person: Not available
Organization: arXiv

Sources

A Survey on the Theory and Mechanism of Large Language Models (arXiv, arxiv.org)

Referenced by nodes (2)

chain-of-thought concept
generalization concept