reference
The paper 'Training nonlinear transformers for chain-of-thought inference: a theoretical generalization analysis' analyzes theoretically how nonlinear transformers generalize when trained for chain-of-thought inference.
