claim
Zakerinia et al. (2025) propose, from a multi-task learning perspective, that the strong generalization of highly overparameterized deep models can be explained by low intrinsic dimensionality: learning is effectively confined to a low-dimensional manifold despite the vast number of parameters.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)
Referenced by nodes (1)
- multi-task learning concept