claim
The non-universal scaling exponents in large language models are linked to the intrinsic dimension of the data manifold.
