claim
Pan et al. (2025c) use the Kolmogorov structure function to demonstrate that large language models learn syntactic patterns first and then acquire factual knowledge in order of frequency (more frequent facts before rarer ones), linking model capacity and data size to scaling laws.
