Claim
The pretraining error of large language models (LLMs) is decomposed into a generalization error term and an approximation error term; the generalization error is then upper bounded via the PAC-Bayes framework.

Authors

Sources

Referenced by nodes (1)