Reference
The paper 'SoftDedup: an efficient data reweighting method for speeding up language model pre-training' was published in the Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) in Bangkok, Thailand, pages 4011–4022.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org)