Kandpal et al. (2022) argue that repeated sequences in the training data are the primary driver of the memorization that creates privacy risks, and they demonstrate that retraining models on sequence-level deduplicated data significantly reduces those risks.
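
To make "sequence-level deduplicated data" concrete, here is a minimal sketch of exact-match deduplication. It assumes a duplicate means an identical sequence after whitespace normalization. The function name `dedup_sequences` and the toy corpus are hypothetical, and this is not presented as the procedure used by the cited authors.

```python
import hashlib
from typing import Iterable, List


def dedup_sequences(sequences: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each sequence and drop exact repeats.

    Assumption: two sequences count as duplicates when they are identical
    after collapsing runs of whitespace. Near-duplicates are not detected.
    """
    seen = set()
    unique = []
    for seq in sequences:
        # Normalize whitespace so trivially reformatted copies hash the same.
        normalized = " ".join(seq.split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(seq)
    return unique


# Hypothetical toy corpus: two of the four sequences are repeats.
corpus = ["the cat sat", "a dog ran", "the  cat sat", "a dog ran"]
deduped = dedup_sequences(corpus)
```

Hashing each sequence keeps memory use proportional to the number of distinct sequences rather than their total length. Catching near-duplicates or repeated substrings would need a different technique than this exact-match sketch.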

Authors

Person: Not available
Organization: arXiv
A Survey on the Theory and Mechanism of Large Language Models

Sources

A Survey on the Theory and Mechanism of Large Language Models (arXiv, arxiv.org)

Referenced by nodes (1)

memorization concept