reference
The paper 'Sparse Autoencoders Find Highly Interpretable Features in Language Models' is an arXiv preprint (arXiv:2309.08600) on the interpretability of language models.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, via serper)
Referenced by nodes (1)
- Language Model concept