reference
Sundararajan et al. (2017) proposed Integrated Gradients, an axiomatic attribution method that assigns a contribution score to each input feature for token-level importance analysis in neural NLP and LLM outputs.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models arxiv.org via serper
Referenced by nodes (1)
- LLM outputs concept