claim
The ICR Score (Information Contribution to Residual Stream) is a metric, and the ICR Probe is a detector built on it, for reference-free hallucination detection. The approach aggregates layer-wise residual-stream updates and feeds them to a lightweight MLP, which outperforms prior hidden-state baselines.
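To make the idea of "aggregating layer-wise residual updates" concrete, here is a minimal sketch in Python. It is not the published ICR Score definition, which this claim does not spell out. As a stand-in it uses a simple proxy: the norm of each layer's update relative to the norm of the residual stream entering that layer, averaged over tokens. The function names (`layer_update_features`, `init_mlp`, `mlp_probe`), the array shapes, and the untrained random weights are all assumptions made for illustration.

```python
import numpy as np


def layer_update_features(hidden_states: np.ndarray) -> np.ndarray:
    """Summarise each layer's residual update as one number.

    hidden_states has shape (num_layers + 1, num_tokens, d_model): the
    residual stream before the first layer and after every layer.
    Returns shape (num_layers,). This ratio is an illustrative proxy,
    NOT the ICR Score from the paper.
    """
    updates = hidden_states[1:] - hidden_states[:-1]            # (L, T, d)
    update_norm = np.linalg.norm(updates, axis=-1)              # (L, T)
    stream_norm = np.linalg.norm(hidden_states[:-1], axis=-1)   # (L, T)
    ratio = update_norm / (stream_norm + 1e-8)                  # avoid divide-by-zero
    return ratio.mean(axis=1)                                   # average over tokens


def init_mlp(in_dim: int, hidden_dim: int, seed: int = 0) -> dict:
    """Random, untrained weights for a one-hidden-layer MLP (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.1, (hidden_dim,)),
        "b2": 0.0,
    }


def mlp_probe(features: np.ndarray, params: dict) -> float:
    """Forward pass: ReLU hidden layer, then a sigmoid hallucination score in (0, 1)."""
    hidden = np.maximum(0.0, features @ params["w1"] + params["b1"])
    logit = hidden @ params["w2"] + params["b2"]
    return float(1.0 / (1.0 + np.exp(-logit)))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    states = rng.normal(size=(13, 5, 16))    # 12 layers, 5 tokens, d_model = 16
    feats = layer_update_features(states)
    params = init_mlp(feats.shape[0], hidden_dim=8)
    print(feats.shape, mlp_probe(feats, params))
```

In practice the MLP weights would be trained on responses labelled as hallucinated or faithful. Because the input is only one feature per layer, the probe stays small regardless of the model's hidden size.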
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (1)
- hallucination detection concept