procedure
A lightweight classifier for hallucination detection that operates on the model's hidden states for the input, before any text is generated, and then intervenes on those states to steer Large Language Models toward factual outputs. The method is reported to give consistent improvements in factual accuracy with minimal computational overhead. It is evaluated with accuracy as the metric on the NQ-Open, MMLU, MedMCQA, and GSM8K datasets.
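The general idea of probing hidden states and then nudging them can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the source: the probe is a plain logistic regression, the hidden states are synthetic Gaussian vectors rather than real model activations, the intervention is a fixed-size shift against the probe's weight direction, and the names `train_probe`, `hallucination_risk`, `steer`, `threshold`, and `alpha` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_probe(H, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe by gradient descent.

    H: (n, d) hidden states; y: (n,) labels, 1 = state preceded a hallucination.
    """
    n, d = H.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(H @ w + b)
        w -= lr * (H.T @ (p - y) / n)
        b -= lr * np.mean(p - y)
    return w, b


def hallucination_risk(h, w, b):
    """Probe's estimated probability that state h leads to a hallucination."""
    return sigmoid(h @ w + b)


def steer(h, w, b, threshold=0.5, alpha=2.0):
    """Assumed intervention: shift h against the probe direction if risk is high."""
    if hallucination_risk(h, w, b) <= threshold:
        return h  # low risk: leave the state untouched
    direction = w / (np.linalg.norm(w) + 1e-12)
    return h - alpha * direction


# Synthetic stand-in for hidden states (assumption: two Gaussian clusters).
rng = np.random.default_rng(0)
d = 16
mu = np.zeros(d)
mu[0] = 1.5
H_fact = rng.normal(size=(200, d)) - mu  # states labeled factual
H_hall = rng.normal(size=(200, d)) + mu  # states labeled hallucinated
H = np.vstack([H_fact, H_hall])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b = train_probe(H, y)
h = H_hall[0]
h_steered = steer(h, w, b)
print("risk before:", hallucination_risk(h, w, b))
print("risk after: ", hallucination_risk(h_steered, w, b))
```

In this sketch the probe adds a single dot product per input and the intervention a single vector subtraction, which is one way a method of this kind could keep its overhead small. With a real model, `H` would be replaced by activations extracted from a chosen layer, and the choice of layer and step size would need to be tuned.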
