procedure
The BAFH framework is a lightweight method that trains a feedforward classifier on hidden states of Large Language Models to determine belief states and classify hallucination types, as evaluated against MIND and SAR baselines using Gemma-2, Llama-3.1, and Mistral models.

Authors

Sources

Referenced by nodes (4)