procedure
The BAFH framework is a lightweight method that trains a feedforward classifier on the hidden states of large language models (LLMs) to determine belief states and classify hallucination types. It is evaluated against the MIND and SAR baselines using Gemma-2, Llama-3.1, and Mistral models.
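To make the idea of a classifier over hidden states concrete, here is a minimal sketch of a one-hidden-layer feedforward probe. It is not the BAFH implementation: the hidden states are synthetic Gaussian clusters rather than activations extracted from Gemma-2, Llama-3.1, or Mistral, and the three-class label set, the layer sizes, the learning rate, and the names `forward` and `train` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for LLM hidden states (assumption: real features would be
# extracted from a model's intermediate layers, which this sketch does not do).
N_CLASSES, DIM, N_PER_CLASS = 3, 16, 100  # placeholder label count and sizes
class_means = rng.normal(0.0, 2.0, size=(N_CLASSES, DIM))
X = np.concatenate([m + rng.normal(size=(N_PER_CLASS, DIM)) for m in class_means])
y = np.repeat(np.arange(N_CLASSES), N_PER_CLASS)

# Shuffle, then hold out 20% of the examples for evaluation.
order = rng.permutation(len(y))
X, y = X[order], y[order]
split = int(0.8 * len(y))
X_train, y_train, X_test, y_test = X[:split], y[:split], X[split:], y[split:]

# Standardize features using training statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-8
X_train, X_test = (X_train - mu) / sigma, (X_test - mu) / sigma


def forward(params, X):
    """Return ReLU activations and softmax class probabilities."""
    W1, b1, W2, b2 = params
    H = np.maximum(0.0, X @ W1 + b1)
    logits = H @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    return H, P


def train(X, y, hidden=32, lr=0.1, epochs=300):
    """Full-batch gradient descent on softmax cross-entropy."""
    n, d = X.shape
    params = [rng.normal(0.0, 0.1, (d, hidden)), np.zeros(hidden),
              rng.normal(0.0, 0.1, (hidden, N_CLASSES)), np.zeros(N_CLASSES)]
    Y = np.eye(N_CLASSES)[y]  # one-hot targets
    for _ in range(epochs):
        H, P = forward(params, X)
        dZ2 = (P - Y) / n                        # gradient w.r.t. output logits
        dZ1 = (dZ2 @ params[2].T) * (H > 0)      # backprop through ReLU
        grads = [X.T @ dZ1, dZ1.sum(axis=0), H.T @ dZ2, dZ2.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g                          # in-place parameter update
    return params


params = train(X_train, y_train)
preds = forward(params, X_test)[1].argmax(axis=1)
acc = float((preds == y_test).mean())
print(f"held-out accuracy on synthetic data: {acc:.2f}")
```

In a real setting, `X` would be replaced by hidden-state vectors collected from the language model and `y` by the annotated belief-state or hallucination-type labels; the accuracy printed here only reflects the synthetic clusters and says nothing about the reported BAFH results.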
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (4)
- Large Language Models concept
- LLaMA concept
- MIND concept
- Mistral AI entity