measurement
Performance metrics for hallucination detection techniques, averaged over the Wikipedia and generative AI synthetic datasets:
- Token Similarity Detector: Accuracy 0.47, Precision 0.96, Recall 0.03, Cost 0, Explainability Yes
- Semantic Similarity Detector: Accuracy 0.48, Precision 0.90, Recall 0.02, Cost K sentences, Explainability Yes
- LLM Prompt-Based Detector: Accuracy 0.75, Precision 0.94, Recall 0.53, Cost 1, Explainability Yes
- BERT Stochastic Checker: Accuracy 0.76, Precision 0.72, Recall 0.90, Cost N+1 samples, Explainability Yes
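The accuracy, precision, and recall figures above follow the standard confusion-matrix definitions, treating "hallucinated" as the positive class. The sketch below shows how such figures are computed and why a detector can pair high precision with very low recall. The `Confusion` class, the helper functions, and the counts are all hypothetical and chosen only for illustration; they are not the evaluation data behind the reported numbers.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, with 'hallucinated' as the positive class."""
    tp: int  # hallucinated answers correctly flagged
    fp: int  # faithful answers wrongly flagged
    fn: int  # hallucinated answers that were missed
    tn: int  # faithful answers correctly passed


def accuracy(c: Confusion) -> float:
    # Share of all answers that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: Confusion) -> float:
    # Of the answers that were flagged, the share that were truly hallucinated.
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    # Of the truly hallucinated answers, the share that were flagged.
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


# Hypothetical counts (NOT from the source): a conservative detector that
# flags only 4 of 200 answers, 3 of them correctly.
example = Confusion(tp=3, fp=1, fn=97, tn=99)

print(f"accuracy={accuracy(example):.2f}")
print(f"precision={precision(example):.2f}")
print(f"recall={recall(example):.2f}")
```

A detector that rarely flags anything is usually right when it does flag, so precision stays high, but it misses most hallucinations, so recall is low and accuracy on a roughly balanced dataset stays near one half. This is the same pattern the similarity-based detectors show in the reported figures.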
Sources
- Detect hallucinations for RAG-based systems - AWS (aws.amazon.com)
Referenced by nodes (1)
- hallucination detection concept