measurement
On the CovidQA dataset, RAGAS Faithfulness detects hallucinations reasonably well but remains less effective than the Trustworthy Language Model (TLM).
Sources
- Benchmarking Hallucination Detection Methods in RAG, Cleanlab (cleanlab.ai)
Referenced by nodes (2)
- CovidQA concept
- Trustworthy Language Model concept