measurement
The default version of the RAGAS Faithfulness metric failed to produce a score for 83.5% of the examples in the FinanceBench dataset.
Sources
- Benchmarking Hallucination Detection Methods in RAG, Cleanlab (cleanlab.ai)
Referenced by nodes (1)
- FinanceBench concept