claim
In the FinanceBench dataset, hallucinated responses often contain incorrect numerical values.
Authors
Sources
- Benchmarking Hallucination Detection Methods in RAG - Cleanlab (cleanlab.ai), via serper
Referenced by nodes (1)
- FinanceBench concept