claim
PHANTOM is a benchmark dataset designed for evaluating hallucination detection in long-context financial question answering.
Authors
Sources
- A Benchmark for Hallucination Detection in Financial Long-Context QA (neurips.cc)
Referenced by nodes (1)
- hallucination detection concept