claim
The Datadog hallucination detection method showed the smallest drop in F1 scores between HaluBench and RAGTruth, suggesting robustness as hallucinations become harder to detect.

Authors

Sources

Referenced by nodes (4)