HaluBench is a partially synthetic hallucination benchmark dataset. Its negative examples (non-hallucinated answers) are derived from existing question answering benchmarks, including HaluEval, DROP, CovidQA, FinanceBench, and PubMedQA.

Authors

Person: Aritra Biswas, Noé Vernier
Organization: Datadog

Sources

Detecting hallucinations with LLM-as-a-judge: Prompt ... - Datadog (www.datadoghq.com), Aritra Biswas, Noé Vernier · Datadog

Referenced by nodes (5)

CovidQA concept
DROP concept
PubMedQA concept
HaluEval concept
FinanceBench concept