Fact — formula — Knowledge Tree

The Cleanlab benchmark evaluates hallucination detectors by AUROC (area under the receiver operating characteristic curve), defined as the probability that a detector's score is lower for an example where the LLM responded incorrectly than for an example where the LLM responded correctly.
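
The pairwise definition above can be computed directly by comparing every incorrect-response score with every correct-response score. The following is a minimal sketch, not code from the Cleanlab benchmark: the function name `pairwise_auroc`, the example scores, and the choice to count tied scores as half a win (a common convention) are all assumptions made for illustration.

```python
from itertools import product


def pairwise_auroc(scores_incorrect, scores_correct):
    """Estimate AUROC as the fraction of (incorrect, correct) pairs in which
    the incorrect response received the lower detector score.

    Assumption: ties count as half a win, a common convention that the
    source definition does not spell out.
    """
    if not scores_incorrect or not scores_correct:
        raise ValueError("need at least one score in each group")

    wins = 0.0
    for s_bad, s_good in product(scores_incorrect, scores_correct):
        if s_bad < s_good:
            wins += 1.0  # detector ranked the incorrect response lower
        elif s_bad == s_good:
            wins += 0.5  # tie: counted as half (assumed convention)

    return wins / (len(scores_incorrect) * len(scores_correct))


# Hypothetical detector scores (higher = more trustworthy response).
incorrect = [0.10, 0.40, 0.35]  # LLM responded incorrectly
correct = [0.80, 0.30, 0.90]    # LLM responded correctly

auroc = pairwise_auroc(incorrect, correct)
print(f"AUROC = {auroc:.3f}")
```

With these made-up scores, 7 of the 9 pairs are ordered as desired, so the estimate is 7/9. A value of 1.0 would mean the detector always scores incorrect responses lower than correct ones, while 0.5 corresponds to random ordering.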

Authors

Person: Not available
Organization: Cleanlab
Benchmarking Hallucination Detection Methods in RAG - Cleanlab

Sources

Benchmarking Hallucination Detection Methods in RAG - Cleanlab (cleanlab.ai)

Referenced by nodes (2)

Cleanlab entity
AUROC concept