measurement
The Cleanlab RAG benchmark measures how well each hallucination-detection method performs using the area under the receiver operating characteristic curve (AUROC).
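As a rough illustration of what AUROC captures, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auroc`, the toy labels and scores, and the convention that label 1 marks an incorrect response and a higher score means more suspicion are all assumptions made for this example, not details taken from the benchmark.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data (assumption): 1 = incorrect response, 0 = correct response.
labels = [1, 0, 1, 0, 0, 1]
# Hypothetical detector scores (assumption): higher = more likely incorrect.
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]

result = auroc(scores, labels)
print(f"AUROC = {result:.3f}")
```

A value of 1.0 would mean the detector ranks every incorrect response above every correct one, while 0.5 corresponds to random ranking. The quadratic loop is kept for readability; for large evaluation sets a rank-based implementation such as scikit-learn's `roc_auc_score` is the usual choice.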
Sources
- Real-Time Evaluation Models for RAG: Who Detects Hallucinations ... (cleanlab.ai)