claim
In the Cleanlab RAG benchmark, a detector with a high AUROC score more consistently assigns lower scores to incorrect RAG responses than to correct ones.
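
To make the ranking interpretation concrete, here is a minimal sketch of AUROC computed as the fraction of (correct, incorrect) response pairs in which the correct response receives the higher score. The function name `auroc`, the two detectors, and all scores and labels are hypothetical illustrations, not data or code from the Cleanlab benchmark.

```python
def auroc(scores, is_correct):
    """Pairwise AUROC: share of (correct, incorrect) pairs ranked properly.

    A pair counts as 1 when the correct response has the higher score,
    0.5 when the two scores tie, and 0 otherwise.
    """
    correct = [s for s, ok in zip(scores, is_correct) if ok]
    incorrect = [s for s, ok in zip(scores, is_correct) if not ok]
    if not correct or not incorrect:
        raise ValueError("need at least one correct and one incorrect response")
    wins = sum(
        1.0 if c > i else 0.5 if c == i else 0.0
        for c in correct
        for i in incorrect
    )
    return wins / (len(correct) * len(incorrect))


# Hypothetical labels: True marks a correct RAG response.
is_correct = [True, True, True, False, False]

# Hypothetical detector scores (higher means "more likely correct").
detector_a = [0.9, 0.8, 0.7, 0.3, 0.2]  # every incorrect response scores lowest
detector_b = [0.9, 0.4, 0.7, 0.8, 0.2]  # one incorrect response scores 0.8

auroc_a = auroc(detector_a, is_correct)
auroc_b = auroc(detector_b, is_correct)
print(f"detector A: {auroc_a:.3f}")
print(f"detector B: {auroc_b:.3f}")
```

In this toy data, detector A orders all six pairs correctly, while detector B misorders two of them, so A ends up with the higher AUROC. The pairwise loop is quadratic in the number of responses; it is written this way for readability rather than efficiency.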
