Fact — claim — Knowledge Tree

The Cleanlab RAG benchmark uses OpenAI’s gpt-4o-mini LLM to power both the 'LLM-as-a-judge' and 'TLM' scoring methods.

Authors

Person: Not available Organization: Cleanlab
Real-Time Evaluation Models for RAG: Who Detects Hallucinations ...

Sources

Real-Time Evaluation Models for RAG: Who Detects Hallucinations ... cleanlab.ai Cleanlab via serper

Referenced by nodes (4)

LLM-as-a-judge concept
OpenAI entity
gpt-4o-mini concept
TLM concept