claim
Evaluation techniques such as 'LLM-as-a-judge' or 'TLM' (Trustworthy Language Model) can be powered by any Large Language Model and do not require specific data preparation, labeling, or custom model infrastructure, provided the user has infrastructure to run pre-trained LLMs like AWS Bedrock, Azure/OpenAI, Gemini, or Together.ai.
Authors
Sources
- Real-Time Evaluation Models for RAG: Who Detects Hallucinations ... cleanlab.ai via serper
Referenced by nodes (5)
- LLM-as-a-judge concept
- OpenAI entity
- Gemini concept
- Trustworthy Language Model concept
- TLM concept