claim
LLM-based evaluation, particularly with GPT-4 as the evaluator, yields the best overall results for detecting hallucinations in language model outputs.
