claim
According to Thakur et al. (2025), LLM-as-Judge evaluation agrees more closely with human assessments of factual correctness than ROUGE does.
