Perspective
The authors of 'Re-evaluating Hallucination Detection in LLMs' argue that Large Language Model responses should be evaluated against human-aligned metrics rather than ROUGE.
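As an illustration of this concern, the sketch below is not from the paper: the example sentences and the simplified ROUGE-1 (unigram-overlap F1) scorer are assumptions for demonstration only. It shows how a lexical-overlap metric can score a response that hallucinates a key fact higher than a faithful paraphrase.

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 between candidate and reference."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical sentences (not from the paper): the hallucinated answer keeps
# the reference wording but swaps one key fact; the faithful answer paraphrases.
reference    = "the treaty was signed in 1951 in san francisco"
faithful     = "the agreement was concluded in san francisco during 1951"
hallucinated = "the treaty was signed in 1951 in los angeles"

print(f"faithful paraphrase: {rouge1_f1(reference, faithful):.2f}")      # ~0.67
print(f"hallucinated answer: {rouge1_f1(reference, hallucinated):.2f}")  # ~0.78
```

A human-aligned metric, by contrast, would be expected to penalize the swapped fact regardless of how much surface wording the response shares with the reference.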
Sources
- Re-evaluating Hallucination Detection in LLMs (arXiv, arxiv.org)
Referenced by nodes (1)
- ROUGE concept