claim
The BLEU metric scores zero when there are no shared n-grams or subsequences between a model's generated response and the ground truth, even if the model's answer is semantically correct.
Authors
Sources
- Detecting and Evaluating Medical Hallucinations in Large Vision ... arxiv.org via serper
Referenced by nodes (1)
- BLEU concept