measurement
The Med-HallMark benchmark evaluates AI models on hallucination detection using the MediHall Score alongside traditional text-generation metrics: BERTScore, METEOR, ROUGE-1, ROUGE-2, ROUGE-L, and BLEU.
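To make the overlap-based metrics concrete, here is a minimal sketch of ROUGE-1, ROUGE-2, and ROUGE-L computed as F1 scores. It assumes lowercased whitespace tokenization with no stemming, so its values may differ from those of the official ROUGE tooling. The function names (`rouge_n`, `rouge_l`) and the example sentences are illustrative and are not taken from Med-HallMark. The MediHall Score is not reproduced here because this entry does not define it.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def f1(overlap, cand_total, ref_total):
    """Combine precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical model output and reference answer, for illustration only.
    output = "the scan shows a small nodule"
    answer = "the scan shows a large nodule"
    print(rouge_n(output, answer, 1), rouge_n(output, answer, 2), rouge_l(output, answer))
```

In this example a single changed word ("small" versus "large") still leaves most unigrams and the longest common subsequence intact, which illustrates why surface-overlap metrics can rate a factually different answer highly.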
