claim
When evaluated for hallucination detection, GPT-4V and GPT-4o followed instructions well but misclassified hallucination types in Large Vision-Language Model (LVLM) outputs, and failed to recognize these errors even when prompted to explain their classifications.
Authors
Sources
- Detecting and Evaluating Medical Hallucinations in Large Vision ... (arxiv.org, via Serper)
Referenced by nodes (3)
- hallucination detection concept
- GPT-4 concept
- Large Vision-Language Models concept