Relations (1)
related 11.00 — strongly supported by 11 facts
Hallucination is a critical failure mode in Large Vision-Language Models (LVLMs), defined as the generation of content inconsistent with the visual input [1], and one that researchers are actively working to detect and mitigate {fact:1, fact:9, fact:11}. These hallucinations are categorized into specific types [2], and models' susceptibility to them is shaped by several causal pathways and architectural limitations {fact:2, fact:7}.
Facts (11)
Sources
Detecting and Evaluating Medical Hallucinations in Large Vision ... arxiv.org 9 facts
claim: Hallucination in Large Vision-Language Models (LVLMs) is defined as the generation of descriptions that are inconsistent with the relevant images and user instructions, containing incorrect objects, attributes, and relationships related to the visual input.
claim: Large Vision-Language Models (LVLMs) inherit their susceptibility to hallucinations from the underlying Large Language Models (LLMs), which poses significant risks in high-stakes medical contexts.
reference: Junyang Wang, Yiyang Zhou, Guohai Xu, Pengcheng Shi, Chenlin Zhao, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Jihua Zhu, et al., 'Evaluation and analysis of hallucination in large vision-language models'.
claim: Accuracy metrics for Large Vision-Language Models evaluate at a coarse semantic level and cannot distinguish between different degrees of hallucination in the output.
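To make this concrete, here is a minimal Python sketch (all outputs, fact sets, and scores below are hypothetical) showing that a coarse exact-match accuracy assigns the same zero score to a mildly and a severely hallucinated answer, while a graded metric separates them:

```python
def exact_match_accuracy(pred: str, gold: str) -> int:
    """Coarse, binary scoring: any deviation from the reference counts the same."""
    return int(pred.strip().lower() == gold.strip().lower())

def hallucinated_fraction(pred_facts: set[str], image_facts: set[str]) -> float:
    """Graded scoring: fraction of stated facts unsupported by the image."""
    if not pred_facts:
        return 0.0
    return len(pred_facts - image_facts) / len(pred_facts)

image_facts = {"dog", "ball", "grass"}
mild = {"dog", "ball", "frisbee"}             # one unsupported object
severe = {"cat", "frisbee", "pool", "boat"}   # almost everything fabricated

# Exact match collapses both answers to the same score ...
print(exact_match_accuracy("a dog with a frisbee", "a dog with a ball"))  # 0
print(exact_match_accuracy("a cat by a pool", "a dog with a ball"))       # 0
# ... while the graded metric distinguishes degrees of hallucination.
print(hallucinated_fraction(mild, image_facts))    # 0.33...
print(hallucinated_fraction(severe, image_facts))  # 1.0
```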
claim: For general Large Vision-Language Models (LVLMs), hallucinations are typically categorized into three types: object hallucinations, attribute hallucinations, and relational hallucinations.
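As an illustration of this three-way taxonomy, a short sketch with hypothetical captions for an image showing a black cat sitting on a red sofa:

```python
from enum import Enum

class HallucinationType(Enum):
    OBJECT = "object"          # mentions an entity absent from the image
    ATTRIBUTE = "attribute"    # correct entity, wrong property (color, count, size)
    RELATIONAL = "relational"  # correct entities, wrong spatial/semantic relation

# Ground truth (hypothetical): a black cat sitting ON a red sofa.
examples = {
    HallucinationType.OBJECT: "A dog is sleeping on the sofa.",
    HallucinationType.ATTRIBUTE: "A white cat is sitting on the sofa.",
    HallucinationType.RELATIONAL: "A black cat is sitting under the sofa.",
}
for kind, sentence in examples.items():
    print(f"{kind.value:>10}: {sentence}")
```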
reference: Xintong Wang et al., 'Mitigating hallucinations in large vision-language models with instruction contrastive decoding', arXiv preprint, 2024.
claim: In Large Vision-Language Models, the hallucination phenomenon is exacerbated by factors including insufficient visual feature extraction capability, misalignment of multimodal features, and the incorporation of additional information.
reference: Wenyi Xiao et al., 'Detecting and mitigating hallucination in large vision language models via fine-grained AI feedback', arXiv preprint, 2024.
reference: The MediHall Score is a medical evaluation metric designed to assess hallucinations in Large Vision-Language Models through a hierarchical scoring system that weights the severity and type of each hallucination, enabling granular assessment of clinical impact.
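The source does not give the MediHall formula, so the sketch below only illustrates the general shape of a hierarchical type-and-severity metric; every type, severity level, and weight in it is an assumption made for illustration:

```python
# Hypothetical weights: how clinically risky each hallucination type is assumed to be.
TYPE_WEIGHT = {
    "object": 1.0,       # e.g., inventing a lesion that is not in the scan
    "attribute": 0.7,    # e.g., wrong size or laterality of a real finding
    "relational": 0.5,   # e.g., wrong anatomical relation between findings
}
SEVERITY_WEIGHT = {"minor": 0.25, "moderate": 0.5, "critical": 1.0}

def medihall_style_score(hallucinations: list[tuple[str, str]]) -> float:
    """Score in [0, 1]; 1.0 means no detected hallucination.

    `hallucinations` lists the (type, severity) of each error found in one answer.
    """
    penalty = sum(TYPE_WEIGHT[t] * SEVERITY_WEIGHT[s] for t, s in hallucinations)
    return max(0.0, 1.0 - penalty)

print(medihall_style_score([]))                        # 1.0   (clean answer)
print(medihall_style_score([("attribute", "minor")]))  # 0.825 (mild clinical impact)
print(medihall_style_score([("object", "critical")]))  # 0.0   (severe clinical impact)
```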
EdinburghNLP/awesome-hallucination-detection - GitHub github.com 1 fact
claim: Large Vision-Language Model (LVLM) hallucinations originate from three interacting causal pathways: image-to-input-text, image-to-output-text, and text-to-text.
On Hallucinations in Artificial Intelligence–Generated Content ... jnm.snmjournals.org 1 fact
claim: Automatic hallucination detectors trained on benchmark datasets are being explored for large vision-language models to reduce the burden of human evaluation.
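As a toy illustration of that approach, a self-contained sketch that trains a TF-IDF plus logistic-regression classifier on a handful of fabricated caption/output pairs; real detectors are trained on benchmark annotations (e.g., POPE-style object labels) with far stronger models:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training item concatenates a reference caption and a model output.
# All five pairs and their labels are fabricated purely for illustration.
pairs = [
    "caption: a dog on grass [SEP] output: a dog playing on the grass",
    "caption: a dog on grass [SEP] output: a cat swimming in a pool",
    "caption: two people cycling [SEP] output: two cyclists on a road",
    "caption: two people cycling [SEP] output: a crowd watching fireworks",
    "caption: a red car parked [SEP] output: a red car near the curb",
]
labels = [0, 1, 0, 1, 0]  # 0 = faithful, 1 = hallucinated

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(pairs, labels)

query = "caption: a red car parked [SEP] output: a blue truck driving away"
print(detector.predict_proba([query])[0, 1])  # estimated hallucination probability
```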