reference
The paper "Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key" by Yang et al. (2025) argues that on-policy data are critical for mitigating hallucinations in large vision-language models trained with direct preference optimization (DPO).
Authors
Sources
- Awesome-Hallucination-Detection-and-Mitigation (GitHub, github.com)
Referenced by nodes (3)
- Large Vision-Language Models concept
- hallucination mitigation concept
- Direct Preference Optimization (DPO) concept