Claim
Compared with prompt engineering and supervised fine-tuning, reinforcement learning provides the most robust defense against modality conflict in Multimodal Large Language Models (MLLMs): it trains the model to prioritize visual evidence over misleading textual cues.
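The claim can be made concrete with a toy reward signal for conflict cases: the model is rewarded for answers grounded in the image and penalized for echoing a contradictory text cue. This is a minimal illustrative sketch, not the method from the cited source; the function name, labels, and scoring scheme are all hypothetical.

```python
# Hypothetical sketch: a reward that favors answers grounded in the image
# over answers that echo a misleading text cue during RL fine-tuning.
# All names and values here are illustrative assumptions.

def conflict_reward(answer: str, visual_label: str, text_cue: str) -> float:
    """Score an answer under a visual/textual conflict.

    +1.0  answer agrees with what is actually shown in the image
    -1.0  answer echoes the misleading textual cue instead
     0.0  answer matches neither (e.g. an abstention)
    """
    a = answer.strip().lower()
    if a == visual_label.strip().lower():
        return 1.0
    if a == text_cue.strip().lower():
        return -1.0
    return 0.0

# Toy rollout: the image shows a cat, but the caption misleadingly says "dog".
rewards = [conflict_reward(ans, visual_label="cat", text_cue="dog")
           for ans in ["cat", "dog", "unsure"]]
print(rewards)  # [1.0, -1.0, 0.0]
```

In an actual RL setup these per-answer rewards would feed a policy-gradient objective (e.g. PPO-style updates), so that prompt-only or SFT-only baselines, which never receive this corrective signal, remain more susceptible to following the misleading text.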
Authors
Sources
- EdinburghNLP/awesome-hallucination-detection (GitHub, github.com)
Referenced by nodes (2)
- reinforcement learning concept
- Multimodal Large Language Models concept