Claim
Efforts to mitigate hallucinations at the model level include training-time approaches such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and grounded pretraining, as well as inference-time strategies such as contrastive decoding (sketched below).
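Of these, contrastive decoding is the most self-contained to illustrate: the next token is scored by the gap between a large "expert" model's and a small "amateur" model's log-probabilities, restricted to tokens the expert itself finds plausible. Below is a minimal sketch in the spirit of Li et al. (2022), not the cited survey's own implementation; the function name `contrastive_decode_step`, the `alpha` cutoff value, and the toy logit arrays are hypothetical, chosen for illustration.

```python
import numpy as np

def contrastive_decode_step(expert_logits, amateur_logits, alpha=0.1):
    """Pick the next token by contrastive decoding:
    score = log p_expert - log p_amateur, restricted to tokens the
    expert rates within a factor alpha of its most likely token."""
    # Convert raw logits to log-probabilities for both models.
    expert_logp = expert_logits - np.logaddexp.reduce(expert_logits)
    amateur_logp = amateur_logits - np.logaddexp.reduce(amateur_logits)

    # Plausibility constraint: keep only tokens the expert assigns
    # at least alpha times its maximum probability.
    cutoff = np.log(alpha) + expert_logp.max()
    plausible = expert_logp >= cutoff

    # Contrastive score: reward tokens the expert likes much more
    # than the amateur; mask out implausible tokens entirely.
    scores = np.where(plausible, expert_logp - amateur_logp, -np.inf)
    return int(np.argmax(scores))

# Toy vocabulary of 5 tokens with hypothetical logits.
expert = np.array([2.0, 1.8, 0.5, -1.0, -2.0])
amateur = np.array([2.1, 0.2, 0.4, -1.0, -2.0])
print(contrastive_decode_step(expert, amateur))  # -> 1
```

Token 0 is likely under both models, so its contrastive score is low; token 1 is likely only under the expert, so it wins. The intuition is that tokens both models agree on reflect generic surface statistics, while the expert-only preference is more likely to carry grounded content.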
Authors
Sources
- Survey and analysis of hallucinations in large language models (www.frontiersin.org)
Referenced by nodes (3)
- Reinforcement learning from human feedback (RLHF) concept
- supervised fine-tuning concept
- RLHF concept