Reference
Instruction tuning and reinforcement learning from human feedback (RLHF) improve how well models follow prompts, but they do not eliminate deep-seated hallucinations, as noted by Ouyang et al. (2022) and Kadavath et al. (2022).
Sources
- Survey and analysis of hallucinations in large language models (www.frontiersin.org)
Referenced by nodes (2)
- Reinforcement learning from human feedback (RLHF) concept
- Instruction tuning concept