Instruction tuning and reinforcement learning from human feedback (RLHF) improve prompt responsiveness but do not eliminate deep-seated model hallucinations, as noted by Ouyang et al. (2022) and Kadavath et al. (2022).
