claim
Human annotators who rate large language model responses during instruction tuning and RLHF tend to prefer confident, direct-sounding answers over uncertain, hedged ones.
Authors
Sources
- Hallucination Causes: Why Language Models Fabricate Facts (mbrenndoerfer.com, via serper)
Referenced by nodes (2)
- instruction tuning concept
- RLHF concept