claim
Large language models trained with supervised fine-tuning learn a confident, well-structured prose style because human annotators tend to write that way when demonstrating ideal answers.

Authors

Sources

Referenced by nodes (2)