claim
Prompts are used to augment labeled data with reasoning chains, either for supervised fine-tuning (SFT) itself or for the SFT initialization step that precedes reinforcement learning (RL).
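
As a rough illustration of the claim, the sketch below shows one way a prompt could attach a reasoning chain to each labeled example before SFT. Everything named here is an assumption for illustration only: the `RATIONALE_PROMPT` wording, the `augment_with_reasoning` helper, the `question`/`answer`/`input`/`target` field names, and the `generate` callable, which stands in for whatever model interface is actually used.

```python
from typing import Callable, Dict, List

# Hypothetical prompt template; the wording is an assumption, not from the source.
RATIONALE_PROMPT = (
    "Question: {question}\n"
    "Correct answer: {answer}\n"
    "Explain step by step why this answer is correct."
)


def augment_with_reasoning(
    examples: List[Dict[str, str]],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Attach a model-written reasoning chain to each labeled example.

    `generate` is a placeholder for any text-generation call (assumed interface:
    prompt string in, completion string out).
    """
    records = []
    for ex in examples:
        prompt = RATIONALE_PROMPT.format(question=ex["question"], answer=ex["answer"])
        rationale = generate(prompt).strip()
        if not rationale:
            continue  # skip examples for which no rationale was produced
        records.append({
            "input": ex["question"],
            # SFT target: reasoning chain first, then the original gold label.
            "target": f"{rationale}\nAnswer: {ex['answer']}",
        })
    return records
```

The resulting records could serve either as plain SFT data or as the initialization set for a later RL stage; the gold label is kept unchanged so that only the reasoning chain is model-written.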
