claim
The temperature parameter in large language models divides the logits before the softmax is applied, rescaling the output distribution prior to sampling: higher values (T > 1) flatten the distribution, increasing sampling diversity and the risk of hallucination, while lower values (T < 1) sharpen the distribution toward the most probable tokens, approaching greedy decoding as T → 0.
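A minimal sketch of the mechanism described above, assuming the standard formulation (logits divided by temperature, then softmax); the function names are illustrative, not from any particular library:

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max before exponentiating for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]
# T < 1 concentrates probability on the highest-logit token;
# T > 1 spreads probability toward lower-logit tokens.
sharp = softmax_with_temperature(logits, 0.5)
flat = softmax_with_temperature(logits, 2.0)
```

At T = 0.5 the top token's probability exceeds its probability at T = 2.0, while the lowest-logit token gains mass as T rises, matching the flattening/sharpening behavior the claim describes.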

Authors

Sources

Referenced by nodes (2)