claim
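Temperature scaling in large language models divides the logits by a temperature T before the softmax, reshaping the token probability distribution before sampling occurs: T < 1 sharpens the distribution toward the most likely tokens, and T > 1 flattens it.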
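
The claim can be illustrated with a minimal sketch, assuming the common formulation in which logits are divided by the temperature before a softmax. The function names `temperature_softmax` and `sample_token` and the example logits are hypothetical and chosen only for illustration; they are not taken from any particular library.

```python
import math
import random


def temperature_softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Scale the logits first; this is the step that reshapes the distribution.
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token index from the temperature-scaled distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
example_logits = [2.0, 1.0, 0.1]
cold = temperature_softmax(example_logits, temperature=0.5)  # sharper
hot = temperature_softmax(example_logits, temperature=2.0)   # flatter
```

In this sketch the top token receives more probability mass in `cold` than in `hot`, and the scaling is applied before `sample_token` draws an index, which matches the ordering the claim describes.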
