claim
The max_new_tokens parameter caps how many tokens a large language model may generate beyond the prompt. Because each generated token is conditioned on the model's own previous outputs rather than on ground-truth text, errors compound as the sequence grows (exposure bias), so longer generations carry a higher cumulative risk of hallucination.
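A toy sketch of why risk grows with generation length, under a strong simplifying assumption not made by the source: each new token independently goes off-distribution with some fixed probability p. The function and the error rate below are hypothetical illustrations; real exposure-bias dynamics are not independent, since one bad token conditions everything after it.

```python
def at_least_one_error(p: float, n_new_tokens: int) -> float:
    """P(at least one erroneous token) among n_new_tokens independent
    draws, each erroneous with probability p. A hypothetical stand-in
    for cumulative divergence over a generation of that length."""
    return 1.0 - (1.0 - p) ** n_new_tokens


# Risk rises monotonically as the max_new_tokens budget grows.
for n in (16, 64, 256, 1024):
    print(n, round(at_least_one_error(0.01, n), 3))
```

Even with a small per-token error rate, the chance of at least one divergence approaches certainty as the token budget grows, which is the intuition behind the claim.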
