claim
Monitoring latency alongside output quality helps identify an optimal performance balance for LLMs: increased latency can indicate that the model is doing more reasoning (for example, generating a longer chain of thought), so slower responses are not necessarily worse.
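
The claim above can be sketched as a minimal latency-plus-quality measurement loop. This is a hypothetical illustration, not a specific tool's API: `call_model` stands in for a real LLM call (here simulated with a sleep), and `score_quality` is a placeholder for whatever quality metric is in use (a rubric score, an eval-harness result, etc.).

```python
import random
import time


def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call; simulates variable response latency.
    time.sleep(random.uniform(0.01, 0.05))
    return f"answer to: {prompt}"


def score_quality(response: str) -> float:
    # Placeholder quality metric in [0, 1]; replace with a real evaluator.
    return min(1.0, len(response) / 100)


def timed_call(prompt: str) -> tuple[float, float]:
    """Return (latency_seconds, quality_score) for one model call."""
    start = time.perf_counter()
    response = call_model(prompt)
    latency = time.perf_counter() - start
    return latency, score_quality(response)


if __name__ == "__main__":
    # Collect paired (latency, quality) samples to inspect the trade-off.
    samples = [timed_call(f"question {i}") for i in range(5)]
    for latency, quality in samples:
        print(f"latency={latency:.3f}s quality={quality:.2f}")
```

Recording latency and quality as pairs, rather than separately, is what lets one ask whether the slower calls are actually the higher-quality ones.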
