measurement
Tokens-per-second (TPS) throughput measures how many tokens an LLM processes or generates per second. It is used to gauge the performance and response speed of LLMs.
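As a minimal sketch of the arithmetic behind this metric, throughput can be computed as a token count divided by elapsed wall-clock time. The function name `tokens_per_second` and the sample figures (480 tokens in 12 seconds) are hypothetical and chosen only for illustration. They do not come from the cited source.

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return throughput as tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# Hypothetical measurement: 480 tokens generated in 12 seconds.
throughput = tokens_per_second(480, 12.0)
print(f"{throughput:.1f} tokens/s")  # prints "40.0 tokens/s"
```

In practice the elapsed time would likely come from a monotonic timer wrapped around the generation call, and whether prompt tokens count toward the total depends on what is being measured.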
Sources
- LLM Observability: How to Monitor AI When It Thinks in Tokens, TTMS (ttms.com)
Referenced by nodes (1)
- Large Language Models concept