measurement
In a retrieval-augmented generation (RAG) system, traces can reveal that 80% of total latency is spent on document retrieval rather than model inference.

Authors

Sources

Referenced by nodes (2)