claim
Embedding similarity metrics for RAG evaluation are deterministic and cheap, but rigid: they reward closeness to the ground-truth answer rather than actual correctness, so a genuinely improved answer can score worse when the ground truth is narrow.
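A minimal sketch of the failure mode described in the claim, under stated assumptions: `toy_embed` is a hypothetical stand-in that builds a bag-of-words count vector, not a real embedding model, and the three sentences are invented for illustration. A real evaluation would swap in an actual embedding model, but the scoring step (cosine similarity against a single reference answer) would look much the same.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(answer, ground_truth):
    """Score an answer by its similarity to the single reference answer."""
    return cosine_similarity(toy_embed(answer), toy_embed(ground_truth))


# Narrow ground truth: one short phrasing of the fact.
ground_truth = "paris is the capital of france"

# Factually wrong, but almost a copy of the reference wording.
wrong_answer = "lyon is the capital of france"

# Factually correct and more informative, but phrased differently.
richer_answer = "france has its capital in paris which is also its largest city"

score_wrong = score(wrong_answer, ground_truth)
score_richer = score(richer_answer, ground_truth)

print(f"wrong but similar:    {score_wrong:.2f}")
print(f"correct but reworded: {score_richer:.2f}")
```

With these toy vectors the wrong answer scores about 0.83 and the correct, richer answer about 0.44. The metric is cheap and repeatable, yet it ranks the two answers in the opposite order from their correctness. Real embedding models soften the effect of rewording but still measure distance to the reference rather than truth.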
Authors
Sources
- RAG Hallucinations: Retrieval Success ≠ Generation Accuracy www.linkedin.com via serper
Referenced by nodes (2)
- ground truth concept
- RAG evaluation concept