cosine similarity
Also known as: cosine similarity score
Facts (15)
Sources
Detect hallucinations for RAG-based systems - AWS aws.amazon.com May 16, 2025 4 facts
formula: The hallucination score in a semantic similarity-based detection system is calculated as 1 minus the cosine similarity score.
code: The similarity_detector function (Python) computes semantic similarity between a RAG context and an LLM answer using BedrockEmbeddings and cosine similarity, returning a float hallucination score equal to 1 minus the cosine similarity of the two embeddings.
import numpy as np
from langchain_aws import BedrockEmbeddings  # import path may differ by LangChain version
from sklearn.metrics.pairwise import cosine_similarity

def similarity_detector(
    context: str,
    answer: str,
    llm: BedrockEmbeddings,
) -> float:
    if len(context) == 0 or len(answer) == 0:
        return 0.0
    # Calculate embeddings for the context and the answer.
    context_emb = llm.embed_query(context)
    answer_emb = llm.embed_query(answer)
    context_emb = np.array(context_emb).reshape(1, -1)
    answer_emb = np.array(answer_emb).reshape(1, -1)
    # Hallucination score = 1 - cosine similarity.
    sim_score = cosine_similarity(context_emb, answer_emb)
    return 1 - sim_score[0][0]
procedure: Semantic similarity-based hallucination detection involves three steps: (1) create embeddings for the answer and the context using an LLM, (2) calculate cosine similarity scores between each sentence in the answer and the context, and (3) tune the decision threshold for a specific dataset to classify hallucinating statements.
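The three-step procedure above can be sketched end to end. The `embed` helper below is a toy bag-of-words stand-in for the LLM embedding model, and the 0.5 threshold is an illustrative assumption, not a value from the AWS post:

```python
import math

def cosine(u, v):
    # Plain cosine similarity; returns 0.0 for a zero vector.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def embed(text, vocab):
    # Toy bag-of-words embedding standing in for an LLM embedding model.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def flag_hallucinations(context, answer_sentences, threshold=0.5):
    # Step 1: embed the context (and, below, each answer sentence).
    vocab = sorted(set(context.lower().split()))
    ctx_emb = embed(context, vocab)
    flagged = []
    for sent in answer_sentences:
        # Step 2: per-sentence cosine similarity against the context.
        score = cosine(ctx_emb, embed(sent, vocab))
        # Step 3: sentences below the tuned threshold are flagged.
        if score < threshold:
            flagged.append(sent)
    return flagged
```

With a real embedding model, step 3's threshold would be tuned on a labeled dataset rather than fixed at 0.5.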
formula: Cosine similarity ranges from -1 to 1 in general; for the non-negative embedding vectors typical in this setting, scores fall between 0 and 1, where 1 represents perfect similarity and 0 represents no similarity.
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... arxiv.org Feb 23, 2026 3 facts
procedure: The entity-level filter evaluates semantic similarity using cosine similarity on encoded representations of the response and the entity description, and evaluates token-level similarity using the intersection of common words.
procedure: The methodology for hallucination detection uses cosine similarity to quantify the similarity between the embedded response and description texts.
claim: The Fuzzy Set Ratio calculates similarity by measuring the intersection of common words between an LLM's response and an entity's description, making it effective for texts that vary in length and word order. It produces a percentage-based similarity score comparable to cosine similarity.
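A word-intersection ratio in the spirit of the claim above can be sketched as follows. This Jaccard-style formulation is an assumption for illustration, not the benchmark's exact Fuzzy Set Ratio formula:

```python
def fuzzy_set_ratio(response: str, description: str) -> float:
    """Percentage similarity from the intersection of common words.

    Simplified token-set-style matching: insensitive to word order
    and to length differences, as the claim notes.
    """
    a = set(response.lower().split())
    b = set(description.lower().split())
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)
```

Because both texts are reduced to word sets, reordering the words leaves the score unchanged.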
Efficient Knowledge Graph Construction and Retrieval from ... - arXiv arxiv.org Aug 7, 2025 2 facts
reference: The knowledge graph construction and retrieval system described in the arXiv paper 'Efficient Knowledge Graph Construction and Retrieval from ...' uses the Milvus vector database to store and retrieve both chunk and relation embeddings, which are then used to compute cosine similarity with a query.
procedure: The GraphRAG retrieval process uses a two-stage strategy: first, a high-recall one-hop graph traversal to identify candidate nodes, followed by a dense vector-based re-ranking step using OpenAI embeddings and cosine similarity to refine the results.
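The two-stage strategy can be sketched under simplifying assumptions: an adjacency-dict graph and precomputed node embeddings (both illustrative stand-ins, not the paper's Milvus-backed data structures):

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

def two_stage_retrieve(query_emb, graph, node_embs, seed_nodes, k=2):
    # Stage 1: high-recall one-hop traversal from the seed nodes.
    candidates = set(seed_nodes)
    for node in seed_nodes:
        candidates.update(graph.get(node, []))
    # Stage 2: dense re-ranking of candidates by cosine similarity
    # against the query embedding.
    ranked = sorted(candidates,
                    key=lambda n: cosine(query_emb, node_embs[n]),
                    reverse=True)
    return ranked[:k]
```

The first stage trades precision for recall (cheap graph hops); the second stage restores precision with the more expensive dense comparison.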
Medical Hallucination in Foundation Models and Their Impact on ... medrxiv.org Nov 2, 2025 1 fact
procedure: The Similarity Score assesses semantic similarity between a model's generated response and the ground truth answer, as well as between the response and the original question, using UMLSBERT and cosine similarity.
The construction and refined extraction techniques of knowledge ... nature.com Feb 10, 2026 1 fact
procedure: The BERTScore evaluation method proceeds in four steps: (1) map the words of the generated text and reference text to the embedding space to obtain word vectors, (2) calculate the cosine similarity for each word pair between the generated and reference texts to form a similarity matrix, (3) calculate Precision (P) as the average similarity of each word vector in the generated text to the most similar word vector in the reference text, and (4) calculate Recall (R) as the average similarity of each word vector in the reference text to the most similar word vector in the generated text.
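Steps 2-4 above reduce to row and column maxima over the similarity matrix. A minimal sketch over already-computed word vectors (so step 1, the BERT embedding, is assumed done; `bertscore_pr` is an illustrative name):

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

def bertscore_pr(gen_vecs, ref_vecs):
    # Step 2: pairwise cosine similarity matrix (generated x reference).
    sim = [[cosine(g, r) for r in ref_vecs] for g in gen_vecs]
    # Step 3: Precision = mean best match for each generated word.
    precision = sum(max(row) for row in sim) / len(gen_vecs)
    # Step 4: Recall = mean best match for each reference word.
    recall = sum(max(sim[i][j] for i in range(len(gen_vecs)))
                 for j in range(len(ref_vecs))) / len(ref_vecs)
    return precision, recall
```

Precision reads the matrix row-wise, recall column-wise; the full BERTScore additionally combines the two into an F1.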
Medical Hallucination in Foundation Models and Their ... medrxiv.org Mar 3, 2025 1 fact
procedure: The Med-HALT benchmark calculates cosine similarity between model output embeddings and two references: Answer Similarity (between the correct option and model output) and Question Similarity (between the original question and model output).
Knowledge Graph Combined with Retrieval-Augmented Generation ... drpress.org Dec 2, 2025 1 fact
reference: Gunawan, Sembiring, and Budiman implemented cosine similarity to calculate text relevance between two documents, published in the Journal of Physics: Conference Series in 2018.
Benchmarking Hallucination Detection Methods in RAG - Cleanlab cleanlab.ai Sep 30, 2024 1 fact
formula: The RAGAS Answer Relevancy metric is defined as the average semantic similarity between the original question and three LLM-generated questions from the answer, measured via cosine similarity between vector embeddings of each question.
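The averaging step of the metric can be sketched as below, assuming the three regenerated questions have already been produced by the LLM and embedded (the function name is illustrative, not RAGAS's API):

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

def answer_relevancy(question_emb, generated_question_embs):
    # Average cosine similarity between the original question and each
    # question the LLM regenerated from the answer (RAGAS uses three).
    sims = [cosine(question_emb, g) for g in generated_question_embs]
    return sum(sims) / len(sims)
```

An answer that drifts from the question yields regenerated questions unlike the original, pulling the average down.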
How to Improve Multi-Hop Reasoning With Knowledge Graphs and ... neo4j.com Jun 18, 2025 1 fact
procedure: The process for preparing documents for retrieval-augmented generation (RAG) involves five steps: (1) Chunk the text by splitting documents into multiple chunks, (2) Generate embeddings by using a text embedding model to create vector representations of the text chunks, (3) Encode the user query by converting the input question into a vector at query time, (4) Perform similarity search by applying algorithms like cosine similarity to compare the distance between the user input vector and the embedded text chunks, and (5) Retrieve top matches by returning the most similar documents to provide context to the large language model.
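The five steps can be sketched end to end with a toy bag-of-words embedder in place of a real embedding model (all names and the fixed-size word chunking are illustrative assumptions):

```python
import math

def embed(text, vocab):
    # Toy bag-of-words embedder standing in for a real embedding model.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(document, query, chunk_size=5, top_k=1):
    words = document.split()
    # Step 1: chunk the text into fixed-size word windows.
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]
    vocab = sorted(set(document.lower().split()))
    # Step 2: embed each chunk.  Step 3: encode the query.
    chunk_embs = [embed(c, vocab) for c in chunks]
    q_emb = embed(query, vocab)
    # Step 4: cosine-similarity search.  Step 5: return top matches.
    ranked = sorted(zip(chunks, chunk_embs),
                    key=lambda pair: cosine(q_emb, pair[1]),
                    reverse=True)
    return [c for c, _ in ranked[:top_k]]
```

In production, the chunk embeddings would be precomputed and stored in a vector index; only step 3 onward runs at query time.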