concept

semantic entropy

Also known as: SemEntropy, semantic entropy-based method

Facts (17)

Sources
Re-evaluating Hallucination Detection in LLMs · arxiv.org · arXiv, Aug 13, 2025 · 5 facts
claim: Among the evaluated hallucination detection techniques, Semantic Entropy is comparatively stable, showing only modest performance variation between the ROUGE and LLM-as-Judge evaluation frameworks.
reference: The paper 'Detecting hallucinations in large language models using semantic entropy' by Farquhar et al. (2024), published in Nature, proposes a semantic entropy method for identifying hallucinations in large language models.
reference: Uncertainty-based methods for hallucination detection in large language models include Perplexity (Ren et al., 2023), Length-Normalized Entropy (LN-Entropy) (Malinin and Gales, 2021), and Semantic Entropy (SemEntropy) (Farquhar et al., 2024), which use multiple generations to capture sequence-level uncertainty.
claim: Semantic Entropy delivers the most consistent performance across zero-shot and few-shot settings, whereas traditional metrics such as Perplexity and LN-Entropy are more sensitive to the change of setting.
claim: Simple length-based heuristics, such as the mean and standard deviation of answer length, rival or exceed the performance of sophisticated hallucination detectors like Semantic Entropy.
Medical Hallucination in Foundation Models and Their ... · medrxiv.org · medRxiv, Mar 3, 2025 · 4 facts
reference: Hou et al. (2024) developed a semantic entropy-based method that analyzes how a model responds to different versions of the same question, distinguishing uncertainty caused by unclear question phrasing from uncertainty due to the model's own knowledge gaps.
procedure: Uncertainty quantification uses sequence log-probability and semantic entropy measures to identify potential areas of Clinical Data Fabrication and Procedure Description Errors in large language models.
reference: Farquhar et al. (2024) proposed a semantic entropy-based method for hallucination detection that clusters model outputs by semantic meaning rather than surface-level differences, reducing the inflated uncertainty caused by rephrasings.
claim: High uncertainty in a large language model's outputs, indicated by low sequence probabilities or high semantic entropy, suggests the model is generating content without strong grounding in its training data, as noted by Asgari et al. (2024) and Vishwanath et al. (2024).
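The clustering idea described in the facts above can be sketched in a few lines: sample several answers to the same question, group them into meaning clusters, and compute the entropy of the cluster distribution. This is a minimal illustration, not the paper's implementation; in particular, the exact-match `meaning_key` below is a stand-in assumption for the bidirectional-entailment clustering used by Farquhar et al. (2024).

```python
import math
from collections import defaultdict

def semantic_entropy(samples, meaning_key=lambda s: s.strip().lower()):
    """Entropy over meaning clusters of sampled answers.

    `meaning_key` is a placeholder for semantic clustering: Farquhar et
    al. (2024) cluster by bidirectional entailment between answers;
    normalized exact match is a simplification for illustration only.
    """
    clusters = defaultdict(int)
    for s in samples:
        clusters[meaning_key(s)] += 1
    n = len(samples)
    probs = [count / n for count in clusters.values()]
    # Shannon entropy over the cluster distribution (natural log).
    return -sum(p * math.log(p) for p in probs)

# Rephrasings of one answer collapse to a single cluster: entropy 0.
low = semantic_entropy(["Paris", "paris", "Paris "])
# Conflicting answers spread over clusters: entropy rises toward log(k).
high = semantic_entropy(["Paris", "Lyon", "Marseille"])
```

Because rephrasings land in the same cluster, the measure stays low for a model that is merely verbose but consistent, and high only when the sampled meanings actually disagree.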
EdinburghNLP/awesome-hallucination-detection · github.com · GitHub · 3 facts
claim: Simple length-based heuristics can match or exceed the performance of sophisticated hallucination detectors like Semantic Entropy.
claim: Metrics used for hallucination detection include SelfCheckGPT, FactScore, EigenScore, Efficient EigenScore (EES), Semantic Entropy, Perplexity, HaluEval Accuracy, and ROUGE-1 (XSum).
measurement: The lightweight probe method for hallucination detection outperforms HaloScope and Semantic Entropy on 10 of 12 model–dataset combinations, achieving AUROC gains of up to 13 points.
Medical Hallucination in Foundation Models and Their Impact on ... · medrxiv.org · medRxiv, Nov 2, 2025 · 2 facts
claim: Uncertainty-based hallucination detection methods rely on either sequence log-probability or semantic entropy to quantify uncertainty.
claim: Sequence probability and semantic entropy are complementary hallucination detection signals: sequence log-probabilities provide a token-level uncertainty measure, while semantic entropy captures the stability of the underlying meaning.
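The complementarity claimed above can be made concrete with two toy scores. A length-normalized sequence log-probability measures token-level confidence within one generation, while a cluster-entropy score over repeated samples measures meaning-level stability; a model can score well on the first and poorly on the second. The exact-match clustering and all values below are illustrative assumptions, not the published methods.

```python
import math
from collections import Counter

def mean_token_logprob(token_logprobs):
    """Length-normalized sequence log-probability: token-level confidence
    for a single generation (closer to 0 means more confident)."""
    return sum(token_logprobs) / len(token_logprobs)

def cluster_entropy(answers):
    """Entropy over meaning clusters of repeated samples; normalized
    exact match stands in for semantic clustering (a simplification)."""
    counts = Counter(a.strip().lower() for a in answers)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# A model can be token-confident on each sample yet semantically
# unstable across samples, which is why the two signals complement
# each other (per-token log-probs here are made up for illustration):
conf = mean_token_logprob([-0.05, -0.10, -0.02])
inst = cluster_entropy(["1912", "1915", "1912", "1910"])
```

Here `conf` is near zero (high token-level confidence on one answer), yet `inst` is high because the sampled answers disagree in meaning, which is the pattern the complementarity claim describes.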
LLM Hallucination Detection and Mitigation: State of the Art in 2026 · zylos.ai · Zylos, Jan 27, 2026 · 1 fact
claim: Semantic entropy, PCC (Predictive Consistency Check), and mechanistic interpretability are considered cutting-edge advances in hallucination detection.
A framework to assess clinical safety and hallucination rates of LLMs ... · nature.com · Nature, May 13, 2025 · 1 fact
reference: Farquhar et al. (2024) proposed using semantic entropy as a method for detecting hallucinations in large language models, published in Nature.
Detecting hallucinations with LLM-as-a-judge: Prompt ... · datadoghq.com · Aritra Biswas, Noé Vernier, Datadog, Aug 25, 2025 · 1 fact
reference: Farquhar et al. (2024) published 'Detecting hallucinations in large language models using semantic entropy' in Nature.