natural language inference (NLI)
Also known as: NLI, natural-language inference, Natural Language Inference (NLI) classifiers
Facts (23)
Sources
Daily Papers - Hugging Face huggingface.co Mar 20, 2026 5 facts
procedure: The 'decompose-and-formalise' framework addresses scaling and refinement issues in natural language inference by: (i) decomposing premise-hypothesis pairs into an entailment tree of atomic steps, (ii) verifying the tree bottom-up to isolate failures to specific nodes, and (iii) performing local diagnostic-guided refinement instead of regenerating the whole explanation.
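The bottom-up verification step can be sketched as a tree traversal that checks children before parents and returns the deepest failing node, so refinement stays local. This is a minimal illustration only: `Step`, `first_failure`, and the toy `prover` are hypothetical names, and the real framework delegates verification to a theorem prover rather than a string check.

```python
# Sketch of bottom-up verification of an entailment tree, localising the
# first failing atomic step for diagnostic-guided refinement.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Step:
    claim: str
    children: List["Step"] = field(default_factory=list)

def first_failure(node: Step, prover: Callable[[str], bool]) -> Optional[Step]:
    """Verify children before parents; return the deepest failing step."""
    for child in node.children:
        bad = first_failure(child, prover)
        if bad is not None:
            return bad              # failure already localised below
    return None if prover(node.claim) else node

# Toy "prover": rejects any step carrying a formalisation-mismatch marker.
prover = lambda claim: "MISMATCH" not in claim

tree = Step("hypothesis", [
    Step("atomic step 1"),
    Step("atomic step 2 MISMATCH"),  # the single local error
])
bad = first_failure(tree, prover)
print(bad.claim)  # only this node needs refinement, not the whole tree
```

Because a single local mismatch invalidates the proof, returning the failing node (instead of a global pass/fail) is what makes local regeneration possible.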
claim: The 'decompose-and-formalise' framework reduces refinement iterations and runtime while preserving strong natural language inference accuracy.
claim: Current methods for natural language inference often handle failures via costly global regeneration because it is difficult to localise the responsible span or step from prover diagnostics.
claim: Scaling refinement of natural language inference to naturalistic inputs is difficult because long, syntactically rich inputs and deep multi-step arguments amplify autoformalisation errors, where a single local mismatch can invalidate the proof.
claim: Integrating large language models with theorem provers in neuro-symbolic pipelines assists with entailment verification and proof-guided refinement of explanations for natural language inference.
Medical Hallucination in Foundation Models and Their ... medrxiv.org Mar 3, 2025 3 facts
claim: Natural Language Inference (NLI) classifiers can be fine-tuned on medical literature and clinical guidelines to improve hallucination detection in medical AI systems.
procedure: Consistency Analysis uses Natural Language Inference (NLI) and Question-Answer Consistency techniques to detect Decision-Making Hallucinations and Diagnostic Hallucinations in Clinical Decision Support Systems (CDSS) and Electronic Health Record (EHR) Management.
claim: Entailment-based methods for summary consistency use natural language inference to determine whether each sentence in a summary is logically entailed by the source.
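The sentence-level entailment check can be sketched as: split the summary into sentences, score each against the source, and report the unsupported ones. `nli_entails` below is a stand-in for a real NLI classifier (e.g. an MNLI-trained model); the substring heuristic is purely for illustration.

```python
# Sketch of entailment-based summary consistency checking: every summary
# sentence must be entailed by the source document.

def nli_entails(premise: str, hypothesis: str) -> bool:
    # Toy stand-in for an NLI model's entailment decision (illustration only).
    return hypothesis.lower().rstrip(".") in premise.lower()

def unsupported_sentences(source: str, summary: str) -> list:
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    return [s for s in sentences if not nli_entails(source, s + ".")]

source = "The trial enrolled 120 patients. The drug reduced symptoms."
summary = "The drug reduced symptoms. The drug cured all patients."
print(unsupported_sentences(source, summary))
```

In practice each (source, sentence) pair is fed to the NLI model as a premise-hypothesis pair, and any sentence not labelled "entailment" is treated as a potential hallucination.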
A Knowledge-Graph Based LLM Hallucination Evaluation Framework themoonlight.io 2 facts
procedure: The GraphEval framework detects hallucinations by using a pretrained Natural Language Inference (NLI) model to compare each triple in the constructed Knowledge Graph against the original context, flagging a triple as a hallucination if the NLI model predicts inconsistency with a probability score greater than 0.5.
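The per-triple check described above can be sketched as: verbalise each (subject, relation, object) triple, score it against the context with an NLI model, and flag it when the inconsistency probability exceeds 0.5. `nli_contradiction_prob` is a hypothetical stand-in for the pretrained NLI model, and the scoring heuristic is for illustration only.

```python
# Sketch of GraphEval-style triple checking against the source context.

def nli_contradiction_prob(context: str, statement: str) -> float:
    # Toy scorer: high inconsistency probability when the statement's
    # final token never appears in the context (illustration only).
    tail = statement.split()[-1].rstrip(".").lower()
    return 0.1 if tail in context.lower() else 0.9

def flag_hallucinated_triples(context, triples, threshold=0.5):
    flagged = []
    for subj, rel, obj in triples:
        statement = f"{subj} {rel} {obj}."       # verbalise the triple
        if nli_contradiction_prob(context, statement) > threshold:
            flagged.append((subj, rel, obj))
    return flagged

context = "Marie Curie won the Nobel Prize in Physics and in Chemistry."
triples = [("Marie Curie", "won", "Nobel Prize in Chemistry"),
           ("Marie Curie", "born in", "Berlin")]
print(flag_hallucinated_triples(context, triples))
```

Checking triples rather than whole responses is the point of the design: a single unsupported fact is caught even when the rest of the answer is consistent with the context.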
claim: GraphEval improves balanced accuracy in hallucination detection when used with various Natural Language Inference (NLI) models.
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... arxiv.org Feb 23, 2026 2 facts
procedure: The NLI Entailment Filter in the KGHaluBench pipeline utilizes a Natural Language Inference (NLI) model to classify the relationship between a reformatted fact and an LLM's response as entailment, contradiction, or neutral.
reference: The FEVER benchmark, introduced by Thorne et al. in 2018, utilizes a Natural Language Inference (NLI) model to evaluate whether a response contains, contradicts, or does not mention the provided evidence.
EdinburghNLP/awesome-hallucination-detection - GitHub github.com 2 facts
claim: Hallucination detection metrics measure either the degree of hallucination in generated responses relative to given knowledge or their overlap with gold faithful responses; examples include Critic, Q² (F1, NLI), BERTScore, F1, BLEU, and ROUGE.
reference: Natural Language Inference (NLI)-based evaluation metrics for hallucination detection operate on the principle that a faithful, hallucination-free generation must be entirely entailed by the source knowledge reference.
New tool, dataset help detect hallucinations in large language models amazon.science 1 fact
claim: RefChecker categorizes claims into three types based on their relationship to reference texts: entailments (supported), contradictions (refuted), and neutral (insufficient evidence). This aligns with the support, refute, and not-enough-information categories used in natural-language inference (NLI).
Re-evaluating Hallucination Detection in LLMs - arXiv arxiv.org Aug 13, 2025 1 fact
reference: Laban et al. (2022) introduced 'SummaC', a model that revisits Natural Language Inference (NLI) based models for detecting inconsistency in summarization tasks.
Survey and analysis of hallucinations in large language models frontiersin.org Sep 29, 2025 1 fact
reference: Maynez et al. (2020) proposed factuality scoring based on semantic entailment or natural language inference (NLI) to detect hallucinations.
Unlocking the Potential of Generative AI through Neuro-Symbolic ... arxiv.org Feb 16, 2025 1 fact
reference: Zeming Chen, Qiyue Gao, and Lawrence S. Moss developed NeuralLog, a system for natural language inference using joint neural and logical reasoning, published as an arXiv preprint in 2021.
Automating hallucination detection with chain-of-thought reasoning amazon.science 1 fact
claim: HalluMeasure classifies LLM hallucinations using a novel set of error types derived from linguistic patterns, which goes beyond binary classification or standard natural-language-inference (NLI) categories like support, refute, and not enough information.
Medical Hallucination in Foundation Models and Their Impact on ... medrxiv.org Nov 2, 2025 1 fact
claim: Entailment-based methods use natural language inference to determine whether each sentence in a summary is logically entailed by the source text.
A survey on augmenting knowledge graphs (KGs) with large ... link.springer.com Nov 4, 2024 1 fact
claim: XNLI (Cross-lingual Natural Language Inference) is a benchmark for evaluating cross-lingual language understanding that tests the ability of models to perform natural language inference across multiple languages.
Construction of intelligent decision support systems through ... - Nature nature.com Oct 10, 2025 1 fact
claim: The framework proposed in the Nature article couples knowledge representation with natural language inference.
A Knowledge-Graph Based LLM Hallucination Evaluation Framework arxiv.org Jul 15, 2024 1 fact
measurement: Using GraphEval in conjunction with state-of-the-art natural language inference (NLI) models improves balanced accuracy on various hallucination benchmarks compared to using raw NLI models alone.