Relations (1)

related · 7.32 · strongly supporting · 159 facts

Hallucination is a well-documented phenomenon where Large Language Models generate false or nonsensical information that appears plausible [1], [2], [3]. It is considered a primary obstacle to the adoption of these models [4], leading to extensive research into detection metrics [5], [6], [7] and mitigation strategies such as Retrieval-Augmented Generation and knowledge graph integration [8], [9].

Facts (159)

Sources
Survey and analysis of hallucinations in large language models · frontiersin.org · Frontiers · 25 facts
reference: Wu et al. (2023) introduced 'HallucinationEval,' a unified framework designed for evaluating hallucinations in large language models.
claim: Hallucinations in Large Language Models are categorized into two primary sources: prompting-induced hallucinations caused by ill-structured or misleading prompts, and model-internal hallucinations caused by architecture, pretraining data distribution, or inference behavior.
formula: Hallucination events in Large Language Models can be modeled probabilistically: with H a hallucination event, P the prompting strategy, and M the model characteristics, Bayes' rule gives P(P, M | H) = P(H | P, M) · P(P, M) / P(H).
claim: The paper 'Survey and analysis of hallucinations in large language models: attribution to prompting strategies or model behavior' was published in Frontiers in Artificial Intelligence on September 30, 2025, by authors Anh-Hoang D, Tran V, and Nguyen L-M.
claim: Intrinsic factors within model architecture, training data quality, and sampling algorithms significantly contribute to hallucination problems in large language models.
procedure: Quantifying hallucinations in large language models involves using targeted metrics such as accuracy-based evaluations on question-answering tasks, entropy-based measures of semantic coherence, and consistency checking against external knowledge bases.
claim: Hallucinations in Large Language Models negatively impact the reliability and efficiency of AI systems in high-impact domains such as medicine (Lee et al., 2023), law (Bommarito and Katz, 2022), journalism (Andrews et al., 2023), and scientific communication (Nakano et al., 2021; Liu et al., 2023).
claim: Hallucinations in large language models arise from both prompt-dependent and model-intrinsic factors, requiring tailored mitigation approaches.
procedure: Mitigation strategies for large language model hallucinations at the modeling level include Reinforcement Learning from Human Feedback (RLHF) (Ouyang et al., 2022), retrieval fusion (Lewis et al., 2020), and instruction tuning (Wang et al., 2022).
claim: Chain-of-Thought prompting and instruction-based inputs are effective for mitigating hallucinations in Large Language Models but are insufficient in isolation.
claim: Attribution-based metrics, specifically PS and MV, provide a novel method for classifying and addressing the sources of hallucinations in large language models.
procedure: Mitigation strategies for large language model hallucinations at the prompting level include prompt calibration, system message design, and output verification loops.
claim: Hallucination in Large Language Models refers to outputs that appear fluent and coherent but are factually incorrect, logically inconsistent, or entirely fabricated.
claim: The attribution framework categorizes hallucinations in Large Language Models into four types: prompt-dominant, model-dominant, mixed-origin, or unclassified.
claim: Some hallucinations in Large Language Models persist regardless of prompting structure, indicating inherent model biases or training artifacts, as observed in the DeepSeek model.
claim: Hallucinations in Large Language Models (LLMs) are categorized along two dimensions: prompt-level issues and model-level behaviors.
claim: Mitigation strategies for hallucinations in large language models fall into two types: prompt-based interventions and model-based architectural or training improvements.
reference: HallucinationEval (Wu et al., 2023) provides a framework for measuring different types of hallucinations in large language models.
claim: Hallucinations in Large Language Models create risks of misinformation, reduced user trust, and accountability gaps (Bommasani et al., 2021; Weidinger et al., 2022).
reference: RealToxicityPrompts (Gehman et al., 2020) is a benchmark used to investigate how large language models hallucinate toxic or inappropriate content.
claim: Hallucination in large language models is linked to pretraining biases and architectural limits, according to research by Kadavath et al. (2022), Bang and Madotto (2023), and Chen et al. (2023).
perspective: Mitigation of hallucinations in Large Language Models requires multi-layered, attribution-aware pipelines, as no single approach can entirely eliminate the phenomenon.
claim: Grounded pretraining reduces hallucination during generation in large language models, though it requires significant data and compute resources.
claim: Hallucinations in Large Language Models occur when the probabilistic model incorrectly favors a hallucinatory output (y_halluc) over a factually correct response (y_fact), reflecting a mismatch between the model's internal probability distributions and real-world factual distributions.
claim: There is currently no widely accepted metric or dataset that fully captures the multidimensional nature of hallucinations in Large Language Models.
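The two probabilistic facts above can be written out explicitly. This is a sketch in the survey's own notation (H, P, M, y_halluc, y_fact); the model distribution p_θ and the prompt x are notation assumed here for readability, not taken from the source:

```latex
% Attribution of a hallucination event H to prompting strategy P and
% model characteristics M, via Bayes' rule:
P(P, M \mid H) = \frac{P(H \mid P, M)\, P(P, M)}{P(H)}

% Hallucination as a ranking failure: the model's distribution p_\theta
% places more probability mass on a hallucinated output than on the
% factually correct one for the same prompt x:
p_\theta(y_{\mathrm{halluc}} \mid x) > p_\theta(y_{\mathrm{fact}} \mid x)
```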
Hallucination Causes: Why Language Models Fabricate Facts · mbrenndoerfer.com · M. Brenndoerfer · 21 facts
claim: Large language models tend to produce hallucinations that are fluent, internally consistent, and superficially plausible, which makes them dangerous for users unable to independently verify the claims.
claim: Scaling up large language models increases the fluency and coherence of generated text, which makes hallucinations more convincing and harder to detect.
claim: Evaluating large language models for hallucinations separately from general capabilities is essential, and metrics should account for the deceptiveness of errors rather than just their frequency to capture practical risk.
claim: The properties that make large language models useful (fluent, coherent, and confident generation) are the same properties that make their hallucinations more harmful.
claim: Large language models that are better at following instructions and producing fluent prose may hallucinate at rates similar to those of simpler models on tail entities, but produce more convincing hallucinations.
claim: When large language models are asked about obscure entities, they often generate plausible-sounding facts based on the types of information typically associated with that entity category, even though the specific facts are not grounded in actual knowledge.
claim: The frequency with which an entity is mentioned in training documents is a less accurate predictor of hallucination risk than the frequency with which specific facts about that entity are stated, verified, and contextualized.
claim: The temperature parameter in large language models scales the logit distribution before sampling; higher values flatten the distribution and increase hallucination risk, while lower values sharpen the distribution toward the most probable tokens.
claim: Hallucination in large language models is a structural issue originating from how training data is collected, how the optimization objective is constructed, the limitations of what knowledge the model can represent, and how the generation process converts probability distributions into words.
claim: The interaction of hallucination causes in large language models is sensitive to model scale in non-intuitive ways.
claim: Hallucination rates in large language models are not uniform across a response, tending to cluster in the later sections of long responses rather than appearing uniformly throughout.
claim: The top_k parameter limits the number of candidate tokens at each generation step in large language models, and lower values reduce but do not eliminate hallucination risk.
claim: Hallucination in large language models is a structural consequence of how models are trained and how they generate text, rather than a random failure mode.
claim: Hallucinations involving common facts in large language models involve contradicting a strong, highly consistent statistical pattern, whereas hallucinations involving obscure facts involve filling a gap in a weak statistical pattern.
claim: Exposure bias is a cause of hallucination in large language models that arises from a mismatch between training efficiency and inference realism.
claim: Exposure bias in large language models does not require the model to lack the correct answer; rather, hallucinations arise because an error changes the input distribution, activating incorrect associations despite the model potentially possessing reliable knowledge.
claim: The max_new_tokens parameter controls sequence length in large language models, and longer generations face higher cumulative exposure-bias divergence, which increases hallucination risk as the sequence grows.
claim: The causes of hallucinations in large language models interact and amplify each other.
claim: Hallucinating common facts in large language models represents a different failure mode than hallucinating obscure facts, such as the publication year of a niche scientific paper.
claim: The generation process in large language models introduces pressure to favor fluent hallucination over honest uncertainty because the process is a sequence of probability distributions where the model must select a token at each step, and the model lacks a mechanism to output 'I don't know'.
claim: Large language models exhibit a 3% floor of irreducible hallucination even at high training frequencies, which is caused by exposure bias, completion pressure, and conflicting signals in training data.
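The decoding claims above (temperature rescales logits before the softmax; top_k truncates the candidate set) can be made concrete with a minimal sketch. The parameter names mirror common inference APIs, and the logit values below are invented for illustration:

```python
import math

def sample_distribution(logits, temperature=1.0, top_k=None):
    """Turn raw logits into a sampling distribution.

    temperature divides the logits before the softmax: values below 1.0
    sharpen the distribution toward the top token, values above 1.0
    flatten it. top_k keeps only the k highest-scoring candidates and
    zeroes out the rest.
    """
    scaled = [x / temperature for x in logits]
    if top_k is not None:
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]
cool = sample_distribution(logits, temperature=0.5)
hot = sample_distribution(logits, temperature=2.0)
# Lower temperature concentrates mass on the top token; higher
# temperature flattens the distribution across candidates.
assert cool[0] > hot[0]
```

Lowering the temperature visibly concentrates probability on the top candidate, which is why conservative decoding settings are a common first line of defense against sampling-induced hallucination.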
The Role of Hallucinations in Large Language Models · cloudthat.com · CloudThat · 9 facts
claim: Large language models generate hallucinations when they produce outputs that are fictitious, incorrect despite sounding plausible, or inconsistent with the input prompt or grounding data.
claim: Token pressure causes large language models to hallucinate because, when forced to generate long or elaborate responses, the model may invent details to maintain fluency and coherence.
claim: Large language models hallucinate because they are trained to predict the next token based on statistical patterns in language rather than to verify facts.
claim: Prompt ambiguity causes large language models to hallucinate because vague or poorly structured prompts provide unclear instructions or lack constraints.
claim: A lack of grounding causes large language models to hallucinate because, without external data sources, models rely solely on learned knowledge and may fabricate content when asked about obscure or domain-specific topics.
claim: Over-generalization causes large language models to hallucinate because models compress vast knowledge into parameters, which can lead to the loss or inaccurate approximation of nuance and detail.
claim: Hallucinations in large language models can serve as a creative asset in contexts such as creative writing, brainstorming, roleplaying, prototype generation, and art or music creation.
claim: Techniques such as Retrieval-Augmented Generation (RAG), fact-checking pipelines, and improved prompting can significantly reduce, though not completely prevent, hallucinations in large language models.
claim: Hallucinations in large language models pose risks in high-stakes domains, such as misdiagnosing conditions in healthcare, fabricating legal precedents, generating fake market data in finance, and providing incorrect facts in education.
Medical Hallucination in Foundation Models and Their ... · medrxiv.org · medRxiv · 8 facts
claim: Medical text contains ambiguous abbreviations, such as 'BP,' which can refer to either 'blood pressure' or 'biopsy,' leading to potential misinterpretations and hallucinations in Large Language Models.
claim: Svenstrup et al. (2015) observe that Large Language Models often lack exposure to rare diseases during training, which leads to hallucinations when the models generate diagnostic insights.
claim: The authors surveyed clinicians to gain insights into how medical professionals perceive and experience hallucinations when using Large Language Models for practice or research.
claim: The integration of knowledge graphs into Large Language Models helps mitigate hallucinations, which are instances where models generate plausible but incorrect information, according to Lavrinovics et al. (2024).
claim: Hallucinations in Large Language Models occur when the model generates outputs that are unsupported by factual knowledge or the input context.
claim: Hallucinations in Large Language Models occur when models generate outputs that sound plausible but lack logical coherence.
claim: Hallucination or confabulation in Large Language Models is a concern across various domains, including finance, legal, code generation, and education.
measurement: The study evaluated hallucination rates and clinical risk severity for five Large Language Models: o1, gemini-2.0-flash-exp, gpt-4o, gemini-1.5-flash, and claude-3.5-sonnet.
Re-evaluating Hallucination Detection in LLMs · arxiv.org · arXiv · 6 facts
claim: The Std-Len metric is effective at identifying hallucinations in Large Language Models because response length variability is a key indicator of hallucination.
reference: Li et al. (2023) created 'HaluEval', a large-scale benchmark for evaluating hallucinations in Large Language Models.
claim: Response length alone serves as a powerful signal for detecting hallucinations in Large Language Models.
claim: Hallucinations in Large Language Models are considered inevitable according to research by Xu et al. (2024).
claim: Unsupervised methods for detecting hallucinations in large language models estimate uncertainty using token-level confidence from single generations, sequence-level variance across multiple samples, or hidden-state pattern analysis.
reference: Ziwei Xu, Sanjay Jain, and Mohan Kankanhalli argued in their 2024 paper 'Hallucination is Inevitable: An Innate Limitation of Large Language Models' that hallucinations are an inherent limitation of large language models.
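The sequence-level-variance idea above can be sketched as a simple self-consistency check: sample the model several times at nonzero temperature and measure how concentrated the answers are. The answer strings below are invented, and a real system would normalize answers and draw them from an actual model:

```python
from collections import Counter

def self_consistency_score(samples):
    """Unsupervised hallucination signal from agreement across samples.

    Returns the fraction of samples that agree with the most common
    answer: near 1.0 means the model answers consistently, while a
    value near 1/len(samples) means it is effectively guessing, which
    correlates with hallucination.
    """
    counts = Counter(samples)
    top_fraction = counts.most_common(1)[0][1] / len(samples)
    return top_fraction

# A stable answer across resamples scores 1.0:
assert self_consistency_score(["paris"] * 5) == 1.0
# Scattered answers to the same question score low:
assert self_consistency_score(["1997", "2003", "1989", "2001", "1997"]) == 0.4
```

This is the cheapest member of the family the fact above describes; token-level confidence and hidden-state analysis need access to model internals, whereas this check only needs repeated generations.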
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... · arxiv.org · arXiv · 5 facts
reference: KG-FPQ is a framework for evaluating factuality hallucination in large language models using knowledge graph-based false premise questions.
claim: The paper 'Hallucination is Inevitable: An Innate Limitation of Large Language Models' asserts that hallucination is an innate limitation of large language models.
reference: 'Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models' is a survey paper on hallucination in large language models.
reference: The paper 'Why Language Models Hallucinate' investigates the causes of hallucinations in large language models.
claim: The authors conducted an experiment using 25 open-source and proprietary LLMs to identify factors in LLM knowledge that may cause hallucinations.
[Literature Review] MedHallu: A Comprehensive Benchmark for ... · themoonlight.io · The Moonlight · 4 facts
claim: The MedHallu benchmark provides a framework for evaluating hallucination prevalence and detection capabilities in medical applications of large language models, emphasizing the need for human oversight for patient safety.
claim: Harder-to-detect hallucinations are semantically closer to the ground truth, which causes large language models to struggle more with identifying subtly incorrect information.
claim: The MedHallu benchmark defines hallucination in large language models as instances where a model produces information that is plausible but factually incorrect.
claim: The MedHallu study observes that detection difficulty varies by hallucination type, with 'Incomplete Information' being identified as a particularly challenging category for large language models.
Practices, opportunities and challenges in the fusion of knowledge ... · frontiersin.org · Frontiers · 4 facts
reference: Zhang et al. (2024b) conducted experiments on six main Large Language Models using the CoderEval dataset to analyze the distribution and nature of hallucination phenomena.
claim: Large Language Models (LLMs) frequently struggle to retrieve facts accurately, leading to the phenomenon known as hallucination, where models generate responses that sound plausible but are factually incorrect.
claim: Large language models suffer from a lack of explicit knowledge structure leading to hallucinations, high computational and data intensity, limited interpretability, difficulty with complex multi-step logic, and potential for bias and ethical concerns.
claim: MindMap, ChatRule, and CoK externalize structured knowledge or human-defined rules into prompt representations, which enables large language models to reason over complex graph-based scenarios with improved contextual grounding and reduced hallucinations.
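The externalization strategy above (structured knowledge verbalized into the prompt so the model answers from supplied facts rather than parametric memory) can be sketched minimally. The triple format, relation names, and example facts here are assumptions for illustration, not the implementation of any of the systems named above:

```python
def triples_to_context(triples):
    """Serialize knowledge-graph triples into grounding text for a prompt.

    Each (subject, relation, object) triple is verbalized as one line;
    the resulting block is prepended to the user question so the model
    can cite the supplied facts instead of fabricating them.
    """
    lines = [f"- {s} {r.replace('_', ' ')} {o}." for s, r, o in triples]
    return "Known facts:\n" + "\n".join(lines)

# Hypothetical medical triples, purely illustrative:
kg = [("aspirin", "treats", "headache"),
      ("aspirin", "interacts_with", "warfarin")]
context = triples_to_context(kg)
assert context.startswith("Known facts:")
assert "aspirin interacts with warfarin." in context
```

In practice the triples would be retrieved from a real knowledge graph for the entities in the question, and the prompt would instruct the model to answer only from the listed facts.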
Awesome-Hallucination-Detection-and-Mitigation · github.com · GitHub · 4 facts
reference: The paper "MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations" by Lavrinovics et al. (2025) presents a multilingual dataset designed for evaluating hallucinations in large language models using knowledge graphs.
reference: The paper 'Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?' by Gekhman et al. (2024) examines the relationship between fine-tuning on new knowledge and hallucination rates.
reference: The paper 'The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination' by Zhang et al. (2025) explores the phenomenon of knowledge overshadowing in relation to LLM hallucinations.
reference: The paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models" by Ye et al. (2023) reviews the phenomenon of hallucinations in large language models.
Combining Knowledge Graphs and Large Language Models · arxiv.org · arXiv · 4 facts
claim: Large Language Models tend to generate inaccurate or nonsensical information, known as hallucinations, and often lack interpretability in their decision-making processes.
claim: Large language models (LLMs) exhibit limitations such as hallucinations and a lack of domain-specific knowledge, which can negatively impact their performance in real-world tasks.
claim: Incorporating knowledge graphs into large language models can mitigate issues like hallucinations and lack of domain-specific knowledge because knowledge graphs organize information in structured formats that capture relationships between entities.
claim: Using large language models to automate the construction of knowledge graphs carries the risk of hallucination or the production of incorrect results, which compromises the accuracy and validity of the knowledge graph data.
A framework to assess clinical safety and hallucination rates of LLMs ... · nature.com · Nature · 4 facts
reference: Farquhar et al. (2024) proposed using semantic entropy as a method for detecting hallucinations in large language models, published in Nature.
claim: Recent research has established that hallucination may be an intrinsic, theoretical property of all large language models.
claim: The study on LLM clinical note generation supports the theory that hallucinations and omissions may be intrinsic theoretical properties of current Large Language Models.
reference: Huang, L. et al. authored 'A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions', published in 2024 (arXiv:2311.05232).
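The semantic-entropy detector cited above (Farquhar et al., 2024) can be illustrated with a toy version: group sampled answers into clusters of equivalent meaning, then compute the entropy of the cluster distribution. The `same_meaning` default below is exact string match, an assumption for illustration; the published method uses a bidirectional-entailment model to decide cluster membership:

```python
import math

def semantic_entropy(samples, same_meaning=lambda a, b: a == b):
    """Entropy over meaning clusters of sampled answers.

    Zero entropy means every sample expresses the same meaning
    (high confidence); entropy grows as answers scatter across
    distinct meanings, flagging likely hallucination.
    """
    clusters = []
    for s in samples:
        for c in clusters:
            if same_meaning(s, c[0]):   # join an existing meaning cluster
                c.append(s)
                break
        else:
            clusters.append([s])        # start a new cluster
    n = len(samples)
    probs = [len(c) / n for c in clusters]
    return -sum(p * math.log(p) for p in probs)

# Consistent answers give zero entropy; scattered answers give high entropy.
assert semantic_entropy(["insulin"] * 4) == 0.0
assert semantic_entropy(["a", "b", "c", "d"]) > 1.0
```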
Building Trustworthy NeuroSymbolic AI Systems · arxiv.org · arXiv · 4 facts
claim: Zhang et al. (2023) identified reliability in LLMs by examining tendencies regarding hallucination, truthfulness, factuality, honesty, calibration, robustness, and interpretability.
claim: Large Language Models struggle to establish connections between symptoms like 'sleep deprivation' and 'drowsiness' with 'hallucinations' in conversational scenarios.
claim: When prompted to include information about 'Xanax', Large Language Models often apologize and attempt to correct their responses, but these corrections frequently lack essential information, such as the various types of hallucinations associated with the drug.
reference: Zhang et al. (2023) authored the paper titled 'Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models', published as arXiv:2309.01219.
A survey on augmenting knowledge graphs (KGs) with large ... · link.springer.com · Springer · 3 facts
claim: Alignment tuning and tool utilization can help alleviate the issue of hallucination in Large Language Models.
claim: The use of semantic layers in LLMs improves model interpretability by providing structured context, which reduces hallucinations and enhances the reliability of model responses.
reference: Agrawal G, Kumarage T, Alghami Z, and Liu H authored the survey 'Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey', published as an arXiv preprint in 2023 (arXiv:2311.07914).
Hallucinations in LLMs: Can You Even Measure the Problem? · linkedin.com · Sewak, Ph.D. · LinkedIn · 3 facts
claim: Large Language Models (LLMs) generate responses based on probabilities derived from their training data, and hallucinations emerge when this training data is noisy, sparse, or contradictory.
claim: Attention matrix analysis evaluates hallucination in Large Language Models by checking if the attention patterns used to determine input importance are logical.
claim: Hallucinations in Large Language Models (LLMs) occur when models generate content that is not grounded in reality or the input provided, such as fabricating facts, inventing relationships, or concocting non-existent information.
Reducing hallucinations in large language models with custom ... · aws.amazon.com · Amazon Web Services · 3 facts
claim: Hallucinations in LLMs arise from the inherent limitations of the language modeling approach, which prioritizes the generation of fluent and contextually appropriate text without ensuring factual accuracy.
claim: Hallucinations in large language models (LLMs) are defined as outputs that are plausible but factually incorrect or made-up.
claim: Unchecked hallucinations in LLMs can undermine system reliability and trustworthiness, leading to potential harm or legal liabilities in domains such as healthcare, finance, or legal applications.
A Survey on the Theory and Mechanism of Large Language Models · arxiv.org · arXiv · 3 facts
claim: Large Language Models exhibit emergent phenomena not found in smaller models, including hallucination, in-context learning (ICL), scaling laws, and sudden 'aha moments' during training.
claim: The research paper 'Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations' (arXiv:2504.12691) investigates the causes of hallucinations in large language models by analyzing subsequence associations.
claim: One study (2025b) identified three types of uncertainty in Large Language Models: document scarcity, limited capability, and query ambiguity, noting that current models struggle to identify the root cause of these uncertainties, which contributes to hallucination.
Enterprise AI Requires the Fusion of LLM and Knowledge Graph · linkedin.com · Jacob Seric · LinkedIn · 3 facts
claim: Advarra identifies hallucination, prompt sensitivity, and limited explainability as unique risks associated with the use of Large Language Models (LLMs) that require governance and oversight to promote safety and confidence in the industry.
claim: Large language models (LLMs) present unique risks including hallucination, prompt sensitivity, and limited explainability, which require governance and oversight.
claim: Large language models (LLMs) require grounding in reality to provide mission-critical insights without hallucinations at scale.
Integrating Knowledge Graphs into RAG-Based LLMs to Improve ... · thesis.unipd.it · Università degli Studi di Padova · 3 facts
claim: Large Language Models (LLMs) have a tendency to produce inaccurate or unsupported information, a problem known as 'hallucination'.
claim: Large Language Models (LLMs) frequently produce inaccurate or unsupported information, a phenomenon commonly referred to as 'hallucination'.
claim: Large Language Models (LLMs) have a tendency to produce inaccurate or unsupported information, a problem known as hallucination.
Empowering RAG Using Knowledge Graphs: KG+RAG = G-RAG · neurons-lab.com · Neurons Lab · 2 facts
claim: Large language models face a challenge known as hallucination, where the model generates plausible but incorrect or nonsensical information.
claim: Knowledge Graphs help mitigate the hallucination problem in LLMs by enabling the extraction and presentation of precise factual information, such as specific contact details, which are difficult to retrieve through standard LLMs.
LLM Hallucinations: Causes, Consequences, Prevention · llmmodels.org · 2 facts
claim: Large Language Models (LLMs) are AI systems capable of generating human-like text, but they are susceptible to producing outputs that lack factual accuracy or coherence, a phenomenon known as hallucinations.
claim: Strategies to mitigate hallucinations in large language models include using high-quality training data, employing contrastive learning, implementing human oversight, and utilizing uncertainty estimation.
Unknown source · 2 facts
claim: Evaluating hallucination in large language models is a complex task.
claim: Large language models can produce hallucinations even when provided with well-organized prompts.
A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... · aclanthology.org · Alex Robertson, Huizhi Liang, Mahbub Gani, Rohit Kumar, Srijith Rajamohan · Association for Computational Linguistics · 2 facts
claim: Existing benchmarks for evaluating Large Language Models are limited by static and narrow questions, which leads to limited coverage and misleading evaluations.
claim: Large Language Models possess a capacity to generate persuasive and intelligible language, but coherence does not equate to truthfulness, as responses often contain subtle hallucinations.
EdinburghNLP/awesome-hallucination-detection · github.com · GitHub · 2 facts
claim: Integrative grounding is a task requiring Large Language Models to retrieve and verify multiple interdependent pieces of evidence for complex queries, which often results in the model hallucinating rationalizations using internal knowledge when external information is incomplete.
claim: HaluEval is a collection of generated and human-annotated hallucinated samples used for evaluating the performance of large language models in recognizing hallucinations.
Medical Hallucination in Foundation Models and Their Impact on ... · medrxiv.org · medRxiv · 2 facts
claim: The hallucination of patient information by LLMs is similar to physician confirmation bias, where contradictory symptoms are overlooked, leading to inappropriate diagnosis and treatment.
claim: Hallucinations in Large Language Models (LLMs) are documented across multiple domains, including finance, legal, code generation, and education.
LLM-Powered Knowledge Graphs for Enterprise Intelligence and ... · arxiv.org · arXiv · 1 fact
claim: Integrating large language models and knowledge graphs in enterprise contexts faces four key challenges: hallucination of inaccurate facts or relationships, data privacy and security concerns, computational overhead of running extraction at scale, and ontology mismatch when merging different knowledge sources.
A self-correcting Agentic Graph RAG for clinical decision support in ... · pmc.ncbi.nlm.nih.gov · PMC · 1 fact
claim: Retrieval-Augmented Generation (RAG) is a method used to make Large Language Models less prone to hallucinating by grounding their output in retrieved data.
Hallucination is still one of the biggest blockers for LLM adoption. At ... · facebook.com · Datadog · 1 fact
claim: Hallucination is considered one of the primary obstacles preventing the widespread adoption of Large Language Models.
Enhancing LLMs with Knowledge Graphs: A Case Study · linkedin.com · LinkedIn · 1 fact
claim: Integrating Large Language Models with enterprise data and domain-specific knowledge reduces the risk of hallucination in the model's output.
What Really Causes Hallucinations in LLMs? · aiexpjourney.substack.com · AI Innovations and Insights · 1 fact
claim: Hallucinations in large language models are defined as false but plausible-sounding responses generated by the model.
Enterprise AI Requires the Fusion of LLM and Knowledge Graph · stardog.com · Stardog · 1 fact
claim: Using domain-specific ontologies as Parameter-Efficient Fine-Tuning (PEFT) input for Large Language Models improves accuracy and reduces the frequency of hallucinations.
LLM Hallucination Detection and Mitigation: State of the Art in 2026 · zylos.ai · Zylos · 1 fact
claim: Complete elimination of hallucinations in LLMs is currently limited because hallucinations are tied to the model's creativity, and total elimination would compromise useful generation capabilities.
RAG Hallucinations: Retrieval Success ≠ Generation Accuracy · linkedin.com · Sumit Umbardand · LinkedIn · 1 fact
claim: Large Language Models generate confident answers even when retrieved context is irrelevant, which introduces hallucinations into production RAG systems.
Detecting and Evaluating Medical Hallucinations in Large Vision ... · arxiv.org · arXiv · 1 fact
claim: Large Vision Language Models (LVLMs) inherit susceptibility to hallucinations from Large Language Models (LLMs), which poses significant risks in high-stakes medical contexts.
Daily Papers · huggingface.co · Hugging Face · 1 fact
claim: Large language models often struggle with hallucination problems, particularly in scenarios that require deep and responsible reasoning.
KGHaluBench: A Knowledge Graph-Based Hallucination ... · researchgate.net · ResearchGate · 1 fact
claim: KGHaluBench is a Knowledge Graph-based hallucination benchmark designed to evaluate Large Language Models.
Large Language Models Meet Knowledge Graphs for Question ... · arxiv.org · arXiv · 1 fact
claim: Leveraging Knowledge Graphs to augment Large Language Models can help overcome challenges such as hallucinations, limited reasoning capabilities, and knowledge conflicts in complex Question Answering scenarios.
LLM Knowledge Graph: Merging AI with Structured Data · puppygraph.com · PuppyGraph · 1 fact
claim: Large Language Models (LLMs) possess significant capabilities in language generation and synthesis but suffer from factual inaccuracy (hallucination) and a lack of transparency when relying solely on their internal knowledge base.
[2509.04664] Why Language Models Hallucinate · arxiv.org · arXiv · 1 fact
claim: Large language models hallucinate because current training and evaluation procedures reward guessing over acknowledging uncertainty.
Benchmarking Hallucination Detection Methods in RAG · cleanlab.ai · Cleanlab · 1 fact
claim: Large Language Models (LLMs) are prone to hallucination because they are fundamentally brittle machine learning models that may fail to generate accurate responses even when the retrieved context contains the correct answer, particularly when reasoning across different facts is required.
MedHallu: Benchmark for Medical LLM Hallucination Detection · emergentmind.com · Emergent Mind · 1 fact
claim: Semantically similar hallucinations that are near the truth are the hardest for LLMs to detect.
Beyond the Black Box: How Knowledge Graphs Make LLMs Smarter ... · medium.com · Vi Ha · Medium · 1 fact
claim: The combination of Large Language Models (LLMs) and Knowledge Graphs (KGs) can be utilized to reduce hallucinations in artificial intelligence applications.
A knowledge-graph based LLM hallucination evaluation framework · amazon.science · Amazon Science · 1 fact
claim: The GraphEval framework identifies hallucinations in Large Language Models by utilizing Knowledge Graph structures to represent information.
New tool, dataset help detect hallucinations in large language models · amazon.science · Amazon Science · 1 fact
claim: Large language models have a tendency to hallucinate, which is defined as making assertions that sound plausible but are factually inaccurate.
KG-IRAG: A Knowledge Graph-Based Iterative Retrieval-Augmented ... · arxiv.org · arXiv · 1 fact
claim: Hallucination in Large Language Models (LLMs) is defined as content generated by the model that is not present in the retrieved ground truth, as cited in Ji et al. (2023), Li et al. (2024), and Perković et al. (2024).
Applying Large Language Models in Knowledge Graph-based ... · arxiv.org · Benedikt Reitemeyer, Hans-Georg Fill · arXiv · 1 fact
claim: Luo et al. argue that Large Language Models are skilled at reasoning in complex tasks but struggle with up-to-date knowledge and hallucinations, which negatively impact performance and trustworthiness.
Phare LLM Benchmark: an analysis of hallucination in ... · giskard.ai · Giskard · 1 fact
claim: Hallucination in large language models is deceptive because responses that sound authoritative can mislead users who lack the expertise to identify factual errors.
How Enterprise AI, powered by Knowledge Graphs, is ... · blog.metaphacts.com · metaphacts · 1 fact
claim: In an enterprise context, hallucinations in large language models represent an unacceptable operational and legal risk because business decisions can affect millions in revenue.
Are you hallucinated? Insights into large language models · sciencedirect.com · ScienceDirect · 1 fact
claim: Hallucinations in large language models are the logical consequence of the transformer architecture's essential mathematical operation, known as the self-attention mechanism.
LLM-KG4QA: Large Language Models and Knowledge Graphs for ... · github.com · GitHub · 1 fact
reference: The paper 'Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective' was published in the Journal of Web Semantics in 2025.
KG-RAG: Bridging the Gap Between Knowledge and Creativity · arxiv.org · arXiv · 1 fact
claim: Large Language Models are prone to generating factually incorrect information ('hallucinations'), struggle with processing extended contexts, and suffer from catastrophic forgetting, where previously learned knowledge is lost during new training.
Neuro-symbolic AI · en.wikipedia.org · Wikipedia · 1 fact
claim: In 2025, the adoption of neuro-symbolic AI increased as a response to the need to address hallucination issues in large language models.
Detect hallucinations in your RAG LLM applications with Datadog ... · datadoghq.com · Barry Eom, Aritra Biswas · Datadog · 1 fact
claim: Hallucinations in large language models occur when the model confidently generates information that is false or unsupported by the provided data.
Construction of intelligent decision support systems through ... · nature.com · Nature · 1 fact
claim: Large language models deployed in business settings face significant limitations, including hallucinating information, struggling with domain expertise, and failing to justify their reasoning.
A Knowledge-Graph Based LLM Hallucination Evaluation Framework · researchgate.net · ResearchGate · 1 fact
claim: Large Language Models (LLMs) generate responses that can contain inconsistencies, which are referred to as hallucinations.
The Synergy of Symbolic and Connectionist AI in LLM-Empowered ... · arxiv.org · arXiv · 1 fact
claim: Large Language Models face 'hallucination' challenges, defined as the production of false or nonsensical information that appears convincing but is inaccurate or not based on reality.
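The incentive argument above can be checked with simple expected values. The grading scheme below (1 point for a correct answer, 0 for a wrong answer or an abstention) is the standard 0/1 accuracy setup that the paper critiques, and the numbers are illustrative:

```python
def expected_score(p_correct, guesses, reward_correct=1.0, penalty_wrong=0.0):
    """Expected benchmark score of always guessing vs. abstaining.

    Under plain 0/1 accuracy (penalty_wrong = 0), guessing with any
    nonzero chance of being right strictly beats answering 'I don't
    know', so training against such benchmarks rewards confident
    fabrication over acknowledged uncertainty.
    """
    if guesses:
        return p_correct * reward_correct - (1 - p_correct) * penalty_wrong
    return 0.0  # abstaining earns nothing under 0/1 grading

# With only a 10% chance of guessing right and no penalty, guessing wins:
assert expected_score(0.1, guesses=True) > expected_score(0.1, guesses=False)
# A sufficiently large wrong-answer penalty flips the incentive:
assert expected_score(0.1, guesses=True, penalty_wrong=0.2) < 0.0
```

Scoring rules that penalize confident wrong answers, or give partial credit for calibrated abstention, remove the guessing advantage that the claim above identifies.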