concept

data quality

Also known as: data quality standards

Facts (19)

Sources
Construction of Knowledge Graphs: State and Challenges - arXiv arxiv.org arXiv 6 facts
claim: Ryen et al. found that data quality appears to be a major blind spot in knowledge graph creation approaches.
reference: R.Y. Wang and D.M. Strong defined data quality from the perspective of data consumers in their 1996 paper, 'Beyond Accuracy: What Data Quality Means to Data Consumers', published in the Journal of Management Information Systems.
reference: Ryen et al. conducted a systematic literature review on knowledge graph creation approaches based on Semantic Web technologies, surveying 36 approaches across five construction steps: ontology development, data preprocessing, data integration, quality and refinement, and data publication.
claim: Data quality problems should be addressed during the import process to prevent the ingestion of low-quality or incorrect data into a knowledge graph.
claim: Quality assurance in knowledge graph construction is a cross-cutting topic that addresses ontological consistency, data quality of entities and relations (comprehensiveness), and domain coverage.
claim: There is a need for more comprehensive data quality measures and repair strategies that minimize human intervention to retain scalability in knowledge graph construction.
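The claim above that quality problems should be caught during import can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not something taken from the cited paper: the `Triple` type, the `ALLOWED_PREDICATES` set, and the two checks (non-empty fields, known predicate) stand in for whatever rules a real pipeline would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """Hypothetical (subject, predicate, object) record awaiting import."""
    subject: str
    predicate: str
    obj: str


# Hypothetical schema: the only predicates this example graph accepts.
ALLOWED_PREDICATES = {"born_in", "works_for"}


def validate(triple):
    """Return a list of quality problems; an empty list means the triple passes."""
    problems = []
    if not triple.subject.strip():
        problems.append("empty subject")
    if triple.predicate not in ALLOWED_PREDICATES:
        problems.append(f"unknown predicate: {triple.predicate}")
    if not triple.obj.strip():
        problems.append("empty object")
    return problems


def import_triples(candidates):
    """Split candidates into accepted triples and rejected (triple, problems) pairs."""
    accepted, rejected = [], []
    for triple in candidates:
        problems = validate(triple)
        if problems:
            rejected.append((triple, problems))
        else:
            accepted.append(triple)
    return accepted, rejected
```

Rejected triples are returned together with their reasons rather than dropped silently, so they can be repaired or reviewed later without blocking the rest of the import.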
What are the challenges in maintaining a knowledge graph? - Milvus milvus.io Milvus 3 facts
claim: Maintaining a knowledge graph requires addressing a multifaceted set of challenges, specifically data quality, scalability, semantic complexity, and security.
claim: Organizations can harness the full potential of their knowledge graphs to drive informed decision-making and innovation by understanding and proactively managing challenges related to data quality, scalability, semantic complexity, and security.
claim: Maintaining data quality and consistency is a primary challenge in knowledge graph management because integrating data from multiple sources often leads to discrepancies, variations in data formats, missing information, or conflicting data points.
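The integration problems named above (format variations, missing information, conflicting data points) can be sketched as a small check over records from several sources. This is an illustrative sketch only: the record shape `(entity_id, attributes)`, the `normalize` rule, and the function name `find_issues` are assumptions, not part of the cited source.

```python
from collections import defaultdict


def normalize(value):
    """Reduce format variation: collapse whitespace and ignore letter case."""
    return " ".join(str(value).split()).casefold()


def find_issues(records, required_fields):
    """Report conflicting and missing values across records about the same entity.

    records: iterable of (entity_id, attributes) pairs, one per source record.
    Returns (conflicts, missing), where conflicts holds
    (entity_id, field, sorted distinct values) and missing holds (entity_id, field).
    """
    seen = defaultdict(lambda: defaultdict(set))
    for entity_id, attrs in records:
        seen[entity_id]  # register the entity even if all its fields are empty
        for field, value in attrs.items():
            if value is None or str(value).strip() == "":
                continue  # treat blank values as missing
            seen[entity_id][field].add(normalize(value))

    conflicts, missing = [], []
    for entity_id, fields in seen.items():
        for field, values in fields.items():
            if len(values) > 1:
                conflicts.append((entity_id, field, sorted(values)))
        for field in required_fields:
            if field not in fields:
                missing.append((entity_id, field))
    return conflicts, missing
```

Normalizing before comparison keeps pure formatting differences, such as extra spaces or capitalization, from being reported as conflicts; only values that still differ afterwards are flagged.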
Medical Hallucination in Foundation Models and Their ... medrxiv.org medRxiv Mar 3, 2025 3 facts
claim: Data quality and curation practices influence hallucination rates in AI systems, particularly when generating patient summaries, according to a 2021 study.
reference: The FDA published Good Machine Learning Practice (GMLP) guidance to address challenges in AI/ML-enabled medical devices, specifically covering data quality, algorithm validation, and performance monitoring.
claim: Enhancing data quality and curation is critical for reducing hallucinations in AI models because inaccuracies or inconsistencies in training data can propagate errors in model outputs.
On Hallucinations in Artificial Intelligence–Generated Content ... jnm.snmjournals.org The Journal of Nuclear Medicine 2 facts
claim: Effective mitigation of AI hallucinations in Nuclear Medicine Imaging (NMI) requires a comprehensive approach that encompasses data quality, learning paradigms, and model design.
claim: Systematic data cleaning during preprocessing can reduce inconsistencies and improve data fidelity to mitigate hallucinations, although defining objective criteria for data quality standards remains a complex challenge.
Why Do Large Language Models Hallucinate? | AWS Builder Center builder.aws.com AWS May 13, 2025 1 fact
claim: Large Language Model (LLM) hallucinations are caused by three primary factors: data quality issues, model training methodologies, and architectural limitations.
How Enterprise AI, powered by Knowledge Graphs, is ... blog.metaphacts.com metaphacts Oct 7, 2025 1 fact
measurement: Gartner estimates that poor data quality costs organizations at least $12.9 million per year.
Combining Knowledge Graphs With LLMs | Complete Guide - Atlan atlan.com Atlan Jan 28, 2026 1 fact
quote: Joe DosSantos, VP of Enterprise Data and Analytics at Workday, stated: "Atlan enabled us to easily activate metadata for everything from discovery in the marketplace to AI governance to data quality to an MCP server delivering context to AI models. All of the work that we did to get to a shared language amongst people at Workday can be leveraged by AI via Atlan's MCP server."
Practices, opportunities and challenges in the fusion of knowledge ... frontiersin.org Frontiers 1 fact
reference: The book 'Improving Data Quality: Consistency and Accuracy' discusses methods and principles for enhancing data quality, specifically focusing on consistency and accuracy.
Measurement of diets that are healthy, environmentally sustainable ... frontiersin.org Frontiers 1 fact
claim: Peer-reviewed publication constraints limit the ability of authors to elaborate on data quality, model integration methods, and indicator selection choices.