concept

entity resolution

Also known as: ER, deduplication, entity matching, link discovery

Facts (56)

Sources

Construction of Knowledge Graphs: State and Challenges - arXiv arxiv.org arXiv 51 facts

referenceE. Ioannou, E. Thanos, and T. Palpanas authored 'The Four Generations of Entity Resolution', published by Morgan & Claypool Publishers in 2021 as part of the Synthesis Lectures on Data Management.

claimThe blocking phase in entity resolution aims to drastically reduce the number of entity pairs to evaluate by partitioning data so that only entities within the same partition are compared.
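
A minimal sketch of the blocking idea, assuming toy records and a deliberately crude blocking key (the first letter of the name). The record fields, `block_by_key`, and `candidate_pairs` are hypothetical names chosen for illustration, not part of any system described here.

```python
from collections import defaultdict
from itertools import combinations


def block_by_key(records, key_fn):
    """Partition records into blocks that share the same blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key_fn(record)].append(record)
    return dict(blocks)


def candidate_pairs(blocks):
    """Yield only the pairs of record ids that fall into the same block."""
    for members in blocks.values():
        for left, right in combinations(members, 2):
            yield left["id"], right["id"]


# Hypothetical toy records; the key (first letter of the name) is an assumption.
records = [
    {"id": 1, "name": "Miles Davis"},
    {"id": 2, "name": "Miles Dewey Davis"},
    {"id": 3, "name": "John Coltrane"},
    {"id": 4, "name": "J. Coltrane"},
]
blocks = block_by_key(records, key_fn=lambda r: r["name"][0].lower())
pairs = sorted(candidate_pairs(blocks))  # 2 candidate pairs instead of all 6
```

A key this coarse keeps likely duplicates together in the toy data, but real blocking keys trade off missed matches against the number of comparisons that remain.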

referenceW. M., S. A., and D. C. authored 'Fast and accurate incremental entity resolution relative to an entity knowledge base', presented at the CIKM conference in 2012.

claimProperty matching is essential for entity resolution and entity fusion, as it helps determine matching entities based on property similarity and allows for the combination of equivalent properties to avoid redundancy.

procedureThe entity resolution process for the unnamed artist-focused knowledge graph uses artist name and birthdate similarities and MinHash/LSH blocking to ensure scalability.
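
The source does not say how the MinHash/LSH blocking is implemented. The following is a generic, standard-library sketch of the technique, assuming that name and birthdate are concatenated into one string and split into character 3-grams; the parameters (`num_hashes`, `bands`) and all function names are illustrative assumptions.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of character k-grams of a normalized string."""
    text = text.lower().strip()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def lsh_blocks(records, num_hashes=32, bands=16):
    """Group record ids whose signatures agree on at least one band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for rec_id, text in records.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets[key].add(rec_id)
    # Keep each multi-record group once, even if it occurs in several bands.
    unique = {frozenset(ids) for ids in buckets.values() if len(ids) > 1}
    return [set(ids) for ids in unique]


# Hypothetical artist strings: name plus birthdate (an assumed encoding).
records = {
    "a": "Miles Davis 1926-05-26",
    "b": "MILES DAVIS 1926-05-26",
    "c": "John Coltrane 1926-09-23",
}
blocks = lsh_blocks(records)
```

Records with the same normalized string always share a block; for near-duplicates the chance of sharing a band grows with the Jaccard similarity of their shingle sets, so a match is likely but not guaranteed.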

claimStreaming-like data ingestion into a knowledge graph requires support for dynamic, real-time matching of new entities with existing knowledge graph entities.

claimThe HKGB knowledge graph construction solution provides a description of its entity resolution (ER) process that is too vague to allow for a definitive assessment of its capabilities.

claimInvestigating the application of entity resolution techniques designed for dirty data sources to entity linking tasks represents a potential research opportunity.

claimThe entity linking component of knowledge extraction can render an additional entity resolution step unnecessary in knowledge graph construction.

claimWhile NLP tasks have benefited from reusable implementations like Stanford CoreNLP, other knowledge graph construction tasks, such as entity resolution, currently lack similar modular, reusable implementations.

claimEntity fusion is the process of combining matching entities to enrich information about an entity in a uniform way, following the entity resolution step.
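
As a rough illustration of fusion, the sketch below merges the property values of entities that have already been matched. The union-of-values strategy, the function name `fuse_entities`, and the sample records are assumptions; real fusion strategies may instead pick one preferred value per property or track provenance.

```python
from collections import defaultdict


def fuse_entities(matched_entities):
    """Merge property values of matching entities into one record.

    Each property keeps the distinct values seen across all inputs,
    in first-seen order, so no information is dropped and exact
    duplicates are stored once.
    """
    fused = defaultdict(list)
    for entity in matched_entities:
        for prop, value in entity.items():
            if value is not None and value not in fused[prop]:
                fused[prop].append(value)
    return dict(fused)


# Hypothetical descriptions of the same artist from two sources.
source_a = {"name": "Miles Davis", "born": "1926-05-26", "genre": "jazz"}
source_b = {"name": "Miles Dewey Davis", "born": "1926-05-26", "instrument": "trumpet"}
fused = fuse_entities([source_a, source_b])
```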

referenceT. Brasileiro Araújo, K. Stefanidis, C.E. Santos Pires, J. Nummenmaa, and T. Pereira da Nóbrega authored 'Incremental blocking for entity resolution over web streaming data', presented at the IEEE/WIC/ACM International Conference on Web Intelligence in 2019.

claimBenchmark datasets exist for specific subtasks of knowledge graph construction, such as entity resolution (e.g., Gollum) and knowledge completion (e.g., CoDEx).

claimBlocking for incremental or streaming entity resolution requires identifying a subset of existing knowledge graph entities for matching to ensure efficiency, as knowledge graphs are typically large and growing.

claimBlocking and entity resolution processes should be limited to entities of the same or similar types, often using attribute-based blocking keys like birth year or manufacturer.

claimEntity linking and entity resolution are sometimes collectively referred to as 'entity canonicalization' because both processes aim to connect the same entities within and across data sources.

claimKnowledge graph-specific approaches have limitations regarding scalability to many sources, support for incremental updates, metadata management, ontology management, entity resolution and fusion, and quality assurance.

claimKnowledge graph construction requires incremental approaches that can build upon previous match decisions to determine if new entities are already represented in the knowledge graph or should be added as new entities.
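
One way to picture the incremental setting is a resolver that keeps its earlier decisions and checks each new entity only against existing entities in the same block. This is a toy sketch: the class `IncrementalResolver`, the three-letter blocking key, the `difflib`-based similarity, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


class IncrementalResolver:
    """Toy incremental matcher that keeps earlier match decisions."""

    def __init__(self, threshold=0.85):
        self.threshold = threshold
        self.entities = {}      # entity id -> list of names merged so far
        self.block_index = {}   # blocking key -> set of entity ids
        self._next_id = 0

    @staticmethod
    def _key(name):
        # Assumed blocking key: first three letters of the normalized name.
        return name.lower().replace(" ", "")[:3]

    def add(self, name):
        """Return the id of the matching entity, or of a newly added one."""
        key = self._key(name)
        best_id, best_score = None, 0.0
        # Compare only against existing entities in the same block.
        for entity_id in self.block_index.get(key, ()):
            for known in self.entities[entity_id]:
                score = SequenceMatcher(None, name.lower(), known.lower()).ratio()
                if score > best_score:
                    best_id, best_score = entity_id, score
        if best_id is not None and best_score >= self.threshold:
            self.entities[best_id].append(name)  # reuse the earlier decision
            return best_id
        new_id = self._next_id
        self._next_id += 1
        self.entities[new_id] = [name]
        self.block_index.setdefault(key, set()).add(new_id)
        return new_id


resolver = IncrementalResolver()
first = resolver.add("John Coltrane")
second = resolver.add("John Coltrain")   # close enough to merge with the first
third = resolver.add("Johnny Cash")      # same block, but added as a new entity
```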

claimThe authors of the paper 'Construction of Knowledge Graphs: State and Challenges' focus their requirements analysis on the knowledge graph construction process and on tasks for structured input data, such as entity resolution, rather than focusing solely on the knowledge graph itself.

claimEntity resolution is challenging due to the often limited quality and high heterogeneity of different entity descriptions.

procedureThe SAGA system performs deduplication in three steps: it groups entities by type and uses simple blocking to partition the data into smaller buckets, applies a matching model that computes similarity scores using machine-learning or rule-based methods, and finally uses correlation clustering to determine matching entities.

claimRecent approaches to entity resolution for knowledge graphs utilize multi-source big data techniques, deep learning, or knowledge graph embeddings.

claimMost existing entity resolution approaches for knowledge graphs are designed for static or batch-like processing where matches are determined within or between datasets of a fixed size.

claimEntity resolution approaches often assume clean or deduplicated data sources, which limits the ability to match unmatched entities to already matched entities.

claimSemi-automatic ontology development tasks overlap significantly with methods used in knowledge extraction, entity resolution, quality assurance, and knowledge completion.

claimData integration and canonicalization in knowledge graphs involve entity linking, entity resolution, entity fusion, and the matching and merging of ontology concepts and properties.

claimEntity resolution is computationally expensive because the number of comparisons between entities typically grows quadratically with the total number of entities.
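
The quadratic growth can be made concrete with a little arithmetic: n entities give n(n - 1)/2 unordered pairs. The sketch below also shows the effect of restricting comparisons to same-block pairs, assuming (purely for illustration) one million entities split into 1,000 equally sized blocks.

```python
def pair_count(n):
    """Number of unordered pairs among n entities: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def blocked_pair_count(block_sizes):
    """Comparisons needed when only same-block pairs are evaluated."""
    return sum(pair_count(size) for size in block_sizes)


all_pairs = pair_count(1_000_000)               # 499_999_500_000
blocked = blocked_pair_count([1_000] * 1_000)   # 499_500_000
```

In this idealized case the blocked workload is roughly a thousandth of the exhaustive one; real blocks are rarely this evenly sized.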

claimWhen external knowledge does not align during the SLOGERT knowledge graph integration process, the SILK framework is used for entity resolution.

claimSummarization techniques speed up computation in entity resolution by dividing large blocks into sub-blocks with representatives, allowing for a constant number of comparisons per new record.

procedureThe standard approach for entity resolution uses a pipeline of three successive phases: blocking, linking/matching, and clustering.

referenceB. Ramadan, P. Christen, H. Liang, and R.W. Gayler published 'Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution' in the Journal of Data and Information Quality in 2015.

claimThe matching phase in entity resolution determines the similarity between pairs of entities, often resulting in a similarity graph where nodes represent entities and edges link similar pairs.
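
A similarity graph of this kind can be sketched as an adjacency map. The example assumes name-only entities, the standard library's `difflib.SequenceMatcher` as the similarity measure, and a threshold of 0.8; none of these choices come from the source.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_graph(entities, threshold=0.8):
    """Build {node: {neighbor: score}} with an edge for each similar pair."""
    graph = {entity_id: {} for entity_id in entities}
    for (id_a, name_a), (id_b, name_b) in combinations(entities.items(), 2):
        score = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
        if score >= threshold:
            graph[id_a][id_b] = score
            graph[id_b][id_a] = score
    return graph


# Hypothetical entities keyed by id.
entities = {"e1": "John Coltrane", "e2": "John Coltrain", "e3": "Miles Davis"}
graph = similarity_graph(entities)
```

Here `e1` and `e2` end up connected, while `e3` stays an isolated node. In practice the pairs would come from a blocking step rather than from all combinations.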

claimWhile entity resolution typically operates on semi-structured data, deep learning-based approaches have been developed to address entity resolution in unstructured data sources.

claimEntity resolution, also known as entity matching, deduplication, or link discovery, is the task of identifying entities in one or more sources that represent the same real-world object.

claimEntity linking in knowledge graphs is performed using various methods, including dictionary-based approaches relying on gathered synonyms in AI-KG, human interaction in XI, or entity resolution in HKGB.

claimEntity Resolution and Fusion is the process of identifying matching entities and merging them within a knowledge graph.

claimThe clustering phase in entity resolution is an optional step that uses the similarity graph to group together all matches, providing a more holistic perspective on entity similarities compared to myopic pairwise matching.
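
The simplest way to group matches from a similarity graph is to take its connected components, sketched below with a small union-find. This is a stand-in chosen for brevity, not the clustering method of any system mentioned here (SAGA, for example, is described as using correlation clustering); the node ids and edges are made up.

```python
def connected_components(edges, nodes):
    """Group nodes into clusters using union-find over match edges."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical match edges from a pairwise matching step.
nodes = ["e1", "e2", "e3", "e4", "e5"]
edges = [("e1", "e2"), ("e2", "e3"), ("e4", "e5")]
clusters = connected_components(edges, nodes)
```

Note that `e1` and `e3` land in the same cluster without a direct edge between them, which is the kind of transitive evidence that pairwise matching alone does not use. Connected components can also over-merge through a single wrong edge, which is one reason more robust clustering methods exist.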

claimOpen knowledge graph-specific approaches currently face limitations in scalability to many sources, support for incremental updates, and several technical areas including metadata management, ontology management, entity resolution/fusion, and quality assurance.

claimSchema-agnostic blocking approaches for incremental or streaming entity resolution are a recent development compared to those for non-incremental entity resolution.

claimNeural methods for entity resolution in knowledge graphs have recently faced increased scrutiny following a period of significant hype.

claimKnowledge graph pipelines that employ entity resolution often use sophisticated methods such as blocking to address scalability issues, as seen in ArtistKG and SAGA, or machine-learning-based matchers, as seen in SAGA.

claimEntity resolution is supported by only a few knowledge graph construction approaches.

referenceL. Gazzarri and M. Herschel authored 'End-to-end Task Based Parallelization for Entity Resolution on Dynamic Data', presented at the 2021 IEEE 37th International Conference on Data Engineering (ICDE).

referenceThe paper 'Blocking and filtering techniques for entity resolution: A survey' by G. Papadakis, D. Skoutas, E. Thanos, and T. Palpanas was published in ACM Computing Surveys (CSUR), volume 53, issue 2.

claimThe SLOGERT knowledge graph construction solution suggests that entity resolution might be necessary in some cases, but recommends that this process be performed using an external tool.

claimPairwise matching for entity resolution is based on the combined similarity between two entities, which is derived from their property values or related entities.
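
A combined similarity over property values might look like the weighted average below. The weights, the property names, and the use of `difflib.SequenceMatcher` as the per-property measure are assumptions for illustration; similarity derived from related entities is not covered by this sketch.

```python
from difflib import SequenceMatcher


def string_similarity(left, right):
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, left.lower(), right.lower()).ratio()


def combined_similarity(entity_a, entity_b, weights):
    """Weighted average of per-property similarities.

    Properties missing from either entity are skipped, and the
    remaining weights are renormalized.
    """
    total, weight_sum = 0.0, 0.0
    for prop, weight in weights.items():
        if prop in entity_a and prop in entity_b:
            total += weight * string_similarity(entity_a[prop], entity_b[prop])
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Hypothetical weights and entity descriptions.
weights = {"name": 0.7, "born": 0.3}
a = {"name": "John Coltrane", "born": "1926-09-23"}
b = {"name": "John Coltrain", "born": "1926-09-23"}
score = combined_similarity(a, b, weights)
```

A match decision would then compare `score` against a threshold, or feed it to a learned classifier together with other features.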

referenceThe paper 'JedAI3: beyond batch, blocking-based Entity Resolution' by G. Papadakis, L. Tsekouras, E. Thanos, N. Pittaras, G. Simonini, D. Skoutas, P. Isaris, G. Giannakopoulos, T. Palpanas, and M. Koubarakis was published in the EDBT proceedings.

claimQuality improvement for knowledge graphs includes data cleaning, error correction, outlier detection, entity resolution, data fusion, and continuous ontology development.

referenceB. Ramadan and P. Christen authored 'Forest-Based Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution', presented at the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14) in 2014.

procedureDuplicate detection, schema matching, and entity resolution are techniques used to identify and resolve inconsistencies, redundancies, and format errors in knowledge graphs.

claimExisting benchmarks for knowledge graph construction are currently limited to individual tasks such as knowledge extraction, ontology matching, entity resolution, and knowledge graph completion.

KG-RAG: Bridging the Gap Between Knowledge and Creativity - arXiv arxiv.org arXiv May 20, 2024 2 facts

claimFuture research could improve the quality and reliability of the knowledge graphs used by CoE by integrating advanced methods such as entity resolution (Binette et al., 2022) and entity linking (Shen et al., 2021).

Unlock the Power of Knowledge Graphs and LLMs - TopQuadrant topquadrant.com Steve Hedden · TopQuadrant 1 fact

claimLarge language models enable faster knowledge graph creation and curation by performing entity resolution, automated tagging of unstructured data, and entity and class extraction.

Knowledge Graphs vs RAG: When to Use Each for AI in 2026 - Atlan atlan.com Atlan Feb 12, 2026 1 fact

claimKnowledge graph maintenance requires schema governance and entity resolution, whereas RAG system maintenance requires document refreshing and embedding updates.

LLM-empowered knowledge graph construction: A survey - arXiv arxiv.org arXiv Oct 23, 2025 1 fact

claimInstance-level fusion in knowledge graphs aims to reconcile heterogeneous or redundant entities through entity alignment, disambiguation, deduplication, and conflict resolution to maintain a coherent and semantically precise graph.