AutoKnow
Facts (10)
Sources
Construction of Knowledge Graphs: State and Challenges (arXiv, arxiv.org): 10 facts
claim: The pipeline or toolset implementations for NELL and AI-KG, as well as five of the seven toolsets analyzed (including Amazon's AutoKnow and Apple's SAGA), are closed-source.
reference: The AutoKnow architecture consists of an ontology suite for taxonomy enrichment and relation discovery, and a data suite for processing input data.
reference: X. Dong et al. presented 'AutoKnow', a system for self-driving knowledge collection for thousands of product types, at the 2020 ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
procedure: The AutoKnow system extracts relations using classification models for attribute applicability and a regression model for attribute importance. Both are applied to product profiles and to user search, review, or Q&A data.
procedure: The AutoKnow system performs data imputation by extracting attribute-value pairs from product data with a taxonomy-aware tagging approach. The approach combines Conditional Random Fields (CRF) with multi-task learning: a shared BiLSTM is trained jointly on sequence tagging and product type categorization.
reference: AutoKnow is a closed-source system used by Amazon to create a retail product knowledge graph by processing product catalogs and consumer shopping behavior logs using machine learning and distant supervision.
procedure: The AutoKnow system performs taxonomy enrichment by extracting new types from input product catalogs and customer queries, then applying a Graph Neural Network (GNN) approach to place these new types into an existing ontology.
claim: The AutoKnow system derives most of its training and validation data from product catalogs or customer behavior logs by applying distant supervision and, in some cases, crowdsourcing via Amazon Mechanical Turk.
procedure: The AutoKnow system cleans data by checking extracted attribute-value pairs for correctness with a transformer-based neural network model. It finds synonyms using a supervised approach that combines collaborative filtering with a logistic regression model.
measurement: In an experimental execution, the AutoKnow system constructed a product graph at Amazon containing more than 30 million entities and 1 billion relations, assigned to 19,000 entity types and 1,000 relation types.
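Several of the facts above mention distant supervision as the source of training data for attribute-value extraction. AutoKnow itself is closed-source, so the following is only a minimal sketch of the general technique, not Amazon's implementation: structured catalog values are matched against the tokens of a product title to produce BIO tags that a sequence tagger could be trained on. The function name `distant_bio_tags`, the attribute names, and the example product are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def distant_bio_tags(title_tokens: List[str], catalog_attrs: Dict[str, str]) -> List[str]:
    """Label title tokens with BIO tags by matching known catalog values.

    This is plain string matching (case-insensitive, whitespace-tokenized).
    It is an illustrative assumption, not the matching logic used by AutoKnow.
    """
    tags = ["O"] * len(title_tokens)
    lowered = [tok.lower() for tok in title_tokens]

    for attr, value in catalog_attrs.items():
        value_tokens = value.lower().split()
        n = len(value_tokens)
        if n == 0:
            continue
        # Slide a window over the title and tag the first untagged exact match.
        for start in range(len(lowered) - n + 1):
            window_matches = lowered[start:start + n] == value_tokens
            span_is_free = all(t == "O" for t in tags[start:start + n])
            if window_matches and span_is_free:
                tags[start] = f"B-{attr}"
                for i in range(start + 1, start + n):
                    tags[i] = f"I-{attr}"
                break
    return tags


# Hypothetical catalog record used as the weak supervision signal.
example_title = "Acme Dark Roast Ground Coffee 12 oz".split()
example_attrs = {"brand": "Acme", "flavor": "dark roast", "size": "12 oz"}
example_tags = distant_bio_tags(example_title, example_attrs)

for token, tag in zip(example_title, example_tags):
    print(f"{token}\t{tag}")
```

Labels produced this way are noisy (catalog values can be missing or wrong, and exact matching misses paraphrases), which fits the note above that crowdsourcing is used in some cases alongside distant supervision.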