concept

GraphRAG-Filtering

Facts (13)

Sources

Empowering GraphRAG with Knowledge Filtering and Integration (arXiv, arxiv.org, Mar 18, 2025): 13 facts

procedure: Stage 1 of GraphRAG-Filtering (Coarse Filtering) retains only those retrieved paths whose attention scores exceed a predefined threshold τ.
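
The coarse filtering step can be sketched as a simple threshold over attention scores. This is a minimal illustration, not the paper's implementation: the `ScoredPath` container, the `coarse_filter` function, the example paths, and the value of `tau` are all hypothetical, and "exceed" is read here as a strict comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredPath:
    """A retrieved path with its attention score (hypothetical container)."""
    path: str
    attention: float


def coarse_filter(paths: list[ScoredPath], tau: float) -> list[ScoredPath]:
    """Keep only the paths whose attention score strictly exceeds tau."""
    return [p for p in paths if p.attention > tau]


# Toy data: the scores and the threshold are made up for illustration.
retrieved = [
    ScoredPath("Q -> film.director -> A", 0.62),
    ScoredPath("Q -> film.genre -> B", 0.08),
    ScoredPath("Q -> film.producer -> C", 0.30),
]
kept = coarse_filter(retrieved, tau=0.25)
print([p.path for p in kept])
```

With these toy values, the low-attention genre path is dropped and the other two paths move on to the second stage.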

procedure: The GraphRAG-FI (Filtering & Integration) framework consists of two components: GraphRAG-Filtering, which uses a two-stage filtering mechanism to refine retrieved information, and GraphRAG-Integration, which uses a logits-based selection strategy to balance external knowledge with the large language model's intrinsic reasoning.

reference: The GraphRAG-FI framework consists of two core components: GraphRAG-Filtering, which removes irrelevant or misleading retrieved knowledge, and GraphRAG-Integration, which balances retrieved knowledge with the LLM's inherent reasoning ability to prevent the overuse of retrieved information.

measurement: On the CWQ dataset, adding the GraphRAG-Filtering component to the RoG retriever increases the F1 score by 4.19% and the Hit score by 3.15%.

procedure: The GraphRAG-Filtering prompt construction process incorporates the selected paths and the query into the prompt, separating the paths into 'High Priority Paths' (the final filtered paths) and 'Additional Paths' (paths that passed the coarse filter but were removed by the fine filter).
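
The prompt layout described above can be sketched as follows. Only the two section names come from the source; the `build_prompt` function, the bullet formatting, the `(none)` placeholder, and the `Question:` line are assumptions made for illustration.

```python
def build_prompt(query: str, high_priority: list[str], additional: list[str]) -> str:
    """Assemble a prompt with the two path groups followed by the query.

    The exact wording of the template is an assumption; only the two
    section names are taken from the source description.
    """

    def render(title: str, paths: list[str]) -> str:
        # Render one section; an empty group is shown explicitly.
        body = "\n".join(f"- {p}" for p in paths) if paths else "(none)"
        return f"{title}:\n{body}"

    return "\n\n".join([
        render("High Priority Paths", high_priority),
        render("Additional Paths", additional),
        f"Question: {query}",
    ])


prompt = build_prompt(
    query="Who directed the film?",
    high_priority=["Q -> film.director -> A"],
    additional=["Q -> film.producer -> C"],
)
print(prompt)
```

Keeping the two groups in separate, labeled sections lets the model weigh the final filtered paths more heavily while still seeing the paths that only passed the coarse filter.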

claim: The GraphRAG-Filtering framework conjectures that 'Additional Paths' (paths removed by the fine filter) offer useful supplementary context, even if they are less important than 'High Priority Paths'.

claim: The GraphRAG-FI framework, through its GraphRAG-Integration component, integrates the LLM's internal knowledge by attempting to fuse answers from both the standalone LLM and the GraphRAG method.

formula: The GraphRAG-Filtering method uses a two-stage filtering process to select retrieved paths or triplets for LLM prompts, where the set of retrieved paths is denoted as P and each path is assigned an attention score α.

formula: In GraphRAG-Filtering, the final set of filtered paths is determined by the formula P_final = {p ∈ P_coarse | s(p) ≥ τ_LLM}, where P_coarse is the set of paths passing coarse filtering, s(p) is the LLM evaluation score, and τ_LLM is a threshold determined by the LLM itself.
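
The set definition above translates directly into a partition of P_coarse. In this sketch the LLM evaluation s(p) is replaced by a hypothetical lookup table, and τ_LLM is passed in as a plain number, since the source does not say how the LLM chooses it; `fine_filter` and `toy_scores` are illustrative names.

```python
from typing import Callable


def fine_filter(
    coarse_paths: list[str],
    score: Callable[[str], float],
    tau_llm: float,
) -> tuple[list[str], list[str]]:
    """Split P_coarse into P_final (s(p) >= tau_llm) and the removed paths."""
    final: list[str] = []
    removed: list[str] = []
    for path in coarse_paths:
        # Non-strict comparison, matching s(p) >= tau_LLM in the formula.
        (final if score(path) >= tau_llm else removed).append(path)
    return final, removed


# Stand-in for the LLM evaluation score s(p); the values are made up.
toy_scores = {
    "Q -> film.director -> A": 0.9,
    "Q -> film.producer -> C": 0.4,
}
p_final, p_removed = fine_filter(list(toy_scores), toy_scores.__getitem__, tau_llm=0.5)
print(p_final, p_removed)
```

The second return value corresponds to the paths that passed the coarse filter but failed the fine filter, so a caller can still use them as supplementary context.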

procedure: The GraphRAG-FI framework, in its GraphRAG-Integration component, uses LLM logits to determine the relevance of answers produced by both the standalone LLM and the GraphRAG method, focusing on answers with higher confidence scores.
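
One way to turn logits into a confidence score is sketched below. The source does not specify how confidence is derived from the logits, so this example assumes a softmax over each step's logits and a geometric mean of the chosen-token probabilities; `token_prob`, `answer_confidence`, `select_answer`, and the toy logits are all hypothetical.

```python
import math


def token_prob(logits: list[float], token_id: int) -> float:
    """Softmax probability of one token, computed in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    return exps[token_id] / sum(exps)


def answer_confidence(step_logits: list[list[float]], token_ids: list[int]) -> float:
    """Geometric mean of the chosen-token probabilities (an assumed metric)."""
    log_probs = [math.log(token_prob(l, t)) for l, t in zip(step_logits, token_ids)]
    return math.exp(sum(log_probs) / len(log_probs))


def select_answer(confidences: dict[str, float]) -> str:
    """Return the name of the answer source with the highest confidence."""
    return max(confidences, key=lambda name: confidences[name])


# Toy single-token answers over a 3-token vocabulary; the logits are made up.
confidences = {
    "llm_only": answer_confidence([[2.0, 0.0, 0.0]], [0]),
    "graphrag": answer_confidence([[4.0, 0.0, 0.0]], [0]),
}
chosen = select_answer(confidences)
print(chosen, confidences)
```

In this toy case the GraphRAG answer has the sharper distribution and is selected; with the logits swapped, the standalone LLM answer would be kept instead.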

claim: The two-stage GraphRAG-Filtering approach enhances the quality of retrieved paths and reduces computational cost by limiting LLM usage to only those paths deemed promising in the first stage.
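
The cost argument can be made concrete by counting how often the expensive scorer is invoked. This is a toy sketch: `CountingScorer` stands in for an LLM call, and the paths, scores, and both thresholds are invented for illustration.

```python
class CountingScorer:
    """Stand-in for an LLM evaluator that records how many times it is called."""

    def __init__(self, scores: dict[str, float]) -> None:
        self.scores = scores
        self.calls = 0

    def __call__(self, path: str) -> float:
        self.calls += 1
        return self.scores[path]


def two_stage_filter(
    paths: list[tuple[str, float]],
    tau: float,
    score: CountingScorer,
    tau_llm: float,
) -> list[str]:
    """Apply the cheap attention threshold first, then score only the survivors."""
    coarse = [path for path, attention in paths if attention > tau]
    return [path for path in coarse if score(path) >= tau_llm]


# (path, attention score) pairs; all numbers are made up.
paths = [("p1", 0.9), ("p2", 0.7), ("p3", 0.1), ("p4", 0.05)]
scorer = CountingScorer({"p1": 0.8, "p2": 0.3, "p3": 0.9, "p4": 0.9})

final = two_stage_filter(paths, tau=0.5, score=scorer, tau_llm=0.5)
print(final, f"LLM calls: {scorer.calls} of {len(paths)} paths")
```

Here the scorer runs on two of the four paths. The trade-off is visible in `p3` and `p4`: a path that the coarse stage discards is never scored, even if the LLM would have rated it highly.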

procedure: GraphRAG-FI uses a two-stage filtering mechanism called GraphRAG-Filtering to refine retrieved information and a logits-based selection strategy called GraphRAG-Integration to balance retrieval and intrinsic reasoning.

procedure: Stage 2 of GraphRAG-Filtering (Fine Filtering) uses an LLM to evaluate precisely the subset of paths that passed the coarse filtering stage, further reducing the number of candidate paths.