measurement
Category C covers cases where the LLM-only model outperforms GraphRAG: the standalone LLM answers the query correctly, but adding GraphRAG leads to a wrong prediction. Evaluated by F1 score, Category C accounts for 16.89% of samples.
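A minimal sketch of how such a share could be computed is given here for illustration only. It assumes a SQuAD-style token-level F1 and treats a sample as Category C when the LLM-only answer scores strictly higher than the GraphRAG answer; the source does not specify either choice, and the names `token_f1` and `category_c_share` are hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer (assumed metric)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def category_c_share(samples) -> float:
    """Fraction of samples where the LLM-only answer beats the GraphRAG answer on F1.

    `samples` is an iterable of (reference, llm_only_answer, graphrag_answer) tuples.
    The strict-inequality criterion is an assumption, not the source's definition.
    """
    samples = list(samples)
    if not samples:
        return 0.0
    count = sum(
        1
        for reference, llm_only, graphrag in samples
        if token_f1(llm_only, reference) > token_f1(graphrag, reference)
    )
    return count / len(samples)
```

Called on a list of `(reference, llm_only_answer, graphrag_answer)` tuples, `category_c_share` returns a fraction between 0 and 1; a value of 0.1689 would correspond to the 16.89% reported above.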
