GraphRAG vs. VectorRAG: An Empirical Comparison of Retrieval Architectures for Knowledge-Intensive Tasks
Abstract
Retrieval-augmented generation (RAG) has become a standard approach for grounding large language model outputs in factual, updateable knowledge bases. Two dominant retrieval paradigms have emerged: vector-based retrieval (VectorRAG), which uses dense embedding similarity to identify relevant passages, and graph-based retrieval (GraphRAG), which traverses explicit knowledge graph structures to surface related entities and relationships. Although both approaches are widely adopted, rigorous head-to-head comparisons under controlled conditions remain scarce. This paper presents an empirical comparison of VectorRAG and GraphRAG across four knowledge-intensive task families (single-hop QA, multi-hop QA, entity disambiguation, and temporal reasoning) using three corpora of varying knowledge graph density. We evaluate five implementations: naive RAG with FAISS, HyDE-augmented retrieval, Microsoft's GraphRAG, NebulaGraph RAG, and a hybrid architecture combining both paradigms. Results show that GraphRAG outperforms VectorRAG by 12–23 percentage points on multi-hop and temporal tasks, but is only competitive with VectorRAG on single-hop factoid tasks while being substantially more expensive to construct and maintain. The hybrid architecture achieves the best overall performance across all task types at intermediate cost. We release all evaluation code and experimental logs to facilitate replication.
