Main article

Jason K. Lim
School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
Chloe S. Wong*
Information Systems Technology and Design Pillar, Singapore University of Technology and Design, 8 Somapah Road, Singapore 487372, Singapore
chloe.wong@sutd.edu.sg

DOI: https://doi.org/10.63646/datamind.2024.020106

Abstract

Retrieval-augmented generation (RAG) systems increasingly issue hybrid queries that combine approximate nearest-neighbor (ANN) search over learned embeddings with selective relational predicates over structured attributes. Although individual vector indexes and relational operators are each well studied, the engineering problem of executing them together under a single latency budget remains poorly characterized, and existing ANN benchmarks ignore attribute filtering altogether. We present VR-Bench, a reproducible benchmark for vector-relational hybrid queries. VR-Bench isolates a concrete and consequential engineering failure that we call the selectivity valley: pre-filtering wins when predicates are highly selective, post-filtering wins when they are permissive, and both degrade sharply in the intermediate selectivity band that dominates real workloads. We formalize the hybrid query, specify a system architecture and relational schema, and define five workload classes, five datasets spanning 8M to 100M vectors, and a metric suite covering recall, throughput, tail latency, build cost, and memory. We evaluate four execution strategies (pre-filter, post-filter, block/filtered-graph, and a cost-based router) across five production systems. The router recovers 92 to 98 percent of the best achievable throughput across the full selectivity range while holding recall at the 0.95 target, whereas post-filtering loses up to 27 recall points on highly selective queries. Every result ships with pinned containers, seeded workloads, an oracle, a data dictionary, and a public artifact, so that each figure in this paper can be regenerated end to end with a single command.
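The contrast between pre-filtering and post-filtering described in the abstract can be illustrated with a minimal sketch. This is not the VR-Bench implementation. It assumes exact brute-force search as a stand-in for an ANN index, a synthetic table with a single numeric attribute, and a fixed over-fetch factor for post-filtering. All names (`pre_filter`, `post_filter`, `recall_at`, `overfetch`) are hypothetical and chosen for illustration only.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pre_filter(rows, query, predicate, k):
    # Apply the relational predicate first, then rank only the survivors.
    survivors = [r for r in rows if predicate(r)]
    survivors.sort(key=lambda r: sq_dist(r["vec"], query))
    return [r["id"] for r in survivors[:k]]

def post_filter(rows, query, predicate, k, overfetch=4):
    # Fetch the k * overfetch nearest rows first, then drop rows that fail the predicate.
    # `overfetch` is an assumed tuning knob, not a parameter taken from the paper.
    candidates = sorted(rows, key=lambda r: sq_dist(r["vec"], query))[: k * overfetch]
    return [r["id"] for r in candidates if predicate(r)][:k]

def recall(found, truth):
    # Fraction of the true top-k that the strategy returned.
    return len(set(found) & set(truth)) / len(truth) if truth else 1.0

# Synthetic data (assumption): 2000 rows, 8-dimensional vectors, and an attribute
# spread evenly over [0, 1) so that `attr < s` has selectivity close to s.
rng = random.Random(0)
N, DIM, K = 2000, 8, 10
rows = [
    {"id": i, "vec": [rng.random() for _ in range(DIM)], "attr": ((i * 7919) % N) / N}
    for i in range(N)
]
query = [rng.random() for _ in range(DIM)]

def recall_at(selectivity):
    # Recall of post-filtering against the exact filtered top-k.
    predicate = lambda r: r["attr"] < selectivity
    truth = pre_filter(rows, query, predicate, K)  # exact search, so it doubles as the oracle
    return recall(post_filter(rows, query, predicate, K), truth)

if __name__ == "__main__":
    for s in (0.01, 0.1, 0.5, 1.0):
        print(f"selectivity={s:<5} post-filter recall={recall_at(s):.2f}")
```

In this toy setting pre-filtering is exact by construction, so the sketch shows only the recall side of the trade-off: with a permissive predicate post-filtering returns the full top-k, while with a highly selective predicate most over-fetched candidates are discarded. The latency side, where pre-filtering becomes expensive as the surviving set grows, requires a real index and is not modeled here.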

Article details

How to Cite

Lim, J. K., & Wong, C. S. (2024). A Reproducible Benchmark for Vector-Relational Hybrid Queries in RAG Databases. DATAMIND, 2(1), 80-99. https://doi.org/10.63646/datamind.2024.020106