Database-Oriented AI Safety: Storing, Indexing, and Mining Chain-of-Thought Traces for Deception Detection
Abstract
As large language models (LLMs) are increasingly deployed in edge environments, deceptive alignment—wherein models produce internally unsafe chain-of-thought (CoT) reasoning while presenting safe outputs—has emerged as a critical AI safety challenge. Existing detection frameworks treat CoT traces as transient inference artifacts rather than persistent, queryable data assets, thereby missing the cumulative forensic evidence embedded across millions of reasoning episodes. This paper introduces DATAMIND-CoT, a database-oriented architecture that reframes CoT safety monitoring as a data management problem. The proposed framework systematically stores CoT traces in a heterogeneous three-store system comprising a relational store for structured metadata, a vector store for 128-dimensional contrastive embeddings, and a graph store for reasoning directed acyclic graphs (DAGs). Purpose-designed indexing schemes—including B+ trees on entropy scores, HNSW indexes on embedding vectors, and GIN-based inverted indexes on token sequences—enable sub-millisecond retrieval across corpora of up to 100,000 traces. A hybrid mining pipeline combines sequential pattern mining on tokenized reasoning chains, density-based clustering on the embedding manifold, and DAG traversal for structural deception signatures. Extensive evaluation on DeceptionBench (180 adversarial scenarios, five deception taxonomies) demonstrates that database-backed mining reduces the Deception Tendency Rate (DTR) to 36.96% on Gemma-3-4B-IT, matching state-of-the-art contrastive learning while adding persistent auditability, multi-model federation, and cross-session deception trend analysis. The overhead is minimal: 2.8% additional inference latency and 12 MB of RAM on an NVIDIA Jetson Orin Nano 8 GB edge device. DATAMIND-CoT establishes a new research paradigm in which the database, not merely the neural monitor, becomes a first-class citizen of AI safety infrastructure.
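To ground the architecture described above, the following minimal example illustrates one way the three-store layout could look in practice. It is a sketch, not the paper's implementation: the schema and field names (trace_id, entropy, tokens) are hypothetical, SQLite stands in for the relational store (its indexes are B-tree variants rather than strict B+ trees), hnswlib supplies the HNSW vector index, networkx holds the reasoning DAGs, and the GIN-style inverted index on token sequences is omitted for brevity.

```python
# Illustrative sketch of a heterogeneous three-store trace layout:
# relational metadata (SQLite), HNSW vector index (hnswlib), and
# per-trace reasoning DAGs (networkx). All names are hypothetical.
import sqlite3

import hnswlib
import networkx as nx
import numpy as np

DIM = 128  # embedding dimensionality stated in the abstract

# --- Relational store: structured metadata, indexed on entropy ---
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE traces (
        trace_id INTEGER PRIMARY KEY,
        model    TEXT,
        entropy  REAL,
        tokens   TEXT
    )
""")
db.execute("CREATE INDEX idx_entropy ON traces(entropy)")

# --- Vector store: HNSW index over contrastive embeddings ---
vec_index = hnswlib.Index(space="cosine", dim=DIM)
vec_index.init_index(max_elements=100_000, ef_construction=200, M=16)

# --- Graph store: one reasoning DAG per trace ---
dags: dict[int, nx.DiGraph] = {}


def ingest(trace_id: int, model: str, entropy: float, tokens: str,
           embedding: np.ndarray, edges: list[tuple[str, str]]) -> None:
    """Persist a single CoT trace across all three stores."""
    db.execute("INSERT INTO traces VALUES (?, ?, ?, ?)",
               (trace_id, model, entropy, tokens))
    vec_index.add_items(embedding.reshape(1, -1), np.array([trace_id]))
    dags[trace_id] = nx.DiGraph(edges)


def suspicious_neighbors(query_vec: np.ndarray, entropy_floor: float,
                         k: int = 5) -> list[int]:
    """Hypothetical combined query: k nearest neighbors of a known
    deceptive exemplar, filtered to high-entropy traces via the
    relational index."""
    ids, _ = vec_index.knn_query(query_vec.reshape(1, -1), k=k)
    placeholders = ",".join("?" * len(ids[0]))
    rows = db.execute(
        f"SELECT trace_id FROM traces "
        f"WHERE trace_id IN ({placeholders}) AND entropy >= ?",
        (*map(int, ids[0]), entropy_floor))
    return [row[0] for row in rows]
```

The point of the combined query is that the two indexes answer different questions, with the trace identifier serving as the join key: the HNSW index finds traces that resemble a known deception signature, and the entropy index cheaply filters that candidate set.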
