A Data-Centric Architecture for Insider Threat Indicators: From Binary Risk Instances to Governed Security Knowledge Bases
Abstract
Insider threat prevention requires more than detecting events after a violation has occurred. Organizations need a governed data architecture that converts scattered behavioral, organizational, and technical observations into interpretable early-warning indicators before misuse becomes an incident. This article develops a data-centric architecture for insider threat indicators that transforms binary risk instances into a governed security knowledge base. The proposed framework comprises six layers: indicator vocabulary design, binary evidence capture, provenance-aware storage, data quality validation, entropy-informed scoring, and governance-oriented knowledge services. A synthetic enterprise dataset is used to demonstrate how the architecture supports indicator traceability, quality control, risk score computation, and decision documentation. On this synthetic dataset, governance controls improve average indicator quality from 0.66 to 0.89, reduce missing high-criticality evidence by 58%, and identify risk concentration in work-context and data-protection indicators before a simulated breach pathway becomes operationally visible. The study contributes to data-driven AI and computational discovery by reframing insider threat measurement as a knowledge-base construction problem rather than a single predictive modeling task. The framework offers practical guidance for security teams, data stewards, and enterprise risk managers seeking transparent, auditable, and reusable insider-risk analytics.
