EdgeDB-FL: An Edge Database for Federated Learning Updates and Model-State Management
Abstract
Federated learning deployments on heterogeneous edge devices generate a continuous stream of compressed model updates, differential privacy budgets, training metrics, checkpoint states, and synchronisation logs that collectively define the provenance and reproducibility of every trained model. Yet no purpose-built database exists that structures, versions, and exposes this edge FL state in a machine-queryable and audit-ready form. This paper introduces EdgeDB-FL, a lightweight relational database system designed to manage the full lifecycle of federated learning state on resource-constrained edge infrastructure. EdgeDB-FL organises data into six core tables (EdgeDevice, ClientUpdate, ModelVersion, TrainingRound, PrivacyBudget, and SyncLog) linked by foreign-key constraints and supported by seven composite indexes optimised for the access patterns of FL orchestration, privacy accounting, and convergence analysis workloads. A five-stage processing pipeline ingests on-device training events, compresses gradient updates using top-k sparsification and 8-bit quantisation before database storage, tracks differential privacy epsilon consumption per device, validates updates against server-side model hashes, and exports versioned model artefacts to downstream lakehouse and vector store targets. Experiments across a 200-device benchmark network demonstrate a 77.3% reduction in per-round, per-device communication payload relative to standard FL baselines (4.5 KB vs. 19.8 KB), a global test accuracy of 94.6% with differential privacy (ε = 2.0) after 50 rounds, and a median database query latency of 17 ms under 500-device concurrent load. EdgeDB-FL is released as open-source software under the MIT licence with a Python SDK, REST and GraphQL APIs, and reproducible experiment notebooks.
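The compression stage named above (top-k sparsification followed by 8-bit quantisation) can be illustrated with a minimal sketch. This is not the EdgeDB-FL implementation: the function names (`topk_sparsify`, `quantise_8bit`, `dequantise`, `pack_update`), the symmetric quantisation scheme, and the byte layout are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import struct


def topk_sparsify(values, k):
    """Keep the k largest-magnitude entries; return (indices, kept_values)."""
    if k <= 0:
        return [], []
    top = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]
    indices = sorted(top)  # ascending index order for a stable layout
    return indices, [values[i] for i in indices]


def quantise_8bit(kept):
    """Symmetric linear quantisation to signed 8-bit codes in [-127, 127]."""
    max_abs = max((abs(v) for v in kept), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in kept]
    return scale, codes


def dequantise(scale, codes):
    """Approximate reconstruction of the kept values on the server side."""
    return [c * scale for c in codes]


def pack_update(indices, scale, codes):
    """Hypothetical wire layout: count (uint32), scale (float32),
    then n uint32 indices and n int8 codes, little-endian, no padding."""
    n = len(indices)
    return struct.pack(f"<If{n}I{n}b", n, scale, *indices, *codes)


# Toy gradient; a real update would have many more entries.
gradient = [0.02, -1.5, 0.3, 0.0, 0.9, -0.05]
indices, kept = topk_sparsify(gradient, k=3)
scale, codes = quantise_8bit(kept)
payload = pack_update(indices, scale, codes)
restored = dequantise(scale, codes)
```

In this sketch each retained entry costs 5 bytes (a 4-byte index plus a 1-byte code) on top of an 8-byte header, so the saving over dense 32-bit floats only appears when k is small relative to the vector length. The reconstruction error per kept value is bounded by half the quantisation step `scale`.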
