EdgeDB-FL: An Edge Database for Federated Learning Updates and Model-State Management

Jinhyuk Kwon; Soojin Park; Taeyoung Lim

doi:10.63646/datamind.2025.030304

Published 2025-09-30

Jinhyuk Kwon

Department of Computer Science and Engineering, Dongguk University, Seoul 04620, Republic of Korea

Soojin Park

School of Software, Soongsil University, Seoul 06978, Republic of Korea

Taeyoung Lim*

Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
taeyoung.lim@seoultech.ac.kr

Abstract

Federated learning (FL) deployments on heterogeneous edge devices generate a continuous stream of compressed model updates, differential privacy budgets, training metrics, checkpoint states, and synchronisation logs that together define the provenance and reproducibility of every trained model. Yet no purpose-built database structures, versions, and exposes this edge FL state in a machine-queryable, audit-ready form. This paper introduces EdgeDB-FL, a lightweight relational database system designed to manage the full lifecycle of federated learning state on resource-constrained edge infrastructure. EdgeDB-FL organises data into six core tables (EdgeDevice, ClientUpdate, ModelVersion, TrainingRound, PrivacyBudget, and SyncLog) linked by foreign-key constraints and supported by seven composite indexes optimised for the access patterns of FL orchestration, privacy accounting, and convergence analysis workloads. A five-stage processing pipeline ingests on-device training events, compresses gradient updates using top-k sparsification and 8-bit quantisation before database storage, tracks differential privacy epsilon consumption per device, validates updates against server-side model hashes, and exports versioned model artefacts to downstream lakehouse and vector store targets. Experiments across a 200-device benchmark network demonstrate a 77.3% reduction in per-round, per-device communication payload relative to standard FL baselines (4.5 KB vs. 19.8 KB), a global test accuracy of 94.6% with differential privacy (ε = 2.0) after 50 rounds, and a median database query latency of 17 ms under 500-device concurrent load. EdgeDB-FL is released as open-source software under the MIT licence with a Python SDK, REST and GraphQL APIs, and reproducible experiment notebooks.
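
To make the compression stage of the pipeline concrete, the following is a minimal sketch of top-k sparsification followed by 8-bit quantisation. It is not the EdgeDB-FL implementation: the function names `compress_update` and `decompress_update`, the byte layout, and the choice of symmetric linear quantisation are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import struct


def compress_update(values, k):
    """Top-k sparsification followed by symmetric 8-bit quantisation.

    Hypothetical helper for illustration; not part of any published SDK.
    """
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # Top-k sparsification: keep the k entries with the largest magnitude.
    top = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]
    top.sort()
    kept = [values[i] for i in top]
    # 8-bit quantisation: map [-scale, scale] linearly onto [-127, 127].
    scale = max(abs(v) for v in kept) or 1.0
    codes = [round(v / scale * 127) for v in kept]
    # Assumed byte layout: vector length, k, float32 scale,
    # k uint32 indices, k int8 codes (little-endian, no padding).
    return struct.pack(f"<IIf{k}I{k}b", len(values), k, scale, *top, *codes)


def decompress_update(blob):
    """Rebuild a dense vector; dropped coordinates are restored as 0.0."""
    n, k = struct.unpack_from("<II", blob, 0)
    fields = struct.unpack_from(f"<f{k}I{k}b", blob, 8)
    scale, indices, codes = fields[0], fields[1:1 + k], fields[1 + k:]
    dense = [0.0] * n
    for i, c in zip(indices, codes):
        dense[i] = c / 127 * scale
    return dense


# Usage: compress a toy 8-element update down to its 3 largest entries.
update = [0.02, -0.9, 0.001, 0.45, -0.03, 0.0, 0.6, -0.004]
blob = compress_update(update, k=3)
restored = decompress_update(blob)
```

In this layout each retained coordinate costs 5 bytes (a 4-byte index plus a 1-byte code) on top of a 12-byte header, so the payload size is governed by k rather than by the model dimension. The resulting bytes could be stored as a binary column in a table such as ClientUpdate, although the actual storage format used by EdgeDB-FL is not specified in the abstract.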

Keywords: Federated learning; edge database; differential privacy; model versioning; gradient compression; IoT; communication efficiency; reproducible AI

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Kwon, J., Park, S., & Lim, T. (2025). EdgeDB-FL: An Edge Database for Federated Learning Updates and Model-State Management. DATAMIND, 3(3), 45-59. https://doi.org/10.63646/datamind.2025.030304
