CarbonLedgerDB: A Structured Database for Supply-Chain Carbon Accounting and Green AI Analytics
Abstract
Carbon accounting across global supply chains requires data infrastructure in which procurement records, logistics events, emission factors, and product-batch genealogies are integrated and auditable at scale. Existing public databases address isolated aspects of this challenge but lack an integrated, versioned, and AI-ready architecture for end-to-end carbon lifecycle accounting. This paper introduces CarbonLedgerDB, a relational database system for supply-chain carbon accounting and Green AI analytics. CarbonLedgerDB organizes five core entity types (Supplier, ProductBatch, EmissionFactor, LogisticsEvent, and CarbonRecord) into a formally specified schema with foreign-key traceability across Scope 1, 2, and 3 emission categories. The database ingests data from heterogeneous sources, including enterprise resource planning (ERP) feeds, IoT sensor streams, logistics APIs, and curated emission-factor libraries. A structured quality-control layer handles missing values, duplicates, unit harmonization, and audit versioning. Experiments on a sample of 446 supplier entities, 12,800 product batches, and 31,200 carbon records yield a mean carbon re-computation error of 2.3%, emission-factor coverage of 91.4%, three-tier supplier traceability of 87.6%, and audit consistency of 96.2%. CarbonLedgerDB is openly available with a documented API and reproducible experiment notebooks, providing reusable infrastructure for regulatory compliance analytics, machine learning model training, and organizational carbon intelligence.
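To make the five-entity schema and the re-computation metric concrete, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the CarbonLedgerDB schema itself: only the five entity names and the Scope 1, 2, and 3 categories come from the abstract. All column names, the self-referencing parent_supplier_id used to model supplier tiers, the definition of re-computation error as the mean relative difference between a reported value and activity amount times emission factor, and the toy rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical column layout; only the five table names come from the paper.
SCHEMA = """
CREATE TABLE Supplier (
    supplier_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_supplier_id INTEGER REFERENCES Supplier(supplier_id)  -- assumed tier link
);
CREATE TABLE ProductBatch (
    batch_id INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL REFERENCES Supplier(supplier_id)
);
CREATE TABLE EmissionFactor (
    factor_id INTEGER PRIMARY KEY,
    activity TEXT NOT NULL,
    kg_co2e_per_unit REAL NOT NULL,
    unit TEXT NOT NULL
);
CREATE TABLE LogisticsEvent (
    event_id INTEGER PRIMARY KEY,
    batch_id INTEGER NOT NULL REFERENCES ProductBatch(batch_id),
    tonne_km REAL NOT NULL
);
CREATE TABLE CarbonRecord (
    record_id INTEGER PRIMARY KEY,
    batch_id INTEGER NOT NULL REFERENCES ProductBatch(batch_id),
    factor_id INTEGER NOT NULL REFERENCES EmissionFactor(factor_id),
    scope INTEGER NOT NULL CHECK (scope IN (1, 2, 3)),
    activity_amount REAL NOT NULL,
    reported_kg_co2e REAL NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with one illustrative (made-up) record."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Supplier VALUES (1, 'Tier-1 demo supplier', NULL)")
    conn.execute("INSERT INTO ProductBatch VALUES (10, 1)")
    conn.execute(
        "INSERT INTO EmissionFactor VALUES (100, 'grid electricity', 0.5, 'kWh')"
    )
    # 200 kWh * 0.5 kg CO2e/kWh = 100 kg recomputed, versus 102 kg reported.
    conn.execute("INSERT INTO CarbonRecord VALUES (1000, 10, 100, 2, 200.0, 102.0)")
    return conn


def mean_recomputation_error(conn):
    """Mean relative gap between reported and recomputed emissions (assumed metric)."""
    rows = conn.execute(
        """SELECT r.reported_kg_co2e, r.activity_amount * f.kg_co2e_per_unit
           FROM CarbonRecord AS r JOIN EmissionFactor AS f USING (factor_id)"""
    ).fetchall()
    # Skip records whose recomputed value is zero to avoid division by zero.
    errors = [abs(rep - calc) / abs(calc) for rep, calc in rows if calc]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    demo = build_demo()
    print(f"mean re-computation error: {mean_recomputation_error(demo):.1%}")
```

With foreign keys enabled, inserting a CarbonRecord that points at a nonexistent ProductBatch or EmissionFactor is rejected, which is the kind of traceability guarantee the schema description implies.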
