Main article

Daniel Hartley
School of Information Technology, Deakin University, Geelong VIC 3220, Australia
Mei-Ling Tran
Department of Business Information Systems, RMIT University, Melbourne VIC 3000, Australia
Samuel Oduya*
Department of Supply Chain and Logistics Management, University of Southern Queensland, Toowoomba QLD 4350, Australia
samuel.oduya@usq.edu.au

DOI: https://doi.org/10.63646/datamind.2025.030405

Abstract

Geopolitical crises, extreme weather events, port congestion surges, and cargo flight disruptions propagate through global supply chains in ways that are observable days to weeks before their downstream impact appears as stockouts, delivery delays, or revenue losses. Yet no unified, open database currently links unstructured news event streams to structured logistics records (port dwell times, air cargo delays, purchase-order completion rates, and inventory snapshots) within a single schema that supports reproducible forecasting research. This paper introduces SupplyDisruptDB, a relational database system that integrates six data streams into a coherent risk-oriented schema: NewsEvent, PortStatus, FlightCargo, PurchaseOrder, InventorySnapshot, and RiskScore. A five-stage ingestion pipeline applies named-entity recognition, sentiment scoring, and geospatial event parsing to news corpora; fuses the extracted signals with AIS-derived port congestion indices and OAG flight delay records; enforces structured quality-control validation; and computes per-SKU supply-chain risk scores using an LSTM-based forecasting model. Validated on a 36-month corpus spanning 187,400 news events, 14,280 port status records across 38 major seaports, 312,600 air cargo flight segments, 94,700 purchase-order records, and 61,200 inventory snapshots across 12 industry verticals, SupplyDisruptDB enables delivery-delay prediction with a mean absolute error of 1.84 days (LSTM, full database), a stockout-event F1 of 0.871, and a risk lead time of 8.4 days, defined as the advance warning horizon before a disruption event reaches the critical inventory threshold. The database is released as open-source software under Apache 2.0 with a documented REST and GraphQL API, a Python client library, and reproducible experiment notebooks, providing a reusable foundation for supply-chain resilience research and operational risk intelligence.
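To make the six-entity schema concrete, the following is a minimal sketch of how news events might be linked to port records in a single relational store. Only the six table names come from the abstract. Every column, the helper functions `build_demo_db` and `congestion_after_news`, the 7-day window, the port code, and the sample rows are illustrative assumptions, and SQLite is used purely for convenience; none of this should be read as the published SupplyDisruptDB schema or API.

```python
import sqlite3

# Table names follow the abstract; all columns are hypothetical and simplified.
SCHEMA = """
CREATE TABLE NewsEvent (
    event_id     INTEGER PRIMARY KEY,
    published_on TEXT NOT NULL,   -- ISO date
    port_code    TEXT,            -- assumed output of geospatial event parsing
    sentiment    REAL             -- assumed score in [-1, 1]
);
CREATE TABLE PortStatus (
    port_code        TEXT NOT NULL,
    observed_on      TEXT NOT NULL,
    congestion_index REAL NOT NULL,
    PRIMARY KEY (port_code, observed_on)
);
CREATE TABLE FlightCargo (
    segment_id    INTEGER PRIMARY KEY,
    flown_on      TEXT NOT NULL,
    delay_minutes REAL
);
CREATE TABLE PurchaseOrder (
    po_id        INTEGER PRIMARY KEY,
    sku          TEXT NOT NULL,
    port_code    TEXT,
    promised_on  TEXT NOT NULL,
    delivered_on TEXT
);
CREATE TABLE InventorySnapshot (
    sku           TEXT NOT NULL,
    snapshot_on   TEXT NOT NULL,
    units_on_hand INTEGER NOT NULL,
    PRIMARY KEY (sku, snapshot_on)
);
CREATE TABLE RiskScore (
    sku       TEXT NOT NULL,
    scored_on TEXT NOT NULL,
    score     REAL NOT NULL,
    PRIMARY KEY (sku, scored_on)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Illustrative data only: one negative news event and three port readings.
    conn.execute("INSERT INTO NewsEvent VALUES (1, '2024-03-01', 'SGSIN', -0.8)")
    conn.executemany(
        "INSERT INTO PortStatus VALUES (?, ?, ?)",
        [
            ("SGSIN", "2024-03-02", 0.4),
            ("SGSIN", "2024-03-05", 0.9),
            ("SGSIN", "2024-03-20", 0.3),
        ],
    )
    return conn


def congestion_after_news(conn, window_days=7):
    """Return port readings observed within window_days after each news event."""
    query = """
        SELECT n.event_id, p.observed_on, p.congestion_index
        FROM NewsEvent AS n
        JOIN PortStatus AS p
          ON p.port_code = n.port_code
         AND julianday(p.observed_on) - julianday(n.published_on)
             BETWEEN 0 AND ?
        ORDER BY p.observed_on
    """
    return conn.execute(query, (window_days,)).fetchall()


if __name__ == "__main__":
    for row in congestion_after_news(build_demo_db()):
        print(row)
```

Joining on a shared location key and a date window is one simple way to relate an unstructured signal to structured records. A forecasting model such as the LSTM described above would consume features derived from joins of this kind rather than the raw rows.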

Article details

How to Cite

Hartley, D., Tran, M.-L., & Oduya, S. (2025). Supply-Chain Disruption Forecasting from News and Logistics Databases. DATAMIND, 3(4), 71-85. https://doi.org/10.63646/datamind.2025.030405