Main article

Xingke Zhu
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
Zhiyong Zhang*
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
zyzhang@bupt.edu.cn
Haonan Wang
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
Bin Liu
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

The convergence of information technology (IT) and operational technology (OT) in Industrial Internet of Things (IIoT) environments disrupts conventional security boundaries, exposing critical industrial infrastructure to sophisticated cyber-physical threats that single-layer defense methods cannot adequately address. Attacks targeting IIoT systems increasingly exploit the semantic gap between IT cybersecurity defenses and OT operational constraints, so defense frameworks must maintain real-time industrial availability while adapting to evolving threat tactics. This paper proposes Situ-HMARL, a Situation-Based Hierarchical Multi-Agent Reinforcement Learning adaptive defense framework that integrates three-stage industrial situational awareness (SA) with a two-level hierarchical multi-agent reinforcement learning (HMARL) decision architecture. The three-stage SA architecture continuously fuses heterogeneous data from the IT and OT layers into a structured global observation space through: (1) multi-source sensor data fusion and anomaly detection; (2) threat comprehension and severity scoring; and (3) situational projection and attack trajectory forecasting. On this basis, the two-level HMARL framework decouples strategic and tactical defense: a high-level agent (HLA) processes the global observation space and selects a defense strategy (Block, Isolate, Alert, Monitor, or Reroute) using a Proximal Policy Optimization (PPO) policy, while low-level agents (LLAs) deployed at each IIoT node perform local situational perception and execute the selected defense strategy in real time with sub-second response latency.
Experimental evaluation on a simulated IIoT testbed with 100 nodes under six attack categories demonstrates that Situ-HMARL achieves a 95.7% mean detection rate, a 0.84 s mean defense response time, and 99.71% system availability, outperforming MARL without SA (93.8% detection, 1.24 s), single-agent DRL (90.2% detection, 1.87 s), and rule-based IDS baselines (86.4% detection, 4.32 s). Ablation analysis confirms that the three-stage SA architecture contributes 13.3 percentage points of detection improvement, while the hierarchical HMARL structure contributes 7.1 percentage points, demonstrating the synergistic value of integrating SA with HMARL.
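
The decision flow described in the abstract (three-stage SA feeding a high-level agent whose choice is executed by per-node low-level agents) can be illustrated with a minimal Python sketch. Everything below other than the five strategy names is an assumption made for illustration: the function names, the scoring formulas, the thresholds, and the ordering of strategies by severity are hypothetical, and the high-level policy is a simple threshold rule standing in for the PPO policy, not the method from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean, pstdev


class Strategy(Enum):
    # The five strategy names come from the abstract; their ordering here is assumed.
    MONITOR = "Monitor"
    ALERT = "Alert"
    REROUTE = "Reroute"
    ISOLATE = "Isolate"
    BLOCK = "Block"


@dataclass
class Situation:
    node_id: int
    anomaly: float    # stage 1: deviation of the latest reading from the node's history
    severity: float   # stage 2: threat score clipped to [0, 1]
    projected: float  # stage 3: naive one-step forecast of severity


def assess(node_id, history, latest, prev_severity=0.0):
    """Toy three-stage SA for one node (all formulas are placeholders)."""
    spread = pstdev(history) or 1.0                      # avoid division by zero
    anomaly = abs(latest - mean(history)) / spread       # stage 1: anomaly detection
    severity = min(anomaly / 4.0, 1.0)                   # stage 2: severity scoring
    trend = severity - prev_severity
    projected = min(max(severity + trend, 0.0), 1.0)     # stage 3: projection
    return Situation(node_id, anomaly, severity, projected)


def high_level_policy(situations):
    """Stand-in for the HLA: a threshold rule over the global observation."""
    worst = max(s.projected for s in situations)
    if worst < 0.2:
        return Strategy.MONITOR
    if worst < 0.4:
        return Strategy.ALERT
    if worst < 0.6:
        return Strategy.REROUTE
    if worst < 0.8:
        return Strategy.ISOLATE
    return Strategy.BLOCK


def low_level_execute(strategy, situation, local_threshold=0.2):
    """Stand-in for an LLA: apply the strategy only where local severity warrants it."""
    if situation.severity >= local_threshold:
        return strategy
    return Strategy.MONITOR


if __name__ == "__main__":
    readings = {0: 10.0, 1: 14.0, 2: 10.5}               # hypothetical sensor values
    global_obs = [assess(n, [10, 10, 10, 10], x) for n, x in readings.items()]
    chosen = high_level_policy(global_obs)
    for s in global_obs:
        print(s.node_id, low_level_execute(chosen, s).value)
```

The sketch keeps the strategic choice (one strategy for the whole observation space) separate from the tactical step (each node deciding whether to apply it locally), which is the decoupling the abstract attributes to the two-level design.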

Article details

How to Cite

Zhu, X., Zhang, Z., Wang, H., & Liu, B. (2024). Situ-HMARL: A Situation-Based Hierarchical Multi-Agent Reinforcement Learning Adaptive Defense Framework for Industrial Internet of Things Security. Journal of Intelligent Industrial Convergence, 4(4), 1-12. https://doi.org/10.63646/jiic.2024.040401