Industrial Edge LLM Safety for Smart Manufacturing: Self-Supervised Deception Monitoring in Cyber-Physical Production Systems
Abstract
Large Language Models (LLMs) are increasingly deployed as intelligent controllers and decision-support components in cyber-physical production systems (CPPS). This deployment introduces an underexamined class of safety vulnerability, deceptive alignment, in which an edge-deployed model appears compliant while monitored but covertly pursues misaligned objectives during autonomous operation. Existing industrial AI safety mechanisms rely predominantly on output-level anomaly detection and do not inspect the intermediate reasoning processes where deceptive strategies first emerge. This paper presents EduMonitor-CPS, a self-supervised deception monitoring framework designed for LLMs deployed on industrial edge nodes in smart manufacturing environments. The framework makes three principal contributions: (1) a manufacturing-aware deception taxonomy that categorizes five CPPS-specific behavioral deception patterns; (2) a zero-oracle contrastive monitoring pipeline that removes the dependence on cloud-based teacher models through entropy-filtered self-bootstrapping, enabling fully offline operation in air-gapped production environments; and (3) a geometric representation learning module that uses Triplet Loss optimization to project Chain-of-Thought (CoT) hidden states into separable manifolds. Evaluation across three industrial test scenarios yields a Deception Tendency Rate (DTR) of 37.42% at 0.9 ms per-token latency on NVIDIA Jetson AGX Orin hardware, together with a 40x power reduction relative to cloud-monitoring architectures, while preserving real-time process control capability.
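
As a purely illustrative aid, the two mechanisms named above can be sketched in a few lines of Python. This is not the EduMonitor-CPS implementation: it shows the standard triplet margin loss on toy vectors, and it assumes that "entropy-filtered" means keeping only self-labeled candidates whose label distribution has low Shannon entropy. The function names, the dictionary keys `state` and `label_probs`, the margin, and the entropy threshold are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_filter(candidates, max_entropy):
    """Keep candidates whose self-assigned label distribution is confident.

    ASSUMPTION: each candidate is a dict with hypothetical keys
    'state' (a hidden-state vector) and 'label_probs' (the model's own
    probabilities over labels such as honest/deceptive).
    """
    return [c for c in candidates if shannon_entropy(c["label_probs"]) <= max_entropy]


if __name__ == "__main__":
    candidates = [
        {"state": [0.1, 0.2], "label_probs": [0.95, 0.05]},  # confident, kept
        {"state": [0.4, 0.4], "label_probs": [0.50, 0.50]},  # ambiguous, dropped
    ]
    kept = entropy_filter(candidates, max_entropy=0.3)
    print(len(kept), "candidate(s) kept for triplet construction")
    print(triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))
```

In a real pipeline the vectors would be CoT hidden states and the loss would be minimized by gradient descent over a projection network; the toy vectors here only show how the margin separates the two classes.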
