Risk-Stratified Atrial Fibrillation Screening on Consumer Wearables: A Generative Denoising Pipeline with Built-in Reliability Gating
Main article
Abstract
Consumer wearables make it possible to screen asymptomatic individuals for atrial fibrillation (AF) using photoplethysmography (PPG), but motion-induced corruption of the optical trace remains the dominant cause of unreliable downstream diagnoses. We describe an end-to-end risk-stratified screening pipeline in which a one-dimensional Pix2Pix generative adversarial network (GAN) restores noisy wrist-worn PPG segments before they are forwarded to a pretrained AF classifier, and a built-in reliability gate based on the predictive entropy of that classifier rejects samples likely to yield erroneous decisions. Because no ground-truth clean signal exists at deployment, we define gating reliability in terms of decision-theoretic cost rather than reconstruction fidelity, and we validate the gate by measuring the Uncertainty Calibration Error (UCE) on the downstream task. On a wrist-PPG cohort of 136 882 segments derived from a public AF dataset, Gaussian-corrupted inputs reduced classifier AUC from 0.84 to 0.75; GAN restoration recovered AUC to 0.80, and the reliability gate delivered an AUC of 0.85, an F1 of 0.70 and a balanced accuracy of 0.77 on the retained 75 % of segments, matching or exceeding the performance achievable on uncorrupted inputs. The UCE of the gated outputs (0.025) was less than half that observed on noisy inputs (0.055). Entropy values for denoised and noisy versions of the same segment were only moderately correlated (Pearson r = 0.68; Spearman ρ = 0.59), which suggests that the gate responds to artefacts the GAN itself introduces and not only to the underlying measurement quality. The framework is model-agnostic, requires no additional supervised retraining of the classifier, and supports privacy-preserving deployment on the device.
Risk-stratified gating is therefore a practical mechanism for raising the trustworthiness of AI-driven cardiac screening on consumer wearables without sacrificing population coverage.
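
As an illustration of the gating step summarised above, the following is a minimal sketch of an entropy-based reliability gate. It assumes the classifier emits a single AF probability per segment and that the rejection threshold is chosen as a quantile of the entropy distribution so that a fixed fraction of segments is retained. The function names, the quantile rule and the example probabilities are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Predictive entropy (in bits) of a binary classifier output.

    p is the predicted probability of AF; values are clipped away from
    0 and 1 so that the logarithm stays finite.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def entropy_gate(p_af, retain_fraction=0.75):
    """Keep the segments whose predictive entropy is lowest.

    Assumption: the threshold is the `retain_fraction` quantile of the
    entropy values, so roughly that share of segments passes the gate.
    Returns a boolean mask of retained segments and the threshold used.
    """
    entropy = binary_entropy(p_af)
    threshold = np.quantile(entropy, retain_fraction)
    keep = entropy <= threshold
    return keep, threshold


# Hypothetical classifier outputs for eight denoised segments.
p_af = np.array([0.02, 0.50, 0.97, 0.45, 0.90, 0.10, 0.99, 0.60])
keep, threshold = entropy_gate(p_af, retain_fraction=0.75)
retained_predictions = p_af[keep]  # only these would be reported
```

In this sketch the two segments with probabilities closest to 0.5 are the ones rejected, since they carry the highest predictive entropy. A fixed-quantile threshold is only one possible choice; a threshold calibrated on held-out data against a decision cost would fit the same interface.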
