Task-Guided Confidence Scoring for Synthetic Time-Series Outputs in Health-Oriented Machine Learning Systems

Arjun S. Patel
Department of Computer Science and Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
Sneha V. Kumar*
Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Karunya Nagar, Coimbatore 641114, India
sneha.kumar@karunya.edu

DOI: https://doi.org/10.63646/jaiaa.2023.010403

Abstract

Generative models are increasingly used to produce synthetic physiological time series in health-oriented machine learning, whether to denoise wearable recordings, adapt signals across acquisition domains, augment scarce training data, or impute missing segments. Yet the same flexibility that makes these models useful also lets them introduce plausible-looking but misleading artefacts, which is a serious liability when the synthetic signal feeds a clinical decision. This review argues that the trustworthiness of a synthetic output cannot be judged in isolation from the task it is meant to support, and it develops a task-guided confidence scoring perspective that grounds the quality of each generated signal in the expected cost of the downstream decision it influences. We organise the argument around four ideas: that conventional distributional and realism metrics answer the wrong question for deployment; that a useful confidence signal must be per-instance, available before ground truth, and aligned with the decision at hand; that such a signal can be derived from the behaviour of the downstream task model and externally grounded by checking whether higher scores track higher realised decision cost; and that the resulting scores enable principled gating of low-confidence outputs. Using wearable photoplethysmography and atrial-fibrillation screening as a running example, we synthesise reporting strategies across modalities, contrast their properties, and map the deployment, governance, and clinical-translation considerations that determine whether confidence scoring delivers value in practice. The perspective offers a transferable diagnostic for deciding when a synthetic time-series output is safe to use.

Article details

How to Cite

Patel, A. S., & Kumar, S. V. (2023). Task-Guided Confidence Scoring for Synthetic Time-Series Outputs in Health-Oriented Machine Learning Systems. Journal of AI Analytics and Applications, 1(4), 36-51. https://doi.org/10.63646/jaiaa.2023.010403