Artificial Intelligence in Clinical Decision Support: A Systematic Review of Deep Learning Applications, Performance Benchmarks, and Implementation Challenges in Healthcare
Abstract
Background: Artificial intelligence (AI), particularly deep learning, has emerged as a transformative technology in clinical decision support, demonstrating diagnostic accuracy that rivals or surpasses that of trained clinicians across several specialties. However, evidence regarding real-world implementation challenges, performance variability, and safety remains heterogeneous.

Objective: To systematically review and meta-analyse the evidence on deep learning-based clinical decision support systems (CDSS) across medical specialties, evaluate performance benchmarks, and characterise implementation barriers and facilitators.

Methods: We searched PubMed, IEEE Xplore, Scopus, ACM Digital Library, and CINAHL from January 2015 to December 2024. Eligible studies reported AI-based diagnostic or prognostic models validated on independent clinical datasets. Study quality was assessed using PROBAST. Meta-analyses of AUC-ROC were performed for the oncology, radiology, cardiology, and neurology domains.

Results: Eighty-seven studies met the inclusion criteria (1,247,563 patients in total). Pooled AUC-ROC across all domains was 0.913 (95% CI: 0.902–0.924). Deep learning models significantly outperformed conventional approaches in radiology (AUC 0.944 vs. 0.836, p < 0.001) and oncology (AUC 0.931 vs. 0.824, p < 0.001). Key implementation barriers included lack of external validation (61.4% of studies), dataset heterogeneity, regulatory uncertainty, and limited explainability.

Conclusions: AI-based CDSS demonstrate high diagnostic accuracy across specialties, but widespread clinical adoption requires investment in prospective external validation, explainability frameworks, and equity-aware model development.
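To make the pooled AUC-ROC and its 95% confidence interval more concrete, the following minimal sketch shows one common way study-level estimates can be combined: a DerSimonian-Laird random-effects model. The abstract does not state which pooling model was used, so the choice of estimator is an assumption, and the function name `pool_random_effects` and the input values are purely illustrative, not data from the included studies.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    Assumption: this estimator is chosen for illustration only; the review's
    actual pooling method is not specified in the abstract.
    Returns (pooled, ci_low, ci_high, tau2).
    """
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    z = 1.959963984540054  # two-sided 95% normal quantile
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


if __name__ == "__main__":
    # Hypothetical AUCs and standard errors, not taken from the review.
    aucs = [0.90, 0.92, 0.94]
    ses = [0.01, 0.01, 0.01]
    pooled, low, high, tau2 = pool_random_effects(aucs, ses)
    print(f"pooled AUC = {pooled:.3f} (95% CI: {low:.3f} to {high:.3f}), tau^2 = {tau2:.4f}")
```

In practice, AUC values are often pooled on a transformed scale (for example, the logit) so that the confidence interval stays within the 0 to 1 range; the sketch works on the raw scale only to keep the arithmetic easy to follow.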
