Main article

Rui Chen
School of Health Information Management, Wannan Medical College, Wuhu 241002, China
Yanfang Liu
Department of Computer Science and Technology, Hebei University of Engineering, Handan 056038, China
Peng Zhao*
School of Management Science and Engineering, Shandong University of Finance and Economics, Jinan 250014, China
zhaopengsdufe@sdufe.edu.cn

DOI: https://doi.org/10.63646/datamind.2023.010202

Abstract

Hospital operations generate a continuous stream of heterogeneous digital events — from triage timestamps and bed assignments to laboratory order–result cycles and medication execution logs — yet these events remain scattered across siloed systems: Hospital Information Systems (HIS), Laboratory Information Systems (LIS), Electronic Medical Records (EMR), and pharmacy platforms. This fragmentation prevents the systematic application of process mining, machine learning, and AI-driven workflow analytics at scale. We introduce OpenClinOpsDB, an open, AI-ready relational event database that integrates and harmonises clinical operations data across six core entity types — patient encounters, clinical events, resource logs, medication orders, laboratory results, and staff records — under a unified schema aligned with HL7 FHIR R4 and IEEE XES process-log standards. The database is constructed from four years of anonymised multi-department records at two tertiary care hospitals in China, encompassing 127,483 complete encounter trajectories, 4.1 million timestamped clinical events, 886,214 laboratory order–result pairs, and 1.2 million medication execution records. We report data quality metrics including field completeness, timestamp coherence, coding coverage, and noise rates, and conduct reproducible baseline experiments for two analytically critical tasks: length-of-stay (LOS) prediction and emergency department queue-time prediction. An LSTM model achieves a mean absolute error of 1.19 days for LOS and 12.9 minutes for queue time, establishing competitive benchmarks for future studies. Process trace variant analysis reveals ten dominant encounter pathways accounting for 85.4% of all visits, with mean durations ranging from 1.4 to 48.3 hours, exposing substantial workflow heterogeneity. OpenClinOpsDB, its construction pipeline, field dictionaries, and evaluation scripts are released to support reproducible hospital workflow research.
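As a rough illustration of the trace variant analysis mentioned in the abstract, the sketch below groups timestamped events by encounter, orders them in time, and measures what share of encounters the most frequent pathways cover. It is a minimal sketch and not the authors' pipeline: the function names, the `(encounter_id, timestamp, activity)` tuple layout, and the toy activity labels are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def trace_variants(events):
    """Count distinct activity sequences (variants) across encounters.

    `events` is an iterable of (encounter_id, timestamp, activity) tuples.
    This layout is hypothetical; any sortable timestamp type works.
    """
    traces = defaultdict(list)
    for encounter_id, timestamp, activity in events:
        traces[encounter_id].append((timestamp, activity))

    variants = Counter()
    for steps in traces.values():
        # Order each encounter's events by timestamp before reading the pathway.
        steps.sort(key=lambda step: step[0])
        variants[tuple(activity for _, activity in steps)] += 1
    return variants


def top_k_coverage(variants, k):
    """Fraction of encounters explained by the k most frequent variants."""
    total = sum(variants.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in variants.most_common(k))
    return covered / total


# Toy data (invented for illustration, not drawn from OpenClinOpsDB).
events = [
    ("e1", 1, "triage"), ("e1", 2, "lab_order"), ("e1", 3, "discharge"),
    ("e2", 1, "triage"), ("e2", 3, "discharge"), ("e2", 2, "lab_order"),
    ("e3", 1, "triage"), ("e3", 2, "discharge"),
]
variants = trace_variants(events)
coverage = top_k_coverage(variants, 1)
```

In this toy input, encounters `e1` and `e2` share one pathway once their events are sorted, so the single most frequent variant covers two of the three encounters. The same calculation with `k=10` on real encounter trajectories would give a coverage figure comparable in kind to the 85.4% reported above.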

Article details

How to Cite

Chen, R., Liu, Y., & Zhao, P. (2023). OpenClinOpsDB: An AI-Ready Clinical Operations Database for Hospital Workflow Analytics. DATAMIND, 1(2), 5-15. https://doi.org/10.63646/datamind.2023.010202