Big Data Governance for Multi-Omics Data Sharing: A Blockchain, Smart Contract, and Off-Chain Storage Framework
Abstract
Modern bioinformatics has entered a multi-omics era in which genomic, transcriptomic, proteomic, and metabolomic datasets accumulate at unprecedented velocity, volume, and variety. Conventional centralized governance, typically institutional databases protected by role-based access control, struggles with single points of failure, opaque consent enforcement, weak provenance, and brittle interoperability across jurisdictions. Blockchain technology has been proposed as an alternative substrate for trustworthy multi-omics data sharing, but the literature remains fragmented across isolated mechanisms (immutability, smart contracts, on-chain storage) and lacks a coherent system view. This article systematically reviews 82 peer-reviewed studies published between 2017 and 2025 and indexed in Scopus, IEEE Xplore, ScienceDirect, SpringerLink, and the ACM Digital Library, using a five-stage screening protocol and a five-question quality assessment rubric. Building on this synthesis, we propose a six-layer architectural framework that combines a permissioned blockchain ledger, smart-contract-based consent and access control, privacy-preserving cryptography (zero-knowledge proofs, homomorphic encryption, differential privacy), decentralized identity, off-chain storage on the InterPlanetary File System (IPFS), and native interoperability with HL7 FHIR-compliant electronic health records. A multi-criteria comparison of consensus mechanisms indicates that Practical Byzantine Fault Tolerance (PBFT) is best suited to the latency, throughput, and energy constraints of multi-omics workflows, outperforming Proof-of-Work and Proof-of-Stake on five of six evaluation dimensions. Compared with traditional security baselines, blockchain delivers measurable advantages in tamper-resistance, provenance, and patient-centric consent, but it does not uniformly outperform those baselines on confidentiality and scalability. The framework offers a practical roadmap for big-data governance in life-science research while highlighting open problems in standardization, regulatory alignment, and energy efficiency.
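To make the division of labor between the ledger, the consent logic, and off-chain storage concrete, the following is a minimal illustrative sketch in Python rather than the framework's actual implementation. All class and function names (`OffChainStore`, `Ledger`, `ConsentRegistry`, `fetch`) are hypothetical; an in-memory dictionary keyed by SHA-256 digest stands in for IPFS content addressing, a hash-chained list stands in for the permissioned ledger, and a plain Python class stands in for smart-contract consent rules. Consensus, identity, and encryption are deliberately omitted.

```python
import hashlib
import json
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class OffChainStore:
    """Hypothetical stand-in for IPFS: blobs are addressed by their digest."""
    blobs: dict = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        digest = sha256_hex(data)
        self.blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self.blobs[digest]


@dataclass
class Ledger:
    """Hypothetical append-only log; each entry commits to the previous one."""
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = json.dumps({"prev": prev, "record": record}, sort_keys=True)
        entry_hash = sha256_hex(body.encode())
        self.entries.append(
            {"prev": prev, "record": record, "entry_hash": entry_hash}
        )
        return entry_hash

    def verify_chain(self) -> bool:
        """Recompute every entry hash; any edited record breaks the chain."""
        prev = "0" * 64
        for entry in self.entries:
            body = json.dumps(
                {"prev": prev, "record": entry["record"]}, sort_keys=True
            )
            if entry["prev"] != prev:
                return False
            if sha256_hex(body.encode()) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True


@dataclass
class ConsentRegistry:
    """Hypothetical consent rules of the kind a smart contract might enforce."""
    grants: dict = field(default_factory=dict)  # digest -> set of requester ids

    def grant(self, digest: str, requester: str) -> None:
        self.grants.setdefault(digest, set()).add(requester)

    def revoke(self, digest: str, requester: str) -> None:
        self.grants.get(digest, set()).discard(requester)

    def is_allowed(self, digest: str, requester: str) -> bool:
        return requester in self.grants.get(digest, set())


def fetch(store, ledger, consent, digest: str, requester: str) -> bytes:
    """Log the access attempt, enforce consent, then verify data integrity."""
    allowed = consent.is_allowed(digest, requester)
    ledger.append(
        {"event": "access", "digest": digest,
         "requester": requester, "allowed": allowed}
    )
    if not allowed:
        raise PermissionError(f"{requester} has no consent for {digest[:8]}")
    data = store.get(digest)
    if sha256_hex(data) != digest:
        raise ValueError("off-chain data does not match the recorded digest")
    return data
```

In this sketch only the digest and the access events reach the ledger, while the omics payload itself stays off-chain. Every access attempt, permitted or denied, is logged before the consent decision is enforced, so the provenance trail also covers refusals.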
