Blockchain for genomics and healthcare: a literature review, current status, classification and open issues

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12130
Author(s):  
Beyhan Adanur Dedeturk ◽  
Ahmet Soran ◽  
Burcu Bakir-Gungor

The tremendous boost in next-generation sequencing and "omics" technologies has resulted in the generation of hundreds of gigabytes of data per day. Nowadays, by integrating -omics data with other data types, such as imaging and electronic health record (EHR) data, panomics studies attempt to identify novel and potentially actionable biomarkers for personalized medicine applications. In this respect, for the accurate analysis of -omics data and EHR, there is a need to establish secure and robust pipelines that take ethical aspects into consideration and regulate privacy, ownership, and data-sharing issues. Blockchain technology has recently attracted significant attention in diverse fields, including genomics, since it offers a new solution to these problems from a different perspective. Blockchain is an immutable transaction ledger, which offers a secure and distributed system without a central authority. Within the system, each transaction can be expressed with cryptographically signed blocks, and the verification of transactions is performed by the users of the network. In this review, we first highlight the challenges of EHR and genomic data sharing. Secondly, we examine in detail why blockchain technology is or is not suitable for genomics and healthcare applications. Thirdly, we elucidate the general blockchain structure based on Ethereum, which is a particularly suitable technology for genomic data sharing platforms. Fourthly, we review current blockchain-based EHR and genomic data sharing platforms, evaluate the advantages and disadvantages of these applications, and classify them using different metrics. Finally, we conclude by discussing the open issues and introducing our suggestions on the topic. In summary, through this review we put forward the possible implications of blockchain technology for life sciences and healthcare, with the goal of facilitating the diagnosis, monitoring and therapy of diseases through the effective, joint analysis of -omics data and other available data types.
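To make the ledger idea described above concrete, the following is a minimal, hypothetical sketch (not the structure of any specific platform reviewed here) of how cryptographically chained blocks make recorded genomic-data transactions tamper-evident: each block stores the hash of its predecessor, so altering any earlier transaction invalidates every later hash.

```python
import hashlib
import json
import time

def block_hash(block: dict) -> str:
    """Hash the block's contents (excluding its own hash) deterministically."""
    payload = {k: v for k, v in block.items() if k != "hash"}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, transaction: dict) -> None:
    """Append a block recording one data-access transaction, linked to the previous block."""
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "transaction": transaction,          # e.g. who accessed which dataset, and why
        "prev_hash": chain[-1]["hash"] if chain else "0" * 64,
    }
    block["hash"] = block_hash(block)
    chain.append(block)

def verify_chain(chain: list) -> bool:
    """Every participant can re-check that no recorded transaction was altered."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Hypothetical transactions purely for illustration.
ledger: list = []
append_block(ledger, {"requester": "lab_A", "dataset": "cohort_17_vcf", "purpose": "GWAS"})
append_block(ledger, {"requester": "clinic_B", "dataset": "ehr_batch_3", "purpose": "audit"})
print(verify_chain(ledger))  # True; tampering with any earlier block makes this False
```

Real platforms add consensus, digital signatures and smart contracts on top of this chaining, but the tamper-evidence property they rely on is the one shown here.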

2021 ◽  
Author(s):  
Lewis Jonathan Dursi ◽  
Zoltan Bozoky ◽  
Richard de Borja ◽  
Jimmy Li ◽  
David Bujold ◽  
...  

Rapid expansions of bioinformatics and computational biology have broadened the collection and use of -omics data including genomic, transcriptomic, methylomic and a myriad of other health data types, in the clinic and the laboratory. Both clinical and research uses of such data require co-analysis with large datasets, for which participant privacy and the need for data custodian controls must remain paramount. This is particularly challenging in multi-jurisdictional settings, such as Canada, where health privacy and security requirements are often heterogeneous. Data federation presents a solution to this, allowing for integration and analysis of large datasets from various sites while abiding by local policies. The Canadian Distributed Infrastructure for Genomics platform (CanDIG) enables federated querying and analysis of -omics and health data while keeping that data local and under local control. It builds upon existing infrastructures to connect five health and research institutions across Canada, relies heavily on standards and tooling brought together by the Global Alliance for Genomics and Health (GA4GH), implements a clear division of responsibilities among its participants and adheres to international data sharing standards. Participating researchers and clinicians can therefore contribute to and quickly access a critical mass of -omics data across a national network in a manner that takes into account the multi-jurisdictional nature of our privacy and security policies. Through this, CanDIG gives medical and research communities the tools needed to use and analyze the ever-growing amount of -omics data available to them in order to improve our understanding and treatment of various conditions and diseases. CanDIG is being used to make genomic and phenotypic data available for querying across Canada as part of data sharing for five leading pan-Canadian projects including the Terry Fox Comprehensive Cancer Care Centre Consortium Network (TF4CN) and Terry Fox PRecision Oncology For Young peopLE (PROFYLE), and making data from provincial projects such as POG (Personalized Onco-Genomics) more widely available.
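As an illustration of the federation pattern described above (and not of CanDIG's actual API or the GA4GH service interfaces), the sketch below shows the general idea: each site answers a query against its own local data and returns only an aggregate, which a coordinator then combines, so record-level data never leaves the site.

```python
from typing import Dict, List

# Hypothetical per-site record stores; in a real federation each would live
# behind that site's own access controls and never be shared directly.
SITE_DATA: Dict[str, List[dict]] = {
    "site_toronto":   [{"variant": "BRAF_V600E", "diagnosis": "melanoma"},
                       {"variant": "KRAS_G12D",  "diagnosis": "colorectal"}],
    "site_montreal":  [{"variant": "BRAF_V600E", "diagnosis": "thyroid"}],
    "site_vancouver": [],
}

def local_count(site_records: List[dict], variant: str) -> int:
    """Runs inside a site's boundary: only the count crosses the boundary."""
    return sum(1 for record in site_records if record["variant"] == variant)

def federated_count(sites: Dict[str, List[dict]], variant: str) -> Dict[str, int]:
    """Coordinator fans the query out to every site and aggregates the per-site results."""
    per_site = {name: local_count(records, variant) for name, records in sites.items()}
    per_site["total"] = sum(per_site.values())
    return per_site

print(federated_count(SITE_DATA, "BRAF_V600E"))
# {'site_toronto': 1, 'site_montreal': 1, 'site_vancouver': 0, 'total': 2}
```

In practice each site would expose such a query over an authenticated API governed by its local policies; the pattern of returning only aggregates is what lets heterogeneous jurisdictions participate in one analysis.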


Author(s):  
Javan Carter ◽  
Garth Spellman ◽  
Rebecca Kimball ◽  
Rebecca Safran ◽  
Erik Funk ◽  
...  

Despite the increasing feasibility of sequencing whole genomes from diverse taxa, a persistent problem in phylogenomics is the selection of appropriate markers or loci for a given taxonomic group or research question. In this review, we aim to streamline the decision-making process for selecting data types used in phylogenomic studies by providing an introduction to commonly used types of genomic data, their characteristics, and their associated uses in phylogenomics. Specifically, we review the uses and features of ultraconserved elements (UCEs; including flanking regions), anchored hybrid enrichment (AHE) loci, conserved non-exonic elements (CNEEs), untranslated regions (UTRs), introns, exons, mitochondrial DNA (mtDNA), single nucleotide polymorphisms (SNPs), and anonymous regions (nonspecific regions that are evenly or randomly distributed across the genome). These various data types differ in their mutation rates, likelihood of neutrality or of being strongly linked to loci under selection, and mode of inheritance, each of which is an important consideration in phylogenomic reconstruction. These features give each genomic region or data type important advantages and disadvantages, depending on the biological question, number of taxa, evolutionary timescale, and analytical methods used. We provide a clear and concise outline (Table 1) as a resource for efficiently considering the relevant and key aspects of each data type. As there are a number of factors to consider when designing phylogenomic studies, this review may serve as a primer when weighing options between multiple potential phylogenomic data types.


2020 ◽  
Author(s):  
Karen Yeung

BACKGROUND Academic literature highlights the potential benefits of blockchain to transform healthcare, focusing on its potential to seamlessly and securely integrate existing ‘data silos’ while enabling patients to exercise automated, fine-grained control over access to their Electronic Health Records (EHRs). Yet no serious scholarly attempt has been made to assess the extent to which these technologies have in fact been applied to real-world healthcare contexts. OBJECTIVE The primary aim of this paper is to critically investigate the healthcare sector’s actual engagement with and experience of blockchain technologies to date, in order to assess whether the transformative potential highlighted in the academic literature is likely to be realised in healthcare practice. METHODS This mixed-methods study entailed a series of iterative, in-depth, theoretically oriented desk-based investigations and two focus-group investigations. It built on the findings of a companion research study documenting real-world engagement with blockchain technologies in healthcare. Data were sourced from academic and grey literature drawn from multiple disciplinary perspectives concerned with the configuration, design and functionality of blockchain technologies. The analysis proceeded in three stages. First, it undertook a qualitative investigation of observed patterns of blockchain-for-healthcare engagement to identify the application domains, data-sharing problems, and challenges encountered to date. Secondly, it critically compared these experiences with claims about blockchain's potential benefits in healthcare. Thirdly, it developed a theoretical account of the challenges that arise in implementing blockchain in healthcare contexts, thus providing a firmer foundation for appraising its future prospects for healthcare. RESULTS Healthcare organisations have actively experimented with blockchain technologies since 2016 and have demonstrated proof of concept for several applications (‘use cases’), primarily concerned with administrative data and with facilitating medical research by enabling algorithmic models to be trained on multiple disparately located sets of patient data in a secure, privacy-preserving manner. Yet blockchain technology has yet to be implemented at scale in healthcare and remains largely in its infancy. These early experiences have demonstrated blockchain’s potential to generate meaningful value for healthcare by facilitating data sharing between organisations in circumstances where computational trust can overcome a lack of social trust that might otherwise prevent valuable cooperation. Although there are genuine prospects of utilising blockchain to bring about positive transformation in healthcare, the successful development of blockchain for healthcare applications faces a number of very significant, multi-dimensional and highly complex challenges. Early experience suggests that blockchain is unlikely to rapidly and radically revolutionise healthcare. CONCLUSIONS The successful development of blockchain for healthcare applications faces numerous significant, multi-dimensional and complex challenges which will not be easily overcome, suggesting that blockchain technologies are unlikely to revolutionise healthcare in the near future.


Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1872
Author(s):  
Yingxia Li ◽  
Ulrich Mansmann ◽  
Shangming Du ◽  
Roman Hornung

Lung adenocarcinoma (LUAD) is a common and very lethal cancer. Accurate staging is a prerequisite for its effective diagnosis and treatment; therefore, improving the accuracy of stage prediction for LUAD patients is of great clinical relevance. Previous works have mainly focused on single genomic data types, or on a small number of different omics data types concurrently, for generating predictive models. A few of them have considered multi-omics data from genome to proteome. We used a publicly available dataset to illustrate the potential of multi-omics data for stage prediction in LUAD. In particular, we investigated the roles of the specific omics data types in the prediction process. We used a self-developed method, Omics-MKL, for stage prediction. It combines an existing feature ranking technique, Minimum Redundancy and Maximum Relevance (mRMR), which avoids redundancy among the selected features, with multiple kernel learning (MKL), which applies different kernels to different omics data types. Each of the considered omics data types individually provided useful prediction results, and using multi-omics data delivered notably better results than using single-omics data. Gene expression and methylation information seem to play vital roles in the staging of LUAD. The Omics-MKL method retained 70 features after the selection process. Of these, 21 (30%) were methylation features and 34 (48.57%) were gene expression features. Moreover, 18 (25.71%) of the selected features are known to be related to LUAD, and 29 (41.43%) to lung cancer in general. Using multi-omics data from genome to proteome for predicting the stage of LUAD seems promising because each omics data type may improve the accuracy of the predictions; here, methylation and gene expression data may play particularly important roles.
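The abstract does not give implementation details of Omics-MKL, so the following is only a simplified stand-in under hypothetical data: a greedy mRMR-style selection (relevance via mutual information, redundancy via correlation with already-selected features) followed by an unweighted sum of per-omics RBF kernels fed to a precomputed-kernel SVM. True MKL would additionally learn the kernel weights rather than summing them equally.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=120)                       # hypothetical stage labels
omics = {"expression": rng.normal(size=(120, 200)),    # hypothetical omics blocks
         "methylation": rng.normal(size=(120, 300))}

def mrmr_select(X: np.ndarray, y: np.ndarray, k: int) -> list:
    """Greedy mRMR-style selection: maximise relevance, penalise redundancy."""
    relevance = mutual_info_classif(X, y, random_state=0)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        redundancy = corr[:, selected].mean(axis=1)    # mean correlation with chosen features
        score = relevance - redundancy
        score[selected] = -np.inf                      # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected

# Select features per omics block, build one RBF kernel per block, then combine the kernels.
kernels = []
for name, X in omics.items():
    cols = mrmr_select(X, y, k=20)
    kernels.append(rbf_kernel(X[:, cols]))
K = np.sum(kernels, axis=0)                            # unweighted combination (simplification)

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```

Keeping one kernel per omics layer is what lets the downstream model weigh, for example, methylation similarity differently from expression similarity, which is the property the abstract attributes to the MKL component.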


Author(s):  
Zhuohui Wei ◽  
Yue Zhang ◽  
Wanlin Weng ◽  
Jiazhou Chen ◽  
Hongmin Cai

The significance of pan-cancer categories has recently become widely recognized in cancer research. Pan-cancer analysis categorizes a cancer based on its molecular pathology rather than its organ of origin. The molecular similarities among multi-omics data found in different cancer types can play several roles in both biological processes and therapeutic developments. Therefore, integrated analysis of various genomic data types is frequently used to reveal novel genetic and molecular mechanisms. However, a variety of multi-omics clustering algorithms have been proposed in different fields, and how these computational clustering methods compare in pan-cancer analysis remains unclear. To increase the utilization of current integrative methods in pan-cancer analysis, we first provide an overview of five popular computational integrative tools: similarity network fusion (SNF), integrative clustering of multiple genomic data types (iCluster), cancer integration via multi-kernel learning (CIMLR), perturbation clustering for data integration and disease subtyping (PINS) and low-rank clustering (LRACluster). Then, a priori interactions in multi-omics data were incorporated to detect prominent molecular patterns in pan-cancer data sets. Finally, we present comparative assessments of these methods, with discussion of key issues in applying these algorithms. We found that all five methods can identify distinct tumor compositions, and that the pan-cancer samples can be reclassified into several groups in different proportions. Interestingly, each method can classify the tumors into categories that differ from the original cancer types or subtypes, especially for ovarian serous cystadenocarcinoma (OV) and breast invasive carcinoma (BRCA) tumors. In addition, the clusters produced by all five computational methods show notable prognostic value. Furthermore, 9 recurrent differential genes and 15 common pathway characteristics were identified across all the methods. These results and discussion can help the community select appropriate integrative tools according to their research tasks or aims in pan-cancer analysis.
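None of the five tools' internals are reproduced here; as a much simpler stand-in that shares their common idea (derive a per-omics patient similarity, combine the similarities, cluster the combined matrix), the sketch below averages per-omics RBF affinities and applies spectral clustering, with all data hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_samples = 150
omics_layers = {                                   # hypothetical pan-cancer omics blocks
    "expression":  rng.normal(size=(n_samples, 500)),
    "methylation": rng.normal(size=(n_samples, 300)),
    "cnv":         rng.normal(size=(n_samples, 100)),
}

# One patient-by-patient affinity matrix per omics layer.
affinities = [rbf_kernel(StandardScaler().fit_transform(X)) for X in omics_layers.values()]

# Naive fusion: average the affinities. SNF, iCluster, CIMLR, PINS and LRACluster
# each replace this single step with a more principled integration strategy.
fused = np.mean(affinities, axis=0)

labels = SpectralClustering(n_clusters=4, affinity="precomputed",
                            random_state=0).fit_predict(fused)
print(np.bincount(labels))                         # cluster sizes of the fused partition
```

The point of the sketch is the shape of the pipeline rather than the fusion rule itself; it is precisely that fusion rule on which the five reviewed methods differ and on which the paper's comparison focuses.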


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 217 ◽  
Author(s):  
Wolfgang Huber ◽  
Vincent J. Carey ◽  
Sean Davis ◽  
Kasper Daniel Hansen ◽  
Martin Morgan

Bioconductor (bioconductor.org) is a rich source of software and know-how for the integrative analysis of genomic data. The Bioconductor channel in F1000Research provides a forum for task-oriented workflows that each cover a solution to a current, important problem in genome-scale data analysis from end to end, invoking resources from several packages by different authors, often combining multiple 'omics data types, and demonstrating integrative analysis and modelling techniques.


10.2196/13587 ◽  
2019 ◽  
Vol 21 (9) ◽  
pp. e13587 ◽  
Author(s):  
Xiao-Ling Jin ◽  
Miao Zhang ◽  
Zhongyun Zhou ◽  
Xiaoyu Yu

Background The rapid development of genetic and genomic technologies, such as next-generation sequencing and genome editing, has made disease treatment much more precise and effective. The technologies’ value can only be realized by the aggregation and analysis of people’s genomic and health data. However, the collection and sharing of genomic data has many obstacles, including low data quality, information islands, tampering distortions, missing records, leaking of private data, and gray data transactions. Objective This study aimed to prove that emerging blockchain technology provides a solution for the protection and management of sensitive personal genomic data because of its decentralization, traceability, encryption algorithms, and antitampering features. Methods This paper describes the case of a blockchain-based genomic big data platform, LifeCODE.ai, to illustrate the means by which blockchain enables the storage and management of genomic data from the perspectives of data ownership, data sharing, and data security. Results Blockchain opens up new avenues for dealing with data ownership, data sharing, and data security issues in genomic big data platforms and realizes the psychological empowerment of individuals in the platform. Conclusions The blockchain platform provides new possibilities for the management and security of genetic data and can help realize the psychological empowerment of individuals in the process, and consequently, the effects of data self-governance, incentive-sharing, and security improvement can be achieved. However, there are still some problems in the blockchain that have not been solved, and which require continuous in-depth research and innovation in the future.
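The LifeCODE.ai internals are not described in the abstract; purely as an illustration of the data-ownership idea (owner-recorded grants, checked and logged before any access), here is a hypothetical sketch in which every grant, revocation and access attempt is appended to an append-only audit trail, the kind of record a blockchain would make tamper-evident.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ConsentLedger:
    """Hypothetical owner-controlled consent record with an append-only audit trail."""
    grants: set = field(default_factory=set)       # (owner, requester, dataset) tuples
    audit_log: List[dict] = field(default_factory=list)

    def grant(self, owner: str, requester: str, dataset: str) -> None:
        self.grants.add((owner, requester, dataset))
        self.audit_log.append({"event": "grant", "owner": owner,
                               "requester": requester, "dataset": dataset})

    def revoke(self, owner: str, requester: str, dataset: str) -> None:
        self.grants.discard((owner, requester, dataset))
        self.audit_log.append({"event": "revoke", "owner": owner,
                               "requester": requester, "dataset": dataset})

    def request_access(self, owner: str, requester: str, dataset: str) -> bool:
        allowed = (owner, requester, dataset) in self.grants
        self.audit_log.append({"event": "access", "requester": requester,
                               "dataset": dataset, "allowed": allowed})
        return allowed

# Hypothetical usage: the data owner grants and later revokes access.
ledger = ConsentLedger()
ledger.grant("patient_42", "research_lab_X", "wgs_2021")
print(ledger.request_access("patient_42", "research_lab_X", "wgs_2021"))  # True
ledger.revoke("patient_42", "research_lab_X", "wgs_2021")
print(ledger.request_access("patient_42", "research_lab_X", "wgs_2021"))  # False
```

On an actual blockchain platform the grants and the audit log would be recorded on-chain (for example via smart contracts), which is what gives individuals the verifiable, self-governed control the abstract describes.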


2018 ◽  
Vol 3 (1) ◽  
pp. 22-32 ◽  
Author(s):  
Ernest Ezema ◽  
Azizol Abdullah ◽  
Nor Fazlida Binti Mohd

The concept of the Internet of Things (IoT) has evolved over time. The introduction of the Internet of Things and Services into the manufacturing environment has ushered in a fourth industrial revolution: Industry 4.0. There is no doubt that the world is undergoing constant transformations that change the trajectory and history of humanity; the first and second industrial revolutions and the information revolution illustrate this. IoT is an internet-based paradigm that comprises many interconnected technologies, such as RFID (Radio Frequency Identification) and WSAN (Wireless Sensor and Actor Networks), for exchanging information. The current needs for better control, monitoring and management in many areas, together with ongoing research in this field, have given rise to multiple systems such as the smart home, smart city and smart grid. IoT services can have either a centralized or a distributed architecture. In the centralized approach, central entities acquire, process and provide information, whereas in distributed architectures, entities at the edge of the network exchange information and collaborate with each other dynamically. To compare the two approaches, it is necessary to understand their advantages and disadvantages, especially in terms of security and privacy. This paper shows that the distributed approach faces various challenges that need to be solved, but also offers interesting properties and strengths. We present the main research challenges and existing solutions in the field of IoT security, identifying open issues in the context of the industrial revolution and suggesting some directions for future research.
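To make the architectural contrast concrete (this is only an illustrative sketch with hypothetical sensor values, not a description of any particular IoT stack), the snippet below shows the two information flows side by side: in the centralized variant every sensor reading goes to one collector, while in the distributed variant edge nodes exchange readings directly with their neighbours.

```python
from collections import defaultdict
from typing import Dict, List

readings = {"sensor_a": 21.5, "sensor_b": 19.8, "sensor_c": 22.1}  # hypothetical values

# Centralized: a single entity acquires, processes and serves all information.
def centralized_collect(sensor_readings: Dict[str, float]) -> Dict[str, float]:
    central_store = dict(sensor_readings)              # single point of aggregation (and failure)
    central_store["average"] = sum(sensor_readings.values()) / len(sensor_readings)
    return central_store

# Distributed: edge nodes exchange information with neighbours; there is no central store.
def distributed_exchange(sensor_readings: Dict[str, float],
                         neighbours: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    local_views: Dict[str, Dict[str, float]] = defaultdict(dict)
    for node, value in sensor_readings.items():
        local_views[node][node] = value                # each node knows its own reading
        for peer in neighbours[node]:                  # ...plus whatever its peers share
            local_views[node][peer] = sensor_readings[peer]
    return dict(local_views)

print(centralized_collect(readings))
print(distributed_exchange(readings, {"sensor_a": ["sensor_b"],
                                      "sensor_b": ["sensor_a", "sensor_c"],
                                      "sensor_c": ["sensor_b"]}))
```

The security trade-off discussed in the paper follows directly from these flows: the centralized store concentrates both trust and risk, while the distributed exchange spreads them across many edge nodes that must secure themselves.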


Diagnostics ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 551
Author(s):  
Chris Boyd ◽  
Greg Brown ◽  
Timothy Kleinig ◽  
Joseph Dawson ◽  
Mark D. McDonnell ◽  
...  

Research into machine learning (ML) for clinical vascular analysis, such as that useful for stroke and coronary artery disease, varies greatly between imaging modalities and vascular regions. Limited accessibility to large, diverse patient imaging datasets, as well as a lack of transparency in specific methods, are obstacles to further development. This paper reviews the current status of quantitative vascular ML, identifying advantages and disadvantages common to all imaging modalities. Literature from the past 8 years was systematically collected from MEDLINE® and Scopus database searches in January 2021. Papers satisfying all search criteria, including a minimum of 50 patients, were further analysed and relevant data were extracted, for a total of 47 publications. Current ML methods for image segmentation, disease risk prediction, and pathology quantitation have shown sensitivities and specificities over 70% when compared with expert manual analysis or invasive quantitation. Despite this, inconsistencies in methodology and the reporting of results have prevented inter-model comparison, impeding the identification of approaches with the greatest potential. The clinical potential of this technology has been well demonstrated in computed tomography of coronary artery disease, but remains practically limited in other modalities and body regions, particularly due to a lack of routine invasive reference measurements and patient datasets.

