MORONET: Multi-omics Integration via Graph Convolutional Networks for Biomedical Data Classification

2020 ◽  
Author(s):  
Tongxin Wang ◽  
Wei Shao ◽  
Zhi Huang ◽  
Haixu Tang ◽  
Jie Zhang ◽  
...  

ABSTRACT: To fully utilize the advances in omics technologies and achieve a more comprehensive understanding of human diseases, novel computational methods are required for integrative analysis of multiple types of omics data. We present a novel multi-omics integrative method named Multi-Omics gRaph cOnvolutional NETworks (MORONET) for biomedical classification. MORONET jointly explores omics-specific learning and cross-omics correlation learning for effective multi-omics data classification. We demonstrate that MORONET outperforms other state-of-the-art supervised multi-omics integrative analysis approaches across a wide range of biomedical classification applications using mRNA expression data, DNA methylation data, and miRNA expression data. Furthermore, MORONET is able to identify important biomarkers from different omics data types that are related to the investigated diseases.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Tongxin Wang ◽  
Wei Shao ◽  
Zhi Huang ◽  
Haixu Tang ◽  
Jie Zhang ◽  
...  

Abstract: To fully utilize the advances in omics technologies and achieve a more comprehensive understanding of human diseases, novel computational methods are required for integrative analysis of multiple types of omics data. Here, we present a novel multi-omics integrative method named Multi-Omics Graph cOnvolutional NETworks (MOGONET) for biomedical classification. MOGONET jointly explores omics-specific learning and cross-omics correlation learning for effective multi-omics data classification. We demonstrate that MOGONET outperforms other state-of-the-art supervised multi-omics integrative analysis approaches from different biomedical classification applications using mRNA expression data, DNA methylation data, and microRNA expression data. Furthermore, MOGONET can identify important biomarkers from different omics data types related to the investigated biomedical problems.
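To make the "graph convolutional" part of the abstract concrete, here is a minimal sketch of one generic graph convolution step, ReLU(Â X W) with Â = D^{-1/2}(A + I)D^{-1/2}, applied to a toy sample-similarity graph. It is not MOGONET's actual architecture, which the abstract does not describe; the function names, the toy adjacency matrix, the features, and the weights are all hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    # Add self-loops so every sample keeps a share of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    a_norm = normalized_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Toy data (assumed for illustration): 3 samples, 2 features each.
# Samples 0 and 1 are linked as "similar"; sample 2 has no neighbours.
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [2.0, 2.0]]
weights = [[1.0],
           [1.0]]  # hypothetical fixed weights; a real model would learn these

out = gcn_layer(adj, features, weights)
```

In this toy graph the two linked samples end up with the same output because the layer averages their features, while the isolated sample only sees its own.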


Author(s):  
Takoua Jendoubi

Metabolomics deals with multiple and complex chemical reactions within living organisms and how these are influenced by external or internal perturbations. It lies at the heart of omics profiling technologies not only as the underlying biochemical layer that reflects information expressed by the genome, the transcriptome and the proteome, but also as the closest layer to the phenome. The combination of metabolomics data with the information available from genomics, transcriptomics, and proteomics offers unprecedented possibilities to enhance current understanding of biological functions, elucidate their underlying mechanisms and uncover hidden associations between omics variables. As a result, a vast array of computational tools have been developed to assist with integrative analysis of metabolomics data with different omics. Here, we review and propose five criteria – hypothesis, data types, strategies, study design and study focus – to classify statistical multi-omics data integration approaches into state-of-the-art classes under which all existing statistical methods fall. The purpose of this review is to look at various aspects that guide the choice of the statistical integrative analysis pipeline in terms of the different classes. We will draw particular attention to metabolomics and genomics data to assist those new to this field in the choice of the integrative analysis pipeline.


2021 ◽  
Vol 12 ◽  
Author(s):  
Gabriel J. Odom ◽  
Antonio Colaprico ◽  
Tiago C. Silva ◽  
X. Steven Chen ◽  
Lily Wang

Recent advances in technology have made multi-omics datasets increasingly available to researchers. To leverage the wealth of information in multi-omics data, a number of integrative analysis strategies have been proposed recently. However, effectively extracting biological insights from these large, complex datasets remains challenging. In particular, matched samples with multiple types of omics data measured on each sample are often required for multi-omics analysis tools, which can significantly reduce the sample size. Another challenge is that analysis techniques such as dimension reduction, which extracts association signals in high-dimensional datasets by estimating a few variables that explain most of the variation in the samples, are typically applied to whole-genome data, which can be computationally demanding. Here we present pathwayMultiomics, a pathway-based approach for integrative analysis of multi-omics data with categorical, continuous, or survival outcome variables. The input of pathwayMultiomics is pathway p-values for individual omics data types, which are then integrated using a novel statistic, the MiniMax statistic, to prioritize pathways dysregulated in multiple types of omics datasets. Importantly, pathwayMultiomics is computationally efficient and does not require matched samples in multi-omics data. We performed a comprehensive simulation study to show that pathwayMultiomics significantly outperformed currently available multi-omics tools with improved power and well-controlled false-positive rates. We also analyzed real multi-omics datasets to show that pathwayMultiomics was able to recover known biology by nominating biologically meaningful pathways in complex diseases such as Alzheimer’s disease.
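The abstract does not define the MiniMax statistic, so the sketch below rests on an assumption: that it is the smallest of the pairwise maxima of a pathway's per-omics p-values, which equals the second-smallest p-value. Under the further assumption that the p-values are independent and uniform under the null, the second-smallest of k values follows a Beta(2, k-1) distribution, whose CDF has the closed form used here. The function names and the example p-values are hypothetical and are not taken from the pathwayMultiomics package.

```python
def minimax_statistic(p_values):
    """Smallest of the pairwise maxima, i.e. the second-smallest p-value.

    Assumed definition; the abstract does not spell it out.
    """
    if len(p_values) < 2:
        raise ValueError("need p-values from at least two omics types")
    return sorted(p_values)[1]


def minimax_p_value(p_values):
    """P(second-smallest of k independent Uniform(0, 1) draws <= observed).

    This is the Beta(2, k - 1) CDF written in closed form:
    1 - (1 - x)^k - k * x * (1 - x)^(k - 1).
    """
    k = len(p_values)
    x = minimax_statistic(p_values)
    return 1.0 - (1.0 - x) ** k - k * x * (1.0 - x) ** (k - 1)


# Made-up pathway p-values from three omics types, for illustration only.
pathway_p = {"snp": 0.01, "methylation": 0.03, "expression": 0.5}

stat = minimax_statistic(list(pathway_p.values()))
combined_p = minimax_p_value(list(pathway_p.values()))
```

Because the statistic takes one p-value per omics type for each pathway, a combination of this kind needs no sample-level matching across datasets, which is consistent with the abstract's claim that matched samples are not required.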


Metabolites ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 184
Author(s):  
Takoua Jendoubi

Metabolomics deals with multiple and complex chemical reactions within living organisms and how these are influenced by external or internal perturbations. It lies at the heart of omics profiling technologies not only as the underlying biochemical layer that reflects information expressed by the genome, the transcriptome and the proteome, but also as the closest layer to the phenome. The combination of metabolomics data with the information available from genomics, transcriptomics, and proteomics offers unprecedented possibilities to enhance current understanding of biological functions, elucidate their underlying mechanisms and uncover hidden associations between omics variables. As a result, a vast array of computational tools have been developed to assist with integrative analysis of metabolomics data with different omics. Here, we review and propose five criteria (hypothesis, data types, strategies, study design and study focus) to classify statistical multi-omics data integration approaches into state-of-the-art classes under which all existing statistical methods fall. The purpose of this review is to look at various aspects that guide the choice of the statistical integrative analysis pipeline in terms of the different classes. We will draw particular attention to metabolomics and genomics data to assist those new to this field in the choice of the integrative analysis pipeline.


Genes ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 87 ◽  
Author(s):  
Bilal Mirza ◽  
Wei Wang ◽  
Jie Wang ◽  
Howard Choi ◽  
Neo Christopher Chung ◽  
...  

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.


F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 136
Author(s):  
Rutger A. Vos ◽  
Toshiaki Katayama ◽  
Hiroyuki Mishima ◽  
Shin Kawano ◽  
Shuichi Kawashima ◽  
...  

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 217 ◽  
Author(s):  
Wolfgang Huber ◽  
Vincent J. Carey ◽  
Sean Davis ◽  
Kasper Daniel Hansen ◽  
Martin Morgan

Bioconductor (bioconductor.org) is a rich source of software and know-how for the integrative analysis of genomic data. The Bioconductor channel in F1000Research provides a forum for task-oriented workflows that each cover a solution to a current, important problem in genome-scale data analysis from end to end, invoking resources from several packages by different authors, often combining multiple 'omics data types, and demonstrating integrative analysis and modelling techniques.


Author(s):  
Gary Sutlieff ◽  
Lucy Berthoud ◽  
Mark Stinchcombe

Abstract: CBRN (Chemical, Biological, Radiological, and Nuclear) threats are becoming more prevalent, as more entities gain access to modern weapons and industrial technologies and chemicals. This has produced a need for improvements to modelling, detection, and monitoring of these events. While there are currently no dedicated satellites for CBRN purposes, there is a wide range of possibilities for satellite data to contribute to this field, from atmospheric composition and chemical detection to cloud cover, land mapping, and surface property measurements. This study looks at currently available satellite data, including meteorological data such as wind and cloud profiles, surface properties like temperature and humidity, chemical detection, and sounding. Results of this survey revealed several gaps in the available data, particularly concerning biological and radiological detection. The results also suggest that publicly available satellite data largely does not meet the requirements of spatial resolution, coverage, and latency that CBRN detection requires, outside of providing terrain use and building height data for constructing models. Lastly, the study evaluates upcoming instruments, platforms, and satellite technologies to gauge the impact these developments will have in the near future. Improvements in spatial and temporal resolution as well as latency are already becoming possible, and new instruments will fill in the gaps in detection by imaging a wider range of chemicals and other agents and by collecting new data types. This study shows that with developments coming within the next decade, satellites should begin to provide valuable augmentations to CBRN event detection and monitoring.

Article Highlights:
There is a wide range of existing satellite data in fields that are of interest to CBRN detection and monitoring.
The data is mostly of insufficient quality (resolution or latency) for the demanding requirements of CBRN modelling for incident control.
Future technologies and platforms will improve resolution and latency, making satellite data more viable in the CBRN management field.

