Integrative Analysis of -Omics Data: A Method Comparison

Author(s):  
Oana A. Tomescu ◽  
Diethard Mattanovich ◽  
Gerhard G. Thallinger
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Erica Ponzi ◽  
Magne Thoresen ◽  
Therese Haugdahl Nøst ◽  
Kajsa Møllersen

Abstract

Background: Cancer genomic studies often include data collected from several omics platforms. Each omics data source contributes to the understanding of the underlying biological process via source specific (“individual”) patterns of variability. At the same time, statistical associations and potential interactions among the different data sources can reveal signals from common biological processes that might not be identified by single source analyses. These common patterns of variability are referred to as “shared” or “joint”. In this work, we show how the use of joint and individual components can lead to better predictive models, and to a deeper understanding of the biological process at hand. We identify joint and individual contributions of DNA methylation, miRNA and mRNA expression collected from blood samples in a lung cancer case–control study nested within the Norwegian Women and Cancer (NOWAC) cohort study, and we use such components to build prediction models for case–control and metastatic status. To assess the quality of predictions, we compare models based on simultaneous, integrative analysis of multi-source omics data to a standard non-integrative analysis of each single omics dataset, and to penalized regression models. Additionally, we apply the proposed approach to a breast cancer dataset from The Cancer Genome Atlas.

Results: Our results show how an integrative analysis that preserves both components of variation is more appropriate than standard multi-omics analyses that are not based on such a distinction. Both joint and individual components are shown to contribute to a better quality of model predictions, and facilitate the interpretation of the underlying biological processes in lung cancer development.

Conclusions: In the presence of multiple omics data sources, we recommend the use of data integration techniques that preserve the joint and individual components across the omics sources. We show how the inclusion of such components increases the quality of model predictions of clinical outcomes.
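
The abstract above does not name the decomposition algorithm, so the following is only a minimal sketch of the general idea of separating joint from individual variation. It assumes NumPy and a simplified, one-pass, JIVE-like procedure. The function name joint_individual_sketch, the rank arguments, and the simulated data are hypothetical and chosen for illustration; they are not the authors' implementation.

```python
import numpy as np


def joint_individual_sketch(blocks, joint_rank, indiv_ranks):
    """One-pass sketch of a joint/individual split (illustrative only).

    blocks: list of (n_samples, p_k) arrays whose rows are the same samples.
    Returns the joint sample scores plus, per block, a joint part and a
    low-rank individual part.
    """
    # Center every feature so the decomposition describes variation only.
    centered = [b - b.mean(axis=0) for b in blocks]
    # Scale each block to unit Frobenius norm so a large block cannot dominate.
    scaled = [c / np.linalg.norm(c) for c in centered]
    stacked = np.hstack(scaled)

    # Joint scores: leading left singular vectors of the stacked matrix.
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    joint_scores = u[:, :joint_rank]
    projector = joint_scores @ joint_scores.T

    joint, individual = [], []
    for c, r in zip(centered, indiv_ranks):
        j = projector @ c  # part of the block explained by joint scores
        resid = c - j      # what remains is orthogonal to the joint scores
        ur, sr, vtr = np.linalg.svd(resid, full_matrices=False)
        ind = (ur[:, :r] * sr[:r]) @ vtr[:r]  # rank-r individual structure
        joint.append(j)
        individual.append(ind)
    return joint_scores, joint, individual


# Simulated example (assumed data): two blocks driven by one shared signal.
rng = np.random.default_rng(0)
n = 40
shared = rng.normal(size=(n, 1))
x1 = shared @ rng.normal(size=(1, 30)) + 0.1 * rng.normal(size=(n, 30))
x2 = shared @ rng.normal(size=(1, 20)) + 0.1 * rng.normal(size=(n, 20))

scores, joint, individual = joint_individual_sketch(
    [x1, x2], joint_rank=1, indiv_ranks=[2, 2]
)
```

In a prediction setting such as the one described, the joint scores and the individual components could then be used as covariates in a regression model. The ranks are fixed by hand in this sketch, whereas a full method would estimate them from the data.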


BMC Genomics ◽  
2015 ◽  
Vol 16 (Suppl 9) ◽  
pp. S4 ◽  
Author(s):  
Min-Seok Kwon ◽  
Yongkang Kim ◽  
Seungyeoun Lee ◽  
Junghyun Namkung ◽  
Taegyun Yun ◽  
...  

2018 ◽  
Author(s):  
Xiaoyu Song ◽  
Jiayi Ji ◽  
Kevin J. Gleason ◽  
John A. Martignetti ◽  
Lin S. Chen ◽  
...  

In this work, we propose iProFun, an integrative analysis tool to screen for Proteogenomic Functional traits perturbed by DNA copy number alterations (CNAs) and DNA methylation. The goal is to characterize the functional consequences of DNA copy number and methylation alterations in tumors and to facilitate screening for cancer drivers contributing to tumor initiation and progression. Specifically, we consider three functional molecular quantitative traits: mRNA expression levels, global protein abundances, and phosphoprotein abundances. We aim to identify genes whose CNAs and/or DNA methylation have cis-associations with some or all of the three types of molecular traits. Compared with analyzing each molecular trait separately, joint modeling of multi-omics data has several benefits: iProFun had greater power for detecting significant cis-associations shared across different omics data types, and it also achieved better accuracy in inferring cis-associations unique to certain type(s) of molecular trait(s). For example, unique associations of CNAs/methylation with global/phospho protein abundances may imply post-translational regulation. We applied iProFun to ovarian high-grade serous carcinoma tumor data from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium, and identified CNAs and methylation of 500 and 122 genes, respectively, affecting the cis-functional molecular quantitative traits of the corresponding genes. We observed a substantial power gain from the joint analysis of iProFun. For example, iProFun identified 130 genes whose CNAs were associated with phosphoprotein abundances by leveraging mRNA expression levels and global protein abundances. By comparison, analyses based on phosphoprotein data alone identified none. A group of these 130 genes clustered in a small region on Chromosome 14q, harboring the known oncogene AKT1. In addition, iProFun identified one gene, CANX, whose DNA methylation has a cis-association with its global protein abundances but not its mRNA expression levels. These and other genes identified by iProFun could serve as potential drug targets for ovarian cancer.


2021 ◽  
Author(s):  
Tao Peng ◽  
Gregory M. Chen ◽  
Kai Tan

Abstract: Single-cell omics assays have become essential tools for identifying and characterizing cell types and states of complex tissues. While each single-modality assay reveals distinctive features of the sequenced cells, true multi-omics assays are still at an early stage of development. This underscores the importance of computationally integrating single-cell omics data generated from different samples and across different modalities. In addition, the advent of multiplexed molecular imaging assays has given rise to a need for computational methods for integrative analysis of single-cell imaging and omics data. Here, we present GLUER (inteGrative anaLysis of mUlti-omics at single-cEll Resolution), a flexible tool for integration of single-cell multi-omics data and imaging data. Using multiple true multi-omics data sets as the ground truth, we demonstrate that GLUER achieves a significant improvement over existing methods in the accuracy of matching cells across different data modalities, which in turn improves downstream analyses such as clustering and trajectory inference. We further demonstrate the broad utility of GLUER for integrating single-cell transcriptomics data with imaging-based spatial proteomics and transcriptomics data. Finally, we extend GLUER to leverage true cell-pair labels when available in true multi-omics data, and show that this approach improves co-embedding and clustering results. With the rapid accumulation of single-cell multi-omics and imaging data, integrated data holds the promise of furthering our understanding of the role of heterogeneity in development and disease.

