Integration of Multi-Omics Data to Identify Cancer Biomarkers

2022 ◽  
Vol 15 (1) ◽  
pp. 1-15
Author(s):  
Peng Li ◽  
Bo Sun

A novel method for integrating multi-omics data, including gene expression, copy number variation, DNA methylation, and miRNA data, is proposed to identify biomarkers of cancer prognosis. First, survival analysis was performed on each of these four types of omics data to obtain survival-related genes. Next, survival-related genes detected in at least two types of omics data were selected as candidate genes. The four types of omics data, restricted to the candidate genes, were then subjected to dimension reduction using an autoencoder to obtain a one-dimensional data representation. The minimum redundancy maximum relevance (mRMR) algorithm was used to screen for key genes. This method was applied to lung squamous cell carcinoma, and 20 cancer-related genes were identified. Gene function analysis revealed that the genes were related to cancer. Survival analysis verified that the genes can distinguish between high- and low-risk groups. These results indicate that the genes can be used as biomarkers for cancer.
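The candidate-selection step described above (keep genes that are survival-related in at least two omics types) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `candidate_genes`, the layer labels, and the placeholder gene names are all assumptions, and the per-layer survival analysis is assumed to have been run already.

```python
from collections import Counter


def candidate_genes(survival_hits, min_omics=2):
    """Return genes flagged as survival-related in at least `min_omics` layers.

    `survival_hits` maps an omics layer name to the collection of genes that
    a prior survival analysis flagged for that layer (hypothetical input).
    """
    counts = Counter()
    for genes in survival_hits.values():
        counts.update(set(genes))  # a gene counts at most once per layer
    return sorted(gene for gene, n in counts.items() if n >= min_omics)


# Hypothetical per-layer results; gene names are placeholders, not from the paper.
survival_hits = {
    "expression": {"GENE_A", "GENE_B", "GENE_C"},
    "cnv": {"GENE_B", "GENE_D"},
    "methylation": {"GENE_A", "GENE_B"},
    "mirna": {"GENE_E"},
}

candidates = candidate_genes(survival_hits)
print(candidates)
```

Raising `min_omics` tightens the filter; the abstract states a threshold of two omics types, which is the default used here.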

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Muta Tah Hira ◽  
M. A. Razzaque ◽  
Claudio Angione ◽  
James Scrivens ◽  
Saladin Sawan ◽  
...  

Abstract Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution for balancing high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we performed an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics, and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE- and VAE-based compressed features can classify the transcriptional subtypes of the TCGA datasets with accuracies in the ranges of 93.2-95.5% and 87.1-95.7%, respectively. Also, survival analysis results show that VAE- and MMD-VAE-based compressed representations of omics data can be used in cancer prognosis. Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE on most omics datasets.
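To make the "Maximum Mean Discrepancy" term concrete, the sketch below computes a biased estimate of the squared MMD between two samples, for example latent codes and draws from a prior. It is an illustration under stated assumptions only: the abstract does not specify the kernel, so the Gaussian (RBF) kernel, the bandwidth, and all function names here are hypothetical, and the data are synthetic.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


rng = random.Random(0)  # fixed seed so the synthetic example is reproducible


def gaussian_sample(n, dim, mean):
    """Draw n synthetic points from an isotropic unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


prior = gaussian_sample(100, 2, 0.0)       # stand-in for draws from a prior
codes_near = gaussian_sample(100, 2, 0.0)  # synthetic codes matching the prior
codes_far = gaussian_sample(100, 2, 3.0)   # synthetic codes shifted away from it

mmd_near = mmd_squared(codes_near, prior)
mmd_far = mmd_squared(codes_far, prior)
print(mmd_near < mmd_far)
```

A sample that matches the reference distribution yields a value near zero, while a shifted sample yields a clearly larger value, which is what makes the quantity usable as a penalty on the latent distribution.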


2020 ◽  
Author(s):  
Changchun Niu ◽  
Di Wu ◽  
Alexander J. Li ◽  
Kevin H. Qin ◽  
Daniel A. Hu ◽  
...  

Abstract Purpose Acute myeloid leukemia (AML) is caused by multiple genetic alterations in the hematopoietic progenitors, and molecular genetic analysis has provided useful information for AML diagnosis and prognosis. However, an integrative understanding of the prognostic value of specific copy number variation (CNV) and CNV-modulated gene expression has been limited. Methods We conducted an integrative analysis of CNV profiling and gene expression using data from the TARGET and TCGA AML cohorts. The CNV data from TCGA were analyzed using GISTIC. CNV survival analysis and mRNA survival analysis were conducted with the multivariate Cox proportional hazards regression model using R software with the “survminer” and “survival” packages. KEGG cancer panel genes were extracted from the cancer-related pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG). The R package “circlize” was used for mapping the CNV genes to chromosomes. Results From this investigation, we observed distinct CNV patterns in the AML risk groups as well as the expression of 251 genes significantly modulated by CNV in both cohorts. There were 102 CNV genes (located at 7q31-34, 16q24) associated with clinical outcomes in AML, which were identified in the TARGET cohort and validated in the TCGA cohort; three of these were miRNA genes (MIR29A, MIR183, MIR335) that overlapped with a KEGG cancer panel. Five genes were identified whose expression was modulated by CNV and significantly associated with clinical outcomes, and among them, deletions of SEMA4D and CBFB were found to potentially have protective effects against AML. Moreover, the distribution of CNV in these five CNV-modulated genes was independent of the risk groups, which suggests that they are independent prognostic factors. Conclusion Overall, this study identified 102 CNV genes and five genes with CNV-modulated expression that are crucial for developing new modes of prognosis evaluation and targeted therapy for AML.


2019 ◽  
Vol 17 (06) ◽  
pp. 1950038
Author(s):  
Peng Li ◽  
Maozu Guo ◽  
Bo Sun

The identification of cancer-related genes is a major research goal, with implications for determining the pathogenesis of cancer and identifying biomarkers for early diagnosis and treatment. In this study, by integrating multi-omics data, including gene expression, DNA copy number variation, DNA methylation, transcription factor, miRNA, and lncRNA data, we propose a method for mining cancer-related genes based on network models. First, multi-omics data are integrated using a random forest-based feature selection method to identify key regulatory factors that affect gene expression, and genome-wide regulatory networks are then constructed. Next, by comparing the regulatory networks of key candidate genes in variant samples and non-variant samples, a differential expression regulatory network is generated. The differential network contains the abnormally regulated genes associated with each key candidate gene. Then, by introducing functional similarity as a distance metric for gene sets, a density-based clustering method is used to mine gene modules related to cancer. We applied this method to LUSC (lung squamous cell carcinoma) and mined cancer-related gene modules composed of 20 genes. GO function and KEGG pathway analyses indicated that the modules were closely related to cancer. A survival analysis verified that the mined gene modules can effectively distinguish between high- and low-risk groups. Overall, these results suggest that the proposed method can be used to identify cancer-related gene modules, providing a basis for the development of biomarkers for diagnosis and treatment.


2021 ◽  
Author(s):  
Changchun Niu ◽  
Di Wu ◽  
Alexander J. Li ◽  
Kevin H. Qin ◽  
Daniel A. Hu ◽  
...  

Abstract Background Acute myeloid leukemia (AML) is caused by multiple genetic alterations in the hematopoietic progenitors, and molecular genetic analysis has provided useful information for AML diagnosis and prognosis. However, an integrative understanding of the prognostic value of specific copy number variation (CNV) and CNV-modulated gene expression has been limited. Methods We conducted an integrative analysis of CNV profiling and gene expression using data from the TARGET and TCGA AML cohorts. The CNV data from TCGA were analyzed using GISTIC, and gene-level CNV data were obtained for every patient. CNV survival analysis and mRNA survival analysis were conducted with the multivariate Cox proportional hazards regression model using R software with the “survminer” and “survival” packages. KEGG cancer panel genes were extracted from the cancer-related pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG). The R package “circlize” was used for mapping the CNV genes to chromosomes. Results From this investigation, we observed distinct CNV patterns in the AML risk groups as well as the expression of 251 genes significantly modulated by CNV in both cohorts. There were 102 CNV genes (located at 7q31-34, 16q24) associated with clinical outcomes in AML, which were identified in the TARGET cohort and validated in the TCGA cohort; three of these were miRNA genes (MIR29A, MIR183, MIR335) that overlapped with a KEGG cancer panel. Five genes were identified whose expression was modulated by CNV and significantly associated with clinical outcomes, and among them, deletions of SEMA4D and CBFB were found to potentially have protective effects against AML. The result was also validated with patient marrow samples. Moreover, the distribution of CNV in these five CNV-modulated genes was independent of the risk groups, which suggests that they are independent prognostic factors. Conclusion Overall, this study identified 102 CNV genes and five genes with CNV-modulated expression that are crucial for developing new modes of prognosis evaluation and targeted therapy for AML.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Qi-Fan Yang ◽  
Di Wu ◽  
Jian Wang ◽  
Li Ba ◽  
Chen Tian ◽  
...  

Abstract Lung squamous cell carcinoma (LUSC) has a poor prognosis, even for resected stage I–III patients. Reliable prognostic biomarkers that can stratify and predict clinical outcomes for stage I–III resected LUSC patients are urgently needed. Based on gene expression of LUSC tissue samples from five public datasets, consisting of 687 cases, we developed an immune-related prognostic model (IPM) using immune genes from the ImmPort database. Then, we comprehensively analyzed the immune microenvironment and mutation burden that are significantly associated with this model. According to the IPM, patients were stratified into high- and low-risk groups with markedly distinct survival outcomes. We found that patients with high immune risk had a higher proportion of immunosuppressive cells such as M0 macrophages, and higher expression of CD47, CD73, SIRPA, and TIM-3. Moreover, when further stratified based on tumor mutation burden (TMB) and risk score, patients with high TMB and low immune risk had remarkably prolonged overall survival compared with patients with low TMB and high immune risk. Finally, a nomogram combining the IPM with clinical factors was established to provide a more precise evaluation of prognosis. The proposed immune-related model is a promising biomarker for predicting overall survival in stage I–III LUSC. Thus, it may shed light on identifying patient subsets at high risk of adverse prognosis from an immunological perspective.


Cancers ◽  
2021 ◽  
Vol 13 (13) ◽  
pp. 3106
Author(s):  
Yogesh Kalakoti ◽  
Shashank Yadav ◽  
Durai Sundar

The utility of multi-omics in personalized therapy and cancer survival analysis has been debated and demonstrated extensively in the recent past. Most current methods still suffer from data constraints such as high dimensionality, unexplained interdependence, and subpar integration methods. Here, we propose SurvCNN, an alternative approach that processes multi-omics data with robust computer vision architectures to predict cancer prognosis for lung adenocarcinoma patients. Numerical multi-omics data were transformed into image representations and fed into a convolutional neural network with a discrete-time model to predict survival probabilities. The framework also dichotomized patients into risk subgroups based on their survival probabilities over time. SurvCNN was evaluated on multiple performance metrics and outperformed existing methods with a high degree of confidence. Moreover, the relative performance of various combinations of omics datasets was examined comprehensively. Critical biological processes, pathways, and cell types identified from downstream processing of differentially expressed genes suggested that the framework could elucidate elements detrimental to a patient’s survival. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
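The step from a discrete-time model's outputs to survival probabilities and risk subgroups can be sketched as below. This is a generic illustration, not SurvCNN's code: it assumes the model emits one conditional hazard per time interval, and the function names, the 0.5 threshold, and the patient values are hypothetical, since the abstract does not state the dichotomization rule.

```python
def survival_curve(hazards):
    """Turn per-interval conditional hazards into cumulative survival probabilities.

    S(t_k) is the product over j <= k of (1 - h_j), the standard
    discrete-time survival identity.
    """
    curve, alive = [], 1.0
    for h in hazards:
        alive *= 1.0 - h
        curve.append(alive)
    return curve


def risk_group(hazards, interval, threshold=0.5):
    """Label a patient by survival probability at one interval (threshold is an assumption)."""
    return "low" if survival_curve(hazards)[interval] >= threshold else "high"


# Hypothetical predicted hazards for two patients over three intervals.
patient_hazards = {
    "P1": [0.05, 0.05, 0.10],
    "P2": [0.30, 0.40, 0.50],
}

groups = {pid: risk_group(h, interval=2) for pid, h in patient_hazards.items()}
print(groups)
```

Because each survival probability is a running product of values at most one, the curve never increases, so a patient's label depends only on which interval and threshold are chosen.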


2015 ◽  
Vol 79 (3) ◽  
pp. 453-459 ◽  
Author(s):  
Zachary A. Vesoulis ◽  
Steve M. Liao ◽  
Shamik B. Trivedi ◽  
Nathalie El Ters ◽  
Amit M. Mathur

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Zeng-Hong Wu ◽  
Yun Tang ◽  
Hong Yu ◽  
Hua-Dong Li

Abstract Breast cancer (BC) affects the breast tissue and is the second most common cause of mortality among women. Ferroptosis is an iron-dependent cell death mode that is characterized by intracellular accumulation of reactive oxygen species (ROS). We constructed a prognostic multigene signature based on ferroptosis-associated differentially expressed genes (DEGs). Moreover, we comprehensively analyzed the role of ferroptosis-associated miRNAs, lncRNAs, and immune responses. A total of 259 ferroptosis-related genes were extracted. KEGG function analysis of these genes revealed that they were mainly enriched in the HIF-1 signaling pathway, NOD-like receptor signaling pathway, central carbon metabolism in cancer, and PPAR signaling pathway. Fifteen differentially expressed genes (ALOX15, ALOX15B, ANO6, BRD4, CISD1, DRD5, FLT3, G6PD, IFNG, NGB, NOS2, PROM2, SLC1A4, SLC38A1, and TP63) were selected as independent prognostic factors for BC patients. Moreover, T cell functions, including the CCR score, immune checkpoint, cytolytic activity, HLA, inflammation promotion, para-inflammation, T cell co-stimulation, T cell co-inhibition, and type II IFN responses, were significantly different between the low-risk and high-risk groups of the TCGA cohort. Comparison of immune checkpoints between the two groups revealed that the expression of PDCD-1 (PD-1), CTLA4, LAG3, TNFSF4/14, TNFRSF4/8/9/14/18/25, and IDO1/2, among others, was significantly different. A total of 1185 ferroptosis-related lncRNAs and 219 ferroptosis-related miRNAs were also included in this study. From the online database, we identified novel ferroptosis-related biomarkers for breast cancer prognosis. The findings of this study provide new insights into the development of new reliable and accurate cancer treatment options.

