Synergistic Effects of Different Levels of Genomic Data for the Staging of Lung Adenocarcinoma: An Illustrative Study

Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1872
Author(s):  
Yingxia Li ◽  
Ulrich Mansmann ◽  
Shangming Du ◽  
Roman Hornung

Lung adenocarcinoma (LUAD) is a common and very lethal cancer. Accurate staging is a prerequisite for its effective diagnosis and treatment. Therefore, improving the accuracy of the stage prediction of LUAD patients is of great clinical relevance. Previous works have mainly focused on single genomic data information or a small number of different omics data types concurrently for generating predictive models. A few of them have considered multi-omics data from genome to proteome. We used a publicly available dataset to illustrate the potential of multi-omics data for stage prediction in LUAD. In particular, we investigated the roles of the specific omics data types in the prediction process. We used a self-developed method, Omics-MKL, for stage prediction that combines an existing feature ranking technique Minimum Redundancy and Maximum Relevance (mRMR), which avoids redundancy among the selected features, and multiple kernel learning (MKL), applying different kernels for different omics data types. Each of the considered omics data types individually provided useful prediction results. Moreover, using multi-omics data delivered notably better results than using single-omics data. Gene expression and methylation information seem to play vital roles in the staging of LUAD. The Omics-MKL method retained 70 features after the selection process. Of these, 21 (30%) were methylation features and 34 (48.57%) were gene expression features. Moreover, 18 (25.71%) of the selected features are known to be related to LUAD, and 29 (41.43%) to lung cancer in general. Using multi-omics data from genome to proteome for predicting the stage of LUAD seems promising because each omics data type may improve the accuracy of the predictions. Here, methylation and gene expression data may play particularly important roles.
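The kernel-combination idea behind MKL, as described in the abstract above, can be illustrated with a minimal sketch. This is not the Omics-MKL implementation: the function names, the toy data, and the fixed weights are all assumptions made for illustration, and a real MKL method would learn the weights from the data rather than fix them by hand.

```python
import math

def linear_kernel(X):
    """Gram matrix of dot products between all sample pairs."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def rbf_kernel(X, gamma=0.5):
    """Gram matrix of Gaussian (RBF) similarities between all sample pairs."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def combine_kernels(kernels, weights):
    """Weighted sum of per-omics Gram matrices.

    Assumption: weights are non-negative and fixed by hand here;
    an actual MKL solver would optimize them.
    """
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Toy data: 3 patients, two hypothetical omics blocks (values are made up).
expression = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
methylation = [[0.2], [0.25], [0.9]]

# A different kernel for each omics data type, then one combined kernel.
K_expr = linear_kernel(expression)
K_meth = rbf_kernel(methylation, gamma=2.0)
K = combine_kernels([K_expr, K_meth], weights=[0.6, 0.4])
```

The combined matrix `K` could then be passed to any kernel classifier that accepts a precomputed Gram matrix. In this toy data, patients 0 and 1 are similar in both blocks, so their combined similarity is higher than that of patients 0 and 2.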

Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 200 ◽  
Author(s):  
Mingxin Tao ◽  
Tianci Song ◽  
Wei Du ◽  
Siyu Han ◽  
Chunman Zuo ◽  
...  

Exploring the intrinsic differences among breast cancer subtypes is important, because these differences are closely related to clinical diagnosis and the design of treatment plans. With the accumulation of biological and medical datasets, many different omics data are available that can be viewed from different aspects. Combining these multiple omics data can improve the accuracy of prediction. Meanwhile, many different databases are available from which different types of omics data can be downloaded. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) to define breast cancer subtypes and classify any two breast cancer subtypes using the SMO-MKL algorithm. We collected mRNA data, methylation data and copy number variation (CNV) data from TCGA to classify breast cancer subtypes. Multiple Kernel Learning (MKL) is employed to use these omics data distinctly. The result of using three omics data with multiple kernels is better than that of using single omics data with multiple kernels. Furthermore, the significant genes and pathways discovered in the feature selection process are also analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and has abundant biological interpretations.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12130
Author(s):  
Beyhan Adanur Dedeturk ◽  
Ahmet Soran ◽  
Burcu Bakir-Gungor

The tremendous boost in next-generation sequencing technologies and in “omics” technologies has resulted in the generation of hundreds of gigabytes of data per day. Nowadays, by integrating -omics data with other data types, such as imaging and electronic health record (EHR) data, panomics studies attempt to identify novel and potentially actionable biomarkers for personalized medicine applications. In this respect, for the accurate analysis of -omics data and EHR, there is a need to establish secure and robust pipelines that take the ethical aspects into consideration and regulate privacy, ownership, and data sharing issues. Blockchain technology has recently attracted significant attention in diverse fields, including genomics, since it offers a new solution to these problems from a different perspective. Blockchain is an immutable transaction ledger that offers a secure and distributed system without a central authority. Within the system, each transaction can be expressed with cryptographically signed blocks, and the verification of transactions is performed by the users of the network. In this review, firstly, we aim to highlight the challenges of EHR and genomic data sharing. Secondly, we attempt to answer in detail why blockchain technology is or is not suitable for genomics and healthcare applications. Thirdly, we elucidate the general blockchain structure based on Ethereum, which is a more suitable technology for genomic data sharing platforms. Fourthly, we review current blockchain-based EHR and genomic data sharing platforms, evaluate the advantages and disadvantages of these applications, and classify these applications using different metrics. Finally, we conclude by discussing the open issues and introducing our suggestion on the topic.
In summary, to facilitate the diagnosis, monitoring and therapy of diseases with the effective analysis of -omics data with other available data types, through this review, we put forward the possible implications of the blockchain technology to life sciences and healthcare.
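The "immutable transaction ledger" property mentioned in the abstract above can be illustrated with a minimal hash-chain sketch. It shows only how linking each block to the hash of its predecessor makes tampering detectable. It does not implement digital signatures, consensus, or anything specific to Ethereum, and the record names and event labels are hypothetical.

```python
import hashlib
import json

def block_hash(index, prev_hash, payload):
    """SHA-256 digest over the block's contents, including the previous hash."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_block(chain, payload):
    """Append a block that commits to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "prev": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})

def is_valid(chain):
    """Recompute every hash and check each block points at its predecessor."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev"],
                                       block["payload"]):
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical access-log entries for a shared genomic record.
ledger = []
append_block(ledger, {"event": "grant_access", "record": "genome-001"})
append_block(ledger, {"event": "read", "record": "genome-001"})
```

Calling `is_valid(ledger)` returns `True` for the untouched chain. Editing the payload of an earlier block changes its recomputed hash, so validation fails unless every later block is rewritten as well.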


2014 ◽  
Vol 22 (1) ◽  
pp. 109-120 ◽  
Author(s):  
Dokyoon Kim ◽  
Je-Gun Joung ◽  
Kyung-Ah Sohn ◽  
Hyunjung Shin ◽  
Yu Rang Park ◽  
...  

Abstract Objective Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Methods Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Results Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Conclusions Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies.
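For readers unfamiliar with the AUC values reported above, the following sketch shows one standard way to compute the area under the ROC curve: as the fraction of positive/negative pairs in which the positive sample receives the higher score. The labels and scores are made-up toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = high grade, 0 = low grade (hypothetical labels and scores).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of roughly 0.79 to 0.84 should be read.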


Author(s):  
Zhuohui Wei ◽  
Yue Zhang ◽  
Wanlin Weng ◽  
Jiazhou Chen ◽  
Hongmin Cai

Abstract The significance of pan-cancer categories has recently been recognized as widespread in cancer research. Pan-cancer categorizes a cancer based on its molecular pathology rather than an organ. The molecular similarities among multi-omics data found in different cancer types can play several roles in both biological processes and therapeutic developments. Therefore, an integrated analysis of various genomic data is frequently used to reveal novel genetic and molecular mechanisms. However, although a variety of algorithms for multi-omics clustering have been proposed in different fields, how these computational clustering methods compare in pan-cancer analysis remains unclear. To increase the utilization of current integrative methods in pan-cancer analysis, we first provide an overview of five popular computational integrative tools: similarity network fusion, integrative clustering of multiple genomic data types (iCluster), cancer integration via multi-kernel learning (CIMLR), perturbation clustering for data integration and disease subtyping (PINS) and low-rank clustering (LRACluster). Then, a priori interactions in multi-omics data were incorporated to detect prominent molecular patterns in pan-cancer data sets. Finally, we present comparative assessments of these methods, with a discussion of key issues in applying these algorithms. We found that all five methods can identify distinct tumor compositions. The pan-cancer samples can be reclassified into several groups in different proportions. Interestingly, each method can classify the tumors into categories that are different from the original cancer types or subtypes, especially for ovarian serous cystadenocarcinoma (OV) and breast invasive carcinoma (BRCA) tumors. In addition, all clusters of the five computational methods show notable prognostic value. Furthermore, 9 recurrent differential genes and 15 common pathway characteristics were identified across all the methods.
The results and discussion can help the community select appropriate integrative tools according to different research tasks or aims in pan-cancer analysis.


2017 ◽  
Vol 15 (01) ◽  
pp. 1650037 ◽  
Author(s):  
Tianci Song ◽  
Yan Wang ◽  
Wei Du ◽  
Sha Cao ◽  
Yuan Tian ◽  
...  

Breast cancer histologic grade represents the morphological assessment of the tumor’s malignancy and aggressiveness, which is vital in clinically planning treatment and estimating prognosis for patients. Therefore, the prediction of breast cancer grade can markedly improve the detection of early breast cancer and efficiently guide its treatment. With the advent of high-throughput profiling technology, large amounts of data of different types are rapidly generated, and each data type provides its unique biological insight. Although many studies have focused on cancer grade prediction, hardly any of them attempted to integrate multiple data types, by which we can not only improve the results obtained from a learning method, but also gain a better understanding or explanation of biological issues. In this paper, we take advantage of a sophisticated supervised learning method called multiple kernel learning (MKL) to design a breast cancer grading predictor that fuses heterogeneous data for classification of breast cancer histopathology. Furthermore, we modify our model by incorporating biological pathway information. The new model can evaluate the significance of various pathways in which differentially expressed genes fall between different breast cancer grades. The merits of the novel model lie in bridging omics data and various phenotypes of breast cancer grades, and in providing an auxiliary method for integrating omics data in cancer mechanism research. In experiments, the proposed method outperforms other state-of-the-art methods and has abundant biological interpretation in explaining differences between breast cancer grades.


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 217 ◽  
Author(s):  
Wolfgang Huber ◽  
Vincent J. Carey ◽  
Sean Davis ◽  
Kasper Daniel Hansen ◽  
Martin Morgan

Bioconductor (bioconductor.org) is a rich source of software and know-how for the integrative analysis of genomic data. The Bioconductor channel in F1000Research provides a forum for task-oriented workflows that each cover a solution to a current, important problem in genome-scale data analysis from end to end, invoking resources from several packages by different authors, often combining multiple 'omics data types, and demonstrating integrative analysis and modelling techniques.


2021 ◽  
Author(s):  
Jake Crawford ◽  
Brock C Christensen ◽  
Maria Chikina ◽  
Casey S Greene

In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and which layer of cellular function is most suitable to capture the relevant signal may be unclear. In this study, we consider prediction of cancer mutation status (presence or absence) from functional -omics data as a representative problem. Since functional signatures of cancer mutation have been identified across many data types, this problem presents an opportunity to quantify and compare the ability of different -omics readouts to capture signals of dysregulation in cancer. The TCGA Pan-Cancer Atlas contains genetic alteration data including somatic mutations and copy number variants (CNVs), as well as several -omics data types. From TCGA, we focus on RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures as -omics readouts. Across a collection of cancer-associated genetic alterations, RNA sequencing and DNA methylation were the most effective predictors of alteration state. Surprisingly, we found that for most alterations, they were approximately equally effective predictors. The target gene was the primary driver of performance, rather than the data type, and there was little difference between the top data types for the majority of genes. We also found that combining data types into a single multi-omics model often provided little or no improvement in predictive ability over the best individual data type. Based on our results, for the design of studies focused on the functional outcomes of cancer mutations, we recommend focusing on gene expression or DNA methylation as first-line readouts.


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Xiaodong Yang ◽  
Yuexin Zheng ◽  
Zhihai Han ◽  
Xiliang Zhang

Abstract Background As a marker of differentiation, killer cell lectin-like receptor G1 (KLRG1) plays an inhibitory role in human NK cells and T cells. However, its clinical role remains unclear. This work aimed to investigate the ability of KLRG1 to predict the efficacy of immune-checkpoint inhibitors in the treatment of lung adenocarcinoma (LUAD), as well as to contribute to the understanding of the possible molecular mechanisms of KLRG1 in LUAD development. Methods Using data from the Gene Expression Omnibus, the Cancer Genome Atlas and the Genotype-Tissue Expression, we compared the expression of KLRG1 and its related genes Bruton tyrosine kinase (BTK), C-C motif chemokine receptor 2 (CCR2), and Scm polycomb group protein like 4 (SCML4) in LUAD and normal lung tissues. We also established stable LUAD cell lines with KLRG1 gene knockdown and investigated the effect of KLRG1 knockdown on tumor cell proliferation. We further studied the prognostic value of the four factors in terms of overall survival (OS) in LUAD. Using data from the Gene Expression Omnibus, we further investigated the expression of KLRG1 in patients with different responses after immunotherapy. Results The expression of KLRG1, BTK, CCR2 and SCML4 was significantly downregulated in LUAD tissues compared to normal controls. Knockdown of KLRG1 promoted the proliferation of A549 and H1299 tumor cells. Low expression of these four factors was associated with unfavorable overall survival in patients with LUAD. Furthermore, low expression of KLRG1 also correlated with poor responses to immunotherapy in LUAD patients. Conclusion Based on these findings, we inferred that KLRG1 had a significant correlation with immunotherapy response. Meanwhile, KLRG1, BTK, CCR2 and SCML4 might serve as valuable prognostic biomarkers in LUAD.


2021 ◽  
Vol 104 (1) ◽  
pp. 003685042199727
Author(s):  
Xinyu Wang ◽  
Jiaojiao Yang ◽  
Xueren Gao

Lung adenocarcinoma (LUAD) is the most common histological type of lung cancer, comprising around 40% of all lung cancers. Until now, the pathogenesis of LUAD has not been fully elucidated. In the current study, we comprehensively analyzed the dysregulated genes in lung adenocarcinoma by mining public datasets. Two gene expression datasets were obtained from the Gene Expression Omnibus (GEO) database. The dysregulated genes were identified by using the GEO2R online tool, and analyzed by R packages, Cytoscape software, STRING, and GEPIA online tools. A total of 275 common dysregulated genes were identified in two independent datasets, including 54 common up-regulated and 221 common down-regulated genes in LUAD. Gene Ontology (GO) enrichment analysis showed that these dysregulated genes were significantly enriched in 258 biological processes (BPs), 27 cellular components (CCs), and 21 molecular functions (MFs). Furthermore, protein-protein interaction (PPI) network analysis showed that PECAM1, ENG, KLF4, CDH5, and VWF were key genes. Survival analysis indicated that the low expression of ENG was associated with poor overall survival (OS) of LUAD patients. The low expression of PECAM1 was associated with poor OS and recurrence-free survival of LUAD patients. The Cox regression model developed based on age, tumor stage, ENG, and PECAM1 could effectively predict 5-year survival of LUAD patients. This study revealed some key genes, BPs, CCs, and MFs involved in LUAD, which would provide new insights into understanding the pathogenesis of LUAD. In addition, ENG and PECAM1 might serve as promising prognostic markers in LUAD.

