Signatures of Discriminative Copy Number Aberrations in 31 Cancer Subtypes

2021 ◽  
Vol 12 ◽  
Author(s):  
Bo Gao ◽  
Michael Baudis

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome-sequencing-based methods. While this data has been instrumental in the identification of cancer-related genes and has promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles poses great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have focused on the association of CNA with pre-selected “driver” genes, with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a collection of 44,988 high-quality CNA profiles of high diversity. Using a hybrid model of neural networks and an attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only among physiologically related cancer types but also among clinico-pathologically distant cancer types, such as different cancers originating from the neural crest. Further experiments with classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.
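As a rough illustration of the attention idea mentioned in this abstract, the sketch below turns per-bin relevance scores into softmax weights and reports the highest-weighted genomic bins as a toy "signature". It is not the authors' model: the function names, the `top_k` parameter, and the input values are all hypothetical, and in a real network the scores would be learned rather than supplied by hand.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_signature(profile, scores, top_k=2):
    """Weight a per-bin CNA profile by attention and rank the bins.

    profile: per-bin copy number values (hypothetical encoding, e.g. gain > 0, loss < 0)
    scores:  per-bin relevance scores (assumed here; learned in a real model)
    """
    weights = softmax(scores)
    weighted = [w * x for w, x in zip(weights, profile)]
    # Indices of the bins that receive the most attention, highest first.
    ranked = sorted(range(len(profile)), key=lambda i: weights[i], reverse=True)
    return weights, weighted, ranked[:top_k]


weights, weighted, top_bins = attention_signature(
    profile=[1.0, -1.0, 0.0], scores=[2.0, 0.0, 1.0]
)
```

The returned bin indices only show how attention weights can be read as a ranking of features; they say nothing about which regions the study actually found discriminative.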

2020 ◽  
Author(s):  
Bo Gao ◽  
Michael Baudis

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome-sequencing-based methods. While this data has been instrumental in the identification of cancer-related genes and has promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles poses great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have focused on the association of CNA with pre-selected “driver” genes, with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a collection of 44,988 high-quality CNA profiles of high diversity. Using a hybrid model of neural networks and an attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only among physiologically related cancer types but also among clinico-pathologically distant cancer types, such as different cancers originating from the neural crest. Further experiments with classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.


2016 ◽  
Vol 14 (06) ◽  
pp. 1650031 ◽  
Author(s):  
Ana B. Pavel ◽  
Cristian I. Vasile

Cancer is a complex and heterogeneous genetic disease. Different mutations and dysregulated molecular mechanisms alter the pathways that lead to cell proliferation. In this paper, we explore a method which classifies genes into oncogenes (ONGs) and tumor suppressors. We optimize this method to identify specific ONGs and tumor suppressors for breast cancer, lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC) and colon adenocarcinoma (COAD), using data from The Cancer Genome Atlas (TCGA). A set of genes was previously classified as ONGs and tumor suppressors across multiple cancer types (Science 2013). Each gene was assigned an ONG score and a tumor suppressor score based on the frequency of its driver mutations across all variants from the Catalogue of Somatic Mutations in Cancer (COSMIC). We evaluate and optimize this approach within different cancer types from TCGA. We are able to determine known driver genes for each of the four cancer types. After establishing the baseline parameters for each cancer type, we identify new driver genes for each cancer type, and the molecular pathways that are highly affected by them. Our methodology is general and can be applied to different cancer subtypes to identify specific driver genes and improve personalized therapy.
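The frequency-based scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the ONG score is the fraction of a gene's mutations that are recurrent missense changes and that the tumor suppressor score is the fraction that are truncating. The mutation encoding, the function names, and the 0.2 cutoffs are hypothetical placeholders for the per-cancer-type parameters the abstract says are optimized.

```python
from collections import Counter


def gene_scores(mutations):
    """Return (ong_score, tsg_score) for one gene.

    mutations: list of (protein_position, kind) tuples, where kind is
    "missense" or "truncating" (an assumed, simplified encoding).
    """
    total = len(mutations)
    if total == 0:
        return 0.0, 0.0
    # Recurrent missense: missense mutations at a position hit more than once.
    missense_positions = Counter(pos for pos, kind in mutations if kind == "missense")
    recurrent = sum(n for n in missense_positions.values() if n > 1)
    truncating = sum(1 for _, kind in mutations if kind == "truncating")
    return recurrent / total, truncating / total


def classify(mutations, ong_cutoff=0.2, tsg_cutoff=0.2):
    """Label a gene from its scores; the cutoffs are illustrative assumptions."""
    ong, tsg = gene_scores(mutations)
    if ong > ong_cutoff and ong >= tsg:
        return "ONG"
    if tsg > tsg_cutoff:
        return "TSG"
    return "unclassified"


# Toy inputs, not real COSMIC or TCGA data.
hotspot_gene = [(12, "missense")] * 3 + [(50, "truncating")]
inactivated_gene = [(10, "truncating"), (20, "truncating"), (30, "missense")]
```

Exposing the cutoffs as parameters mirrors the abstract's point that baseline parameters are established separately for each cancer type.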


2022 ◽  
Author(s):  
Malvika Sudhakar ◽  
Raghunathan Rengaswamy ◽  
Karthik Raman

The progression of tumorigenesis starts with a few mutational and structural driver events in the cell. Various cohort-based computational tools exist to identify driver genes but require a large number of samples to produce reliable results. Many studies use different methods to distinguish driver mutations/genes from mutations that have no impact on tumour progression; however, a small fraction of patients show no mutational events in any known driver genes. Current unsupervised methods map somatic and expression data onto a network to identify the perturbation in the network. Our method is the first machine learning model to classify genes as a tumour suppressor gene (TSG), an oncogene (OG), or neutral, thus assigning the functional impact of the gene in the patient. In this study, we develop a multi-omic approach, PIVOT (Personalised Identification of driVer OGs and TSGs), to train on experimentally or computationally validated mutational and structural driver events. Given the lack of any gold standards for the identification of personalised driver genes, we label the data using four strategies and, based on classification metrics, show that gene-based labelling strategies perform best. We build different models using SNV, RNA, and multi-omic features to be used based on the data available. Our models trained on multi-omic data improved predictions compared to mutation and expression data, achieving an accuracy >0.99 for BRCA, LUAD and COAD datasets. We show that network and expression-based features contribute the most to PIVOT. Our predictions on BRCA, COAD and LUAD cancer types reveal commonly altered genes such as TP53 and PIK3CA, which are predicted drivers for multiple cancer types. Along with known driver genes, our models also identify new driver genes such as PRKCA, SOX9 and PSMD4. Our multi-omic model labels both CNV and mutations, with a larger contribution from CNV alterations. While predicting labels for genes mutated in multiple samples, we also label rare driver events occurring in as few as one sample. We also identify genes with dual roles within the same cancer type. Overall, PIVOT labels personalised driver genes as TSGs and OGs and also identifies rare driver genes. PIVOT is available at https://github.com/RamanLab/PIVOT.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 3072-3072
Author(s):  
Habte Aragaw Yimer ◽  
Wai Hong Wilson Tang ◽  
Mohan K. Tummala ◽  
Spencer Shao ◽  
Gina G. Chung ◽  
...  

3072 Background: The Circulating Cell-free Genome Atlas study (CCGA; NCT02889978) previously demonstrated that a blood-based multi-cancer early detection (MCED) test utilizing cell-free DNA (cfDNA) sequencing in combination with machine learning could detect cancer signals across multiple cancer types and predict cancer signal origin. Cancer classes were defined within the CCGA study for sensitivity reporting. Separately, cancer types defined by the American Joint Committee on Cancer (AJCC) criteria, which outline unique staging requirements and reflect a distinct combination of anatomic site, histology and other biologic features, were assigned to each cancer participant using the same source data for primary site of origin and histologic type. Here, we report CCGA ‘cancer class’ designation and AJCC ‘cancer type’ assignment within the third and final CCGA3 validation substudy to better characterize the diversity of tumors across which a cancer signal could be detected with the MCED test that is nearing clinical availability. Methods: CCGA is a prospective, multicenter, case-control, observational study with longitudinal follow-up (overall population N = 15,254). Plasma cfDNA from evaluable samples was analyzed using a targeted methylation bisulfite sequencing assay and a machine learning approach, and test performance, including sensitivity, was assessed. For sensitivity reporting, CCGA cancer classes were assigned to cancer participants using a combination of the type of primary cancer reported by the site and tumor characteristics abstracted from the site pathology reports by GRAIL pathologists. Each cancer participant also was separately assigned an AJCC cancer type based on the same source data using AJCC staging manual (8th edition) classifications. Results: A total of 4077 participants comprised the independent validation set with confirmed status (cancer: n = 2823; non-cancer: n = 1254 with non-cancer status confirmed at year-one follow-up). 
Sensitivity was reported for 24 cancer classes (sample sizes ranged from 10 to 524 participants), as well as an “other” cancer class (59 participants). According to AJCC classification, the MCED test was found to detect cancer signals across 50+ AJCC cancer types, including some types not present in the training set; some cancer types had limited representation. Conclusions: This MCED test that is nearing clinical availability and was evaluated in the third CCGA substudy detected cancer signals across 50+ AJCC cancer types. Reporting CCGA cancer classes and AJCC cancer types demonstrates the ability of the MCED test to detect cancer signals across a set of diverse cancer types representing a wide range of biologic characteristics, including cancer types that the classifier has not been trained on, and supports its use on a population-wide scale. Clinical trial information: NCT02889978.
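Per-class sensitivity, as reported in this abstract, is the fraction of cancer participants in each class for whom a cancer signal was detected. The sketch below shows that calculation under simple assumptions: the record layout and function name are hypothetical, and the example values are made up for illustration and are not study results.

```python
from collections import defaultdict


def sensitivity_by_class(records):
    """Compute sensitivity (detected / total) for each cancer class.

    records: iterable of (cancer_class, detected) pairs covering cancer
    participants only; `detected` is True when the test reported a signal.
    """
    detected = defaultdict(int)
    total = defaultdict(int)
    for cancer_class, was_detected in records:
        total[cancer_class] += 1
        if was_detected:
            detected[cancer_class] += 1
    return {c: detected[c] / total[c] for c in total}


# Made-up example data, not taken from the CCGA study.
example = [("lung", True), ("lung", False), ("colon", True)]
per_class = sensitivity_by_class(example)
```

With small classes (the abstract mentions sample sizes as low as 10), a single participant shifts the estimate noticeably, which is one reason to report the class size alongside each sensitivity value.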


2017 ◽  
Vol 16 (3) ◽  
pp. e1226-e1227
Author(s):  
S. La Touche ◽  
C. Lemetre ◽  
M. Lambros ◽  
E. Stankiewicz ◽  
C. Ng ◽  
...  

Microarrays ◽  
2015 ◽  
Vol 4 (3) ◽  
pp. 339-369 ◽  
Author(s):  
Javier Arsuaga ◽  
Tyler Borrman ◽  
Raymond Cavalcante ◽  
Georgina Gonzalez ◽  
Catherine Park

2017 ◽  
Author(s):  
Yun-Ching Chen ◽  
Valer Gotea ◽  
Gennady Margolin ◽  
Laura Elnitski

Abstract: Recent evidence shows that mutations in several driver genes can cause aberrant methylation patterns, a hallmark of cancer. In light of these findings, we hypothesized that the landscapes of tumor genomes and epigenomes are tightly interconnected. We measured this relationship using principal component analyses and methylation-mutation associations applied at the nucleotide level and with respect to genome-wide trends. We found that a few mutated driver genes were associated with genome-wide patterns of aberrant hypomethylation or CpG island hypermethylation in specific cancer types. We identified associations between 737 mutated driver genes and site-specific methylation changes. Moreover, using these mutation-methylation associations, we were able to distinguish between two uterine and two thyroid cancer subtypes. The driver gene mutation-associated methylation differences between the thyroid cancer subtypes were linked to differential gene expression in JAK-STAT signaling, NADPH oxidation, and other cancer-related pathways. These results establish that driver-gene mutations are associated with methylation alterations capable of shaping regulatory network functions. In addition, the methodology presented here can be used to subdivide tumors into more homogeneous subsets corresponding to their underlying molecular characteristics, which could improve treatment efficacy.

Author summary: Mutations that alter the function of driver genes by changing DNA nucleotides have been recognized as key players in cancer progression. Recent evidence showed that DNA methylation, a molecular signature that is used for controlling gene expression and that consists of cytosine residues with attached methyl groups in the context of CG dinucleotides, is also highly dysregulated in cancer and contributes to carcinogenesis. However, whether those methylation alterations correspond to mutated driver genes in cancer remains unclear. In this study, we analyzed 4,302 tumors from 18 cancer types and demonstrated that driver gene mutations are inherently connected with the aberrant DNA methylation landscape in cancer. We showed that those driver gene-associated methylation patterns can classify heterogeneous tumors in a cancer type into homogeneous subtypes and have the potential to influence the genes that contribute to tumor growth. This finding could help us to better understand the fundamental connection between driver gene mutations and DNA methylation alterations in cancer and to further improve cancer treatment.


PLoS ONE ◽  
2014 ◽  
Vol 9 (12) ◽  
pp. e115835 ◽  
Author(s):  
Joeri Both ◽  
Oscar Krijgsman ◽  
Johannes Bras ◽  
Gerard R. Schaap ◽  
Frank Baas ◽  
...  

Cancers ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 475 ◽  
Author(s):  
Jihee Soh ◽  
Hyejin Cho ◽  
Chan-Hun Choi ◽  
Hyunju Lee

MicroRNAs (miRNAs) are key molecules that regulate biological processes such as cell proliferation, differentiation, and apoptosis in cancer. Somatic copy number alterations (SCNAs) are common genetic mutations that play essential roles in cancer development. Here, we investigated the association between miRNAs and SCNAs in cancer. We collected 2538 tumor samples for seven cancer types from The Cancer Genome Atlas. We found that 32−84% of miRNAs are in SCNA regions, with the rate depending on the cancer type. In these regions, we identified 80 SCNA-miRNAs whose expression was mainly associated with SCNAs in at least one cancer type and showed that these SCNA-miRNAs are related to cancer by survival analysis and literature searching. We also identified 58 SCNA-miRNAs common in the seven cancer types (CC-SCNA-miRNAs) and showed that these CC-SCNA-miRNAs are more likely to be related with protein and gene expression than other miRNAs. Furthermore, we experimentally validated the oncogenic role of miR-589. In conclusion, our results suggest that SCNA-miRNAs significantly alter biological processes related to cancer development, confirming the importance of SCNAs in non-coding regions in cancer.
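The share of miRNAs lying in SCNA regions, quoted in this abstract as a percentage, comes down to an interval-overlap count. The sketch below is one simple way to compute such a fraction, assuming half-open genomic intervals on named chromosomes. The data structures, function names, and coordinates are hypothetical and do not reproduce the authors' pipeline.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def fraction_in_scna(mirnas, scna_regions):
    """Fraction of miRNA loci that overlap any SCNA region.

    mirnas:       {name: (chrom, start, end)}
    scna_regions: list of (chrom, start, end)
    """
    if not mirnas:
        return 0.0
    hits = 0
    for chrom, start, end in mirnas.values():
        if any(chrom == c and overlaps(start, end, s, e) for c, s, e in scna_regions):
            hits += 1
    return hits / len(mirnas)


# Toy coordinates for illustration only.
toy_mirnas = {
    "mir-a": ("chr1", 100, 180),
    "mir-b": ("chr1", 500, 580),
    "mir-c": ("chr2", 100, 180),
}
toy_regions = [("chr1", 50, 200)]
fraction = fraction_in_scna(toy_mirnas, toy_regions)
```

The linear scan is adequate for a few thousand loci; an interval tree or sorted sweep would be the usual choice for larger inputs.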

