A CATH domain functional family based approach to identify putative cancer driver genes and driver mutations

2018 ◽  
Author(s):  
Paul Ashford ◽  
Camilla S.M. Pang ◽  
Aurelio A. Moya-García ◽  
Tolulope Adeyelu ◽  
Christine A. Orengo

Tumour sequencing identifies highly recurrent point mutations in cancer driver genes, but rare functional mutations are hard to distinguish from large numbers of passengers. We developed a novel computational platform applying a multi-modal approach to filter out passengers and more robustly identify putative driver genes. The primary filter identifies enrichment of cancer mutations in CATH functional families (CATH-FunFams), structurally and functionally coherent sets of evolutionarily related domains. Using structural representatives from CATH-FunFams, we subsequently seek enrichment of mutations in 3D and show that these mutation clusters have a very significant tendency to lie close to known functional sites or conserved sites predicted using CATH-FunFams. Our third filter identifies enrichment of putative driver genes in functionally coherent protein network modules confirmed by literature analysis to be cancer associated. Our approach is complementary to other domain enrichment approaches exploiting Pfam families, but benefits from more functionally coherent groupings of domains. Using a set of mutations from 22 cancers we detect 151 putative cancer drivers, of which 79 are not listed in cancer resources and include recently validated cancer genes EPHA7, DCC netrin-1 receptor and zinc-finger protein ZNF479.
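The abstract does not say which statistic the primary filter uses, so the following is only a minimal sketch of one common way to test for mutation enrichment in a domain family: a one-sided binomial test against a uniform background. The function names, the uniform-background assumption, and all numbers are illustrative assumptions, not the authors' method.

```python
from math import comb


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    Direct summation; adequate for the modest n used in this sketch.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def family_enrichment(observed, total_mutations, family_length, total_length):
    """One-sided p-value for an excess of mutations in a domain family.

    Assumed null model (hypothetical): each mutation lands in the family
    with probability proportional to the family's share of residues.
    """
    p_null = family_length / total_length
    return binomial_sf(observed, total_mutations, p_null)


# Made-up numbers: 30 of 1000 mutations fall in a family that covers
# 1% of all residues, so about 10 would be expected under the null.
p_value = family_enrichment(
    observed=30, total_mutations=1000, family_length=10_000, total_length=1_000_000
)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the p-values for all families would also need multiple-testing correction, and a realistic background model would account for gene- and context-specific mutation rates.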

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Paul Ashford ◽  
Camilla S. M. Pang ◽  
Aurelio A. Moya-García ◽  
Tolulope Adeyelu ◽  
Christine A. Orengo

2018 ◽  
Author(s):  
Felix Dietlein ◽  
Donate Weghorn ◽  
Amaro Taylor-Weiner ◽  
André Richters ◽  
Brendan Reardon ◽  
...  

Many cancer genomes contain large numbers of somatic mutations, but few of these mutations drive tumor development. Current approaches to identify cancer driver genes are largely based on mutational recurrence, i.e. they search for genes with an increased number of nonsynonymous mutations relative to the local background mutation rate. Multiple studies have noted that the sensitivity of recurrence-based methods is limited in tumors with high background mutation rates, because passenger mutations dilute their statistical power. Here, we observe that passenger mutations tend to occur in characteristic nucleotide sequence contexts, while driver mutations follow a different distribution pattern determined by the location of functionally relevant genomic positions along the protein-coding sequence. To discover new cancer genes, we searched for genes with an excess of mutations in unusual nucleotide contexts that deviate from the characteristic context around passenger mutations. By applying this statistical framework to whole-exome sequencing data from 12,004 tumors, we discovered a long tail of novel candidate cancer genes with mutation frequencies as low as 1% and functional supporting evidence. Our results show that considering both the number and the nucleotide context around mutations helps identify novel cancer driver genes, particularly in tumors with high background mutation rates.
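To make the idea of a nucleotide context concrete, here is a small sketch that labels each substitution with its trinucleotide context (the mutated base plus one flanking base on each side) and counts how often each context occurs. The helper names, the `A[C>T]G` label format, and the toy sequence are assumptions for illustration; the abstract does not describe the authors' actual context model.

```python
from collections import Counter


def trinucleotide_context(reference, position):
    """Return the base at `position` (0-based) plus one flanking base per side."""
    if position < 1 or position > len(reference) - 2:
        raise ValueError("position needs a flanking base on both sides")
    return reference[position - 1 : position + 2]


def context_counts(reference, mutations):
    """Count substitutions per context label such as 'A[C>T]G'.

    `mutations` is an iterable of (position, alt_base) pairs.
    """
    counts = Counter()
    for position, alt in mutations:
        ctx = trinucleotide_context(reference, position)
        counts[f"{ctx[0]}[{ctx[1]}>{alt}]{ctx[2]}"] += 1
    return counts


# Toy reference sequence and three hypothetical substitutions.
reference = "ACGTTCGA"
mutations = [(1, "T"), (5, "T"), (3, "A")]
counts = context_counts(reference, mutations)
print(counts)
```

Comparing such counts in a gene against the counts expected from a cohort-wide passenger profile is one way an excess of mutations in unusual contexts could be flagged.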


2020 ◽  
Vol 49 (D1) ◽  
pp. D1289-D1301 ◽  
Author(s):  
Tao Wang ◽  
Shasha Ruan ◽  
Xiaolu Zhao ◽  
Xiaohui Shi ◽  
Huajing Teng ◽  
...  

Abstract The prevalence of neutral mutations in cancer cell populations makes it difficult to distinguish cancer-causing driver mutations from passenger mutations. To systematically prioritize the oncogenic ability of somatic mutations and cancer genes, we constructed a useful platform, OncoVar (https://oncovar.org/), which employed published bioinformatics algorithms and incorporated known driver events to identify driver mutations and driver genes. We identified 20 162 cancer driver mutations, 814 driver genes and 2360 pathogenic pathways with high confidence by reanalyzing 10 769 exomes from 33 cancer types in The Cancer Genome Atlas (TCGA) and 1942 genomes from 18 cancer types in the International Cancer Genome Consortium (ICGC). OncoVar provides four points of view, ‘Mutation’, ‘Gene’, ‘Pathway’ and ‘Cancer’, to help researchers visualize the relationships between cancers and driver variants. Importantly, identification of actionable driver alterations provides promising druggable targets and repurposing opportunities for combinational therapies. OncoVar provides a user-friendly interface for browsing, searching and downloading somatic driver mutations, driver genes and pathogenic pathways in various cancer types. This platform will facilitate the identification of cancer drivers across individual cancer cohorts and help to rank mutations or genes for better decision-making among clinical oncologists, cancer researchers and the broad scientific community interested in cancer precision medicine.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Antonio Colaprico ◽  
Catharina Olsen ◽  
Matthew H. Bailey ◽  
Gabriel J. Odom ◽  
Thilde Terkelsen ◽  
...  

Abstract Cancer driver gene alterations influence cancer development, occurring in oncogenes, tumor suppressors, and dual role genes. Discovering dual role cancer genes is difficult because of their elusive context-dependent behavior. We define oncogenic mediators as genes controlling biological processes. With them, we classify cancer driver genes, unveiling their roles in cancer mechanisms. To this end, we present Moonlight, a tool that incorporates multiple -omics data to identify critical cancer driver genes. With Moonlight, we analyze 8000+ tumor samples from 18 cancer types, discovering 3310 oncogenic mediators, 151 having dual roles. By incorporating additional data (amplification, mutation, DNA methylation, chromatin accessibility), we reveal 1000+ cancer driver genes, corroborating known molecular mechanisms. Additionally, we confirm critical cancer driver genes by analysing cell-line datasets. We discover inactivation of tumor suppressors in intron regions and that tissue type and subtype indicate dual role status. These findings help explain tumor heterogeneity and could guide therapeutic decisions.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Xiaobao Dong ◽  
Dandan Huang ◽  
Xianfu Yi ◽  
Shijie Zhang ◽  
Zhao Wang ◽  
...  

Abstract Mutation-specific effects of cancer driver genes influence drug responses and the success of clinical trials. We reasoned that these effects could unbalance the distribution of each mutation across different cancer types, so that cancer preference can be used to distinguish the effects of the causal mutation. Here, we developed a network-based framework to systematically measure cancer diversity for each driver mutation. We found that half of the driver genes harbor cancer type-specific and pancancer mutations simultaneously, suggesting pervasive functional heterogeneity among mutations, even those within the same driver gene. We further demonstrated that the specificity of the mutations could influence patient drug responses. Moreover, we observed that diversity was generally increased in advanced tumors. Finally, we scanned for potentially novel cancer driver genes based on the diversity spectrum. Diversity spectrum analysis provides a new approach to define driver mutations and optimize off-label clinical trials.


2018 ◽  
Author(s):  
Lin Jiang ◽  
Jingjing Zheng ◽  
Johnny Sheung Him Kwan ◽  
Sheng Dai ◽  
Cong Li ◽  
...  

Abstract Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Due to the difficulty of modeling the distribution of background mutations, existing statistical methods are often underpowered to discriminate driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER), to detect cancer-driver genes showing an excess of somatic mutations. By solving the problem of inaccurately modeling background mutations, this approach works even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in all tested cancers. Applying this approach, we estimated 178 driver genes in 26 different cancer types. In silico validation confirmed 90.5% of predicted genes as likely known drivers and 7 genes unique for individual cancers as likely new drivers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects, rescuing more genes missed by alternative tools.


2019 ◽  
Vol 47 (16) ◽  
pp. e96-e96 ◽  
Author(s):  
Lin Jiang ◽  
Jingjing Zheng ◽  
Johnny S H Kwan ◽  
Sheng Dai ◽  
Cong Li ◽  
...  

Abstract Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Due to the difficulty of modelling the distribution of background mutation counts, existing statistical methods are often underpowered to discriminate cancer-driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER, http://grass.cgs.hku.hk/limx/witer or KGGSeq, http://grass.cgs.hku.hk/limx/kggseq/), to detect cancer-driver genes showing an excess of somatic mutations. By fitting the distribution of background mutation counts properly, this approach works well even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in most tested cancers. Applying this approach, we estimated 229 driver genes in 26 different types of cancers. In silico validation confirmed 78% of predicted genes as likely known drivers and many other genes as very likely new drivers for corresponding cancers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects and the rescue of more genes missed by alternative tools in moderate or small samples.
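As background for the name of the method, the sketch below shows only the zero-truncated negative-binomial distribution itself: the ordinary negative-binomial probability renormalized over counts of one or more. It is not WITER; the weighting, iteration, and regression on covariates are not shown, and the mean/size parameterization and function names are assumptions chosen for illustration.

```python
from math import exp, lgamma, log


def nb_log_pmf(y, mu, size):
    """Log-probability of count y under a negative binomial with mean `mu`
    and dispersion parameter `size` (variance = mu + mu**2 / size)."""
    return (
        lgamma(y + size)
        - lgamma(size)
        - lgamma(y + 1)
        + size * log(size / (size + mu))
        + y * log(mu / (size + mu))
    )


def ztnb_pmf(y, mu, size):
    """Zero-truncated negative-binomial probability: P(Y = y | Y >= 1)."""
    if y < 1:
        return 0.0
    p_zero = exp(nb_log_pmf(0, mu, size))
    return exp(nb_log_pmf(y, mu, size)) / (1.0 - p_zero)


# Sanity check with arbitrary parameters: the truncated probabilities
# over y = 1, 2, ... should sum to 1.
total = sum(ztnb_pmf(y, mu=2.0, size=1.5) for y in range(1, 500))
print(f"sum of truncated probabilities: {total:.6f}")
```

Truncating at zero means only genes with at least one observed mutation contribute to the likelihood, which is presumably why this family suits mutation-count data in which genes with no mutations are not modeled.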


2020 ◽  
Author(s):  
Ferran Muiños ◽  
Francisco Martinez-Jimenez ◽  
Oriol Pich ◽  
Abel Gonzalez-Perez ◽  
Nuria Lopez-Bigas

Summary Extensive bioinformatics analysis of datasets of tumor somatic mutations has revealed the presence of some 500-600 cancer driver genes. The identification of all potential driver mutations affecting cancer genes is essential to implement precision cancer medicine and to understand the interplay of mutation probability and selection in tumor development. Here, we present an in silico saturation mutagenesis approach to identify all driver mutations in 568 cancer genes across 66 tumor types. For most cancer genes the mutation probability across tissues (underpinned by active mutational processes) influences which driver variants have been observed, although this differs significantly between tumor suppressors and oncogenes. The role of selection is apparent in some of the latter, the observed and unobserved driver mutations of which are equally likely to occur. The number of potential driver mutations in a cancer gene roughly determines how many mutations are available for detection across newly sequenced tumors.


2018 ◽  
Author(s):  
Olivier Collier ◽  
Véronique Stoven ◽  
Jean-Philippe Vert

Abstract Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms three other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.
Author summary Cancer development is driven by mutations and dysfunction of important, so-called cancer driver genes, which could be addressed by targeted therapies. While a number of such cancer genes have already been identified, it is believed that many more remain to be discovered. To help prioritize experimental investigations of candidate genes, several computational methods have been proposed to rank promising candidates based on their mutations in large cohorts of cancer cases, or on their interactions with known driver genes in biological networks. We propose LOTUS, a new computational approach to identify genes with high oncogenic potential. LOTUS implements a machine learning approach to learn an oncogenic potential score from known driver genes, and brings two novelties compared to existing methods. First, it makes it easy to combine heterogeneous information into the scoring function, which we illustrate by learning a scoring function from both known mutations in large cancer cohorts and interactions in biological networks. Second, using a multitask learning strategy, it can predict different driver genes for different cancer types, while sharing information between them to improve the prediction for every type. We provide experimental results showing that LOTUS significantly outperforms several state-of-the-art cancer gene prediction software tools.


2019 ◽  
Vol 12 (S7) ◽  
Author(s):  
Hua Zhong ◽  
Mingzhou Song

Abstract
Background: Most statistical methods used to identify cancer driver genes are either biased due to choice of assumed parametric models or insensitive to directional relationships important for causal inference. To overcome modeling biases and directional insensitivity, a recent statistical functional chi-squared test (FunChisq) detects directional association via model-free functional dependency. FunChisq examines patterns pointing from independent to dependent variables arising from linear, non-linear, or many-to-one functional relationships. Meanwhile, the Functional Annotation of Mammalian Genome 5 (FANTOM5) project surveyed gene expression at over 200,000 transcription start sites (TSSs) in nearly all human tissue types, primary cell types, and cancer cell lines. The data cover TSSs originating from both coding and noncoding genes. For the vast number of uncharacterized human TSSs that may exhibit complex patterns in cancer versus normal tissues, the model-free property of FunChisq provides us an unprecedented opportunity to assess the evidence for a gene’s directional effect on human cancer.
Results: We first evaluated FunChisq and six other methods using 719 curated cancer genes on the FANTOM5 data. FunChisq performed best in distinguishing known cancer driver genes from non-cancer genes. We also show the capacity of FunChisq to reveal non-monotonic patterns of functional association, to which typical differential analysis methods such as the t-test are insensitive. Further applying FunChisq to screen unannotated TSSs in FANTOM5, we predicted 1108 putative cancer driver noncoding RNAs, stronger than 90% of curated cancer driver genes. Next, we compared leukemia samples against other samples in FANTOM5 and FunChisq predicted 332/79 potential biomarkers for lymphoid/myeloid leukemia, stronger than the TSSs of all 87/100 known driver genes in lymphoid/myeloid leukemia.
Conclusions: This study demonstrated the advantage of FunChisq in revealing directional association, especially in detecting non-monotonic patterns. Here, we also provide the most comprehensive catalog of high-quality biomarkers that may play a causative role in human cancers, including putative cancer driver noncoding RNAs and lymphoid/myeloid leukemia specific biomarkers.
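The following sketch illustrates how a functional chi-squared statistic can score a non-monotonic X-to-Y pattern on a contingency table. It assumes the statistic takes the form usually given for FunChisq: each row's deviation from a uniform distribution over the Y levels, minus the deviation of the Y column totals from uniform. That form, the function name, and the toy tables are assumptions here and should be checked against the FunChisq publication or package before use.

```python
def fun_chisq(table):
    """Functional chi-squared statistic for Y as a function of X.

    table[i][j] is the number of samples with X at level i and Y at level j.
    Assumed form: sum over rows of chi-squared against a uniform row,
    minus chi-squared of the column totals against a uniform distribution.
    """
    n_cols = len(table[0])
    row_sums = [sum(row) for row in table]
    col_sums = [sum(row[j] for row in table) for j in range(n_cols)]
    n = sum(row_sums)
    if n == 0:
        raise ValueError("table has no observations")

    row_term = 0.0
    for row, row_sum in zip(table, row_sums):
        if row_sum == 0:
            continue  # an empty X level carries no information
        expected = row_sum / n_cols
        row_term += sum((count - expected) ** 2 / expected for count in row)

    expected_col = n / n_cols
    col_term = sum((count - expected_col) ** 2 / expected_col for count in col_sums)
    return row_term - col_term


# Non-monotonic toy pattern: Y is "high" only at the middle level of X.
non_monotonic = [[4, 0], [0, 4], [4, 0]]
# No dependence: every X level has the same Y distribution.
independent = [[2, 2], [2, 2]]

print(fun_chisq(non_monotonic))
print(fun_chisq(independent))
```

Because the two group means at the extremes of X are identical in the first table, a comparison of low versus high X would see no difference, while the statistic above is clearly nonzero.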

