driveR: a novel method for prioritizing cancer driver genes using somatic genomics data

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ege Ülgen ◽  
O. Uğur Sezerman

Abstract Background Cancer develops due to “driver” alterations. Numerous approaches exist for predicting cancer drivers from cohort-scale genomics data. However, methods for personalized analysis of driver genes are underdeveloped. In this study, we developed a novel personalized/batch analysis approach for driver gene prioritization utilizing somatic genomics data, called driveR. Results Combining genomics information and prior biological knowledge, driveR accurately prioritizes cancer driver genes via a multi-task learning model. Testing on 28 different datasets, this study demonstrates that driveR performs adequately, achieving a median AUC of 0.684 (range 0.651–0.861) on the 28 batch analysis test datasets, and a median AUC of 0.773 (range 0–1) on the 5157 personalized analysis test samples. Moreover, it outperforms existing approaches, achieving a significantly higher median AUC than all of MutSigCV (Wilcoxon rank-sum test p < 0.001), DriverNet (p < 0.001), OncodriveFML (p < 0.001) and MutPanning (p < 0.001) on batch analysis test datasets, and a significantly higher median AUC than DawnRank (p < 0.001) and PRODIGY (p < 0.001) on personalized analysis datasets. Conclusions This study demonstrates that the proposed method is an accurate and easy-to-utilize approach for prioritizing driver genes in cancer genomes in personalized or batch analyses. driveR is available on CRAN: https://cran.r-project.org/package=driveR.
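The evaluation summarized in this abstract (an AUC per test dataset, a median across datasets, and a Wilcoxon rank-sum comparison between methods) can be sketched in plain Python. This is an illustration only, not driveR's code: the function names, the toy scores and labels, and the two lists of per-dataset AUCs are all made up, and the p-value uses a normal approximation without tie or continuity correction.

```python
import math
from statistics import median


def pair_wins(a, b):
    """Mann-Whitney U statistic: pairs (x in a, y in b) with x > y; ties count half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def auc(scores, labels):
    """AUC as the fraction of (driver, non-driver) pairs ranked in the right order."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one driver and one non-driver gene")
    return pair_wins(pos, neg) / (len(pos) * len(neg))


def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    u = pair_wins(a, b)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical gene scores for one dataset; label 1 marks a known driver gene.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 0, 0]
print("toy AUC:", auc(toy_scores, toy_labels))

# Hypothetical per-dataset AUCs for two methods (not values from the paper).
method_a = [0.70, 0.72, 0.68, 0.75, 0.71]
method_b = [0.60, 0.58, 0.63, 0.61, 0.59]
print("medians:", median(method_a), median(method_b))
print("rank-sum p:", rank_sum_p(method_a, method_b))
```

The AUC and the rank-sum test share the same pair-counting step: the AUC is the U statistic divided by the number of (driver, non-driver) pairs. That is why a single helper serves both functions.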

2020 ◽  
Author(s):  
Ege Ülgen ◽  
O. Uğur Sezerman

Abstract Cancer develops due to “driver” alterations. Numerous approaches exist for predicting cancer drivers from cohort-scale genomic data. However, methods for personalized analysis of driver genes are underdeveloped. In this study, we developed a novel personalized/batch analysis approach for driver gene prioritization utilizing somatic genomic data, called driveR. Combining genomic information and prior biological knowledge, driveR accurately prioritizes cancer driver genes via a multi-task learning model. Testing on 28 different datasets, this study demonstrates that driveR performs adequately, outperforms existing approaches, and is an accurate and easy-to-utilize approach for prioritizing driver genes in cancer genomes. driveR is available on CRAN: https://cran.r-project.org/package=driveR.


2015 ◽  
Author(s):  
Chengliang Dong ◽  
Hui Yang ◽  
Zeyu He ◽  
Xiaoming Liu ◽  
Kai Wang

All cancers arise as a result of the acquisition of somatic mutations that drive the disease progression. A number of computational tools have been developed to identify driver genes for a specific cancer from a group of cancer samples. However, it remains a challenge to identify driver mutations/genes for an individual patient and design drug therapies. We developed iCAGES, a novel statistical framework to rapidly analyze patient-specific cancer genomic data, prioritize personalized cancer driver events and predict personalized therapies. iCAGES includes three consecutive layers: the first layer integrates contributions from coding, non-coding and structural variations to infer driver variants. For coding mutations, we developed a radial support vector machine using manually curated mutations to predict their driver potential. The second layer identifies driver genes, by using information from the first layer and integrating prior biological knowledge on gene-gene and gene-phenotype networks. The third layer prioritizes personalized drug treatment, by classifying potential driver genes into different categories and querying drug-gene databases. Compared to currently available tools, iCAGES achieves better performance by correctly classifying point coding driver mutations (AUC=0.97, 95% CI: 0.97-0.97, significantly better than the second best tool with P=0.01) and genes (AUC=0.93, 95% CI: 0.93-0.94, significantly better than MutSigCV with P<1×10^-15). We also illustrated two examples where iCAGES correctly nominated two targeted drugs for two advanced cancer patients with exceptional response, based on their somatic mutation profiles. iCAGES leverages personal genomic information and prior biological knowledge, effectively identifies cancer driver genes and predicts treatment strategies. iCAGES is available at http://icages.usc.edu.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Antonio Colaprico ◽  
Catharina Olsen ◽  
Matthew H. Bailey ◽  
Gabriel J. Odom ◽  
Thilde Terkelsen ◽  
...  

Abstract Cancer driver gene alterations influence cancer development, occurring in oncogenes, tumor suppressors, and dual role genes. Discovering dual role cancer genes is difficult because of their elusive context-dependent behavior. We define oncogenic mediators as genes controlling biological processes. With them, we classify cancer driver genes, unveiling their roles in cancer mechanisms. To this end, we present Moonlight, a tool that incorporates multiple -omics data to identify critical cancer driver genes. With Moonlight, we analyze 8000+ tumor samples from 18 cancer types, discovering 3310 oncogenic mediators, 151 having dual roles. By incorporating additional data (amplification, mutation, DNA methylation, chromatin accessibility), we reveal 1000+ cancer driver genes, corroborating known molecular mechanisms. Additionally, we confirm critical cancer driver genes by analysing cell-line datasets. We discover inactivation of tumor suppressors in intron regions and that tissue type and subtype indicate dual role status. These findings help explain tumor heterogeneity and could guide therapeutic decisions.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Xiaobao Dong ◽  
Dandan Huang ◽  
Xianfu Yi ◽  
Shijie Zhang ◽  
Zhao Wang ◽  
...  

Abstract Mutation-specific effects of cancer driver genes influence drug responses and the success of clinical trials. We reasoned that these effects could unbalance the distribution of each mutation across different cancer types, so that the cancer preference of a mutation can be used to distinguish the effects of the causal mutation. Here, we developed a network-based framework to systematically measure cancer diversity for each driver mutation. We found that half of the driver genes harbor cancer type-specific and pancancer mutations simultaneously, suggesting pervasive functional heterogeneity among mutations from even the same driver gene. We further demonstrated that the specificity of the mutations could influence patient drug responses. Moreover, we observed that diversity was generally increased in advanced tumors. Finally, we scanned for potentially novel cancer driver genes based on the diversity spectrum. Diversity spectrum analysis provides a new approach to define driver mutations and optimize off-label clinical trials.


2018 ◽  
Author(s):  
Siming Zhao ◽  
Jun Liu ◽  
Pranav Nanga ◽  
Yuwen Liu ◽  
A. Ercument Cicek ◽  
...  

Abstract Identifying driver genes is a central problem in cancer biology, and many methods have been developed to identify driver genes from somatic mutation data. However, existing methods either lack explicit statistical models, or rely on very simple models that do not capture complex features in somatic mutations of driver genes. Here, we present driverMAPS (Model-based Analysis of Positive Selection), a more comprehensive model-based approach to driver gene identification. This new method explicitly models, at the single-base level, the effects of positive selection in cancer driver genes as well as the highly heterogeneous background mutational process. Its selection model captures elevated mutation rates in functionally important sites using multiple external annotations, as well as spatial clustering of mutations. Its background mutation model accounts for both known covariates and unexplained local variation. Simulations under realistic evolutionary models demonstrate that driverMAPS greatly improves the power of driver gene detection over state-of-the-art approaches. Applying driverMAPS to TCGA data across 20 tumor types identified 159 new potential driver genes. Cross-referencing this list with data from external sources strongly supports these findings. The novel genes include the mRNA methyltransferases METTL3-METTL14, and we experimentally validated METTL3 as a potential tumor suppressor gene in bladder cancer. Our results thus provide strong support to the emerging hypothesis that mRNA modification is an important biological process underlying tumorigenesis.


2020 ◽  
Author(s):  
Vu VH Pham ◽  
Lin Liu ◽  
Cameron P Bracken ◽  
Gregory J Goodall ◽  
Jiuyong Li ◽  
...  

Abstract Motivation Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesise that there are driver gene groups that work in concert to regulate cancer and we develop a novel computational method to detect those driver gene groups. Results We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (1) Constructing the gene network, (2) Discovering critical nodes of the constructed network, and (3) Identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we firstly assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify coding and non-coding driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. Availability and implementation DriverGroup is available at https://github.com/pvvhoang/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i583-i591
Author(s):  
Vu V H Pham ◽  
Lin Liu ◽  
Cameron P Bracken ◽  
Gregory J Goodall ◽  
Jiuyong Li ◽  
...  

Abstract Motivation Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesize that there are driver gene groups that work in concert to regulate cancer, and we develop a novel computational method to detect those driver gene groups. Results We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (i) constructing the gene network, (ii) discovering critical nodes of the constructed network and (iii) identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we firstly assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. Availability and implementation DriverGroup is available at https://github.com/pvvhoang/DriverGroup Supplementary information Supplementary data are available at Bioinformatics online.


2017 ◽  
Author(s):  
Magali Champion ◽  
Kevin Brennan ◽  
Tom Croonenborghs ◽  
Andrew J. Gentles ◽  
Nathalie Pochet ◽  
...  

Abstract The availability of increasing volumes of multi-omics profiles across many cancers promises to improve our understanding of the regulatory mechanisms underlying cancer. The main challenge is to integrate these multiple levels of omics profiles and especially to analyze them across many cancers. Here we present AMARETTO, an algorithm that addresses both challenges in three steps. First, AMARETTO identifies potential cancer driver genes through integration of copy number, DNA methylation and gene expression data. Then AMARETTO connects these driver genes with co-expressed target genes that they control, defined as regulatory modules. Thirdly, we connect AMARETTO modules identified from different cancer sites into a pancancer network to identify cancer driver genes. Here we applied AMARETTO in a pancancer study comprising eleven cancer sites and confirmed that AMARETTO captures hallmarks of cancer. We also demonstrated that AMARETTO enables the identification of novel pancancer driver genes. In particular, our analysis led to the identification of pancancer driver genes of smoking-induced cancers and ‘antiviral’ interferon-modulated innate immune response.
Software availability AMARETTO is available as an R package at https://bitbucket.org/gevaertlab/pancanceramaretto
Highlights We present an algorithm for pancancer identification of cancer driver genes based on multiomics data fusion. GPX2 is a novel driver gene in smoking-induced cancers and was validated using knockdown of GPX2 in the A549 cell line. OAS2 is a novel driver gene defining cancers with an antiviral signature supported by increased infiltration of tumor-associated macrophages.
Research in context We present an algorithm that combines multiple sources of molecular data to identify novel genes that are involved in cancer development. We applied this algorithm on multiple cancers in a combined fashion and identified a network of pancancer driver genes. We highlighted two genes in detail, GPX2 and OAS2. We showed that GPX2 is an important cancer gene in smoking-induced cancers, and validated our predictions using experimental data where GPX2 was inactivated in a lung cancer cell line. Similarly, we showed that OAS2 is an important cancer driver gene in cancers that show an antiviral signature.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Xiao-wei Du ◽  
Gao Li ◽  
Juan Liu ◽  
Chun-yan Zhang ◽  
Qiong Liu ◽  
...  

Abstract Background Breast cancer is the most common malignancy in women. Cancer driver gene-mediated alterations in the tumor microenvironment are critical factors affecting the biological behavior of breast cancer. The purpose of this study was to identify the expression characteristics and prognostic value of cancer driver genes in breast cancer. Methods The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) datasets were used as the training and test sets. Comparing cancer and paracancerous tissues, we identified differentially expressed cancer driver genes. We further screened prognosis-associated genes, and candidate genes were submitted for the construction of a risk signature. Functional enrichment analysis and transcriptional regulatory network analysis were performed to search for possible mechanisms by which cancer driver genes affect breast cancer prognosis. Results We identified more than 200 differentially expressed driver genes and 27 prognosis-related genes. High-risk group patients had a lower survival rate compared to the low-risk group (P<0.05), and the risk signature showed high specificity and sensitivity in predicting patient prognosis (AUC 0.790). Multivariate regression analysis suggested that risk scores can independently predict patient prognosis. Further, we found differences in PD-1 expression, immune score, and stromal score among different risk groups. Conclusion Our study confirms the critical prognostic role of cancer driver genes in breast cancer. The cancer driver gene risk signature may provide a novel biomarker for clinical treatment strategy and survival prediction of breast cancer.


2016 ◽  
Vol 113 (50) ◽  
pp. 14330-14335 ◽  
Author(s):  
Collin J. Tokheim ◽  
Nickolas Papadopoulos ◽  
Kenneth W. Kinzler ◽  
Bert Vogelstein ◽  
Rachel Karchin

Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, that is, bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied to driver gene prediction methods. We used this framework to compare the performance of eight such methods. One of these methods, described here, incorporated a machine-learning–based ratiometric approach. We show that the driver genes predicted by each of the eight methods vary widely. Moreover, the P values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false-positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future.
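One way to make the abstract's point about P values concrete: when most genes are passengers, a well-calibrated method should report P values that look roughly uniform on [0, 1]. The sketch below measures departure from uniformity with the Kolmogorov-Smirnov distance. This is a generic stand-in, not the diagnostic used in the paper (the abstract does not specify it), and the function name and both toy P-value lists are hypothetical.

```python
def ks_distance_to_uniform(pvalues):
    """Largest gap between the empirical CDF of the P values and the Uniform(0, 1) CDF."""
    ps = sorted(pvalues)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one P value")
    # Gap just after each step of the empirical CDF, and just before it.
    d_plus = max((i + 1) / n - p for i, p in enumerate(ps))
    d_minus = max(p - i / n for i, p in enumerate(ps))
    return max(d_plus, d_minus)


# Hypothetical P values from a well-calibrated method: evenly spread over [0, 1].
calibrated = [(i + 0.5) / 10 for i in range(10)]

# Hypothetical P values from a miscalibrated method: piled up near zero.
inflated = [0.001] * 10

print("calibrated:", ks_distance_to_uniform(calibrated))
print("inflated:  ", ks_distance_to_uniform(inflated))
```

A distance near 0 is consistent with uniform P values, while a distance near 1 indicates strong inflation or deflation. In practice the check would be run on the genes not expected to be drivers, since true drivers legitimately have small P values.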

