Recent selection is a major force driving cancer evolution

Mapping Intimacies

DOI: 10.1101/2021.12.27.474305

Year: 2021

Author(s): Langyu Gu, Guofen Yang

Keyword(s): Asian Population, Incidence Rates, Human Populations, Cancer Evolution, South Asian Population, Driver Genes, Cancer Driver, Cancer Types, Cancer Driver Genes, Recent Selection

Cancer is one of the most threatening diseases to humans. Understanding the evolution of cancer genes is helpful for therapy management. However, systematic investigations of the evolution of cancer driver genes are sparse. Using comparative genomic analysis, population genetics analysis and computational molecular evolutionary analysis, we examined the evolution of 568 cancer driver genes of 66 cancer types across the primate phylogeny (long timescale selection) and in modern human populations from the 1000 Genomes Project (recent selection). We found that recent selection pressures, rather than long timescale selection, significantly affect the evolution of cancer driver genes in humans. Cancer driver genes related to morphological traits and local adaptation are under positive selection in different human populations. The African population showed the largest extent of divergence compared to other populations. Notably, the cancer types corresponding to positively selected genes exhibited population-specific patterns, with the South Asian population having the fewest cancer types. This helps explain why the South Asian population usually has low cancer incidence rates. Population-specific patterns of cancer types whose driver genes are under positive selection also give clues to explain discrepancies in cancer incidence rates among geographical populations, such as the high incidence rate of Wilms tumour in the African population and of Ewing's sarcoma in the European population. Our findings are thus helpful for understanding cancer evolution and for guiding further precision medicine.

DriverRWH: Discovering Cancer Driver Genes By Random Walk On a Gene Mutation Hypergraph

DOI: 10.21203/rs.3.rs-1192205/v1

Year: 2021

Author(s): Chenye Wang, Junhan Shi, Jiansheng Cai, Yusen Zhang, Xiaoqi Zheng, ...

Keyword(s): Random Walk, Candidate Genes, Gene Mutation, Network Data, Cumulative Number, Driver Genes, Cancer Driver, Cancer Types, Mutation Data, Cancer Driver Genes

Abstract Background: Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. A critical challenge in cancer genomics is the identification of a few driver mutation genes among a much larger number of passenger mutation genes. However, the majority of existing computational approaches underuse the co-occurrence information of the individuals, which is deemed to be important in tumorigenesis and tumor progression. Driver gene lists predicted by these tools are prone to false positives, and current research is far from achieving the ultimate goal of discovering a complete catalog of driver genes. Results: To make full use of co-mutation information, we present a random walk algorithm, referred to as DriverRWH, on a weighted gene mutation hypergraph model, using somatic mutation data and molecular interaction network data to prioritize candidate driver genes. Applied to tumor samples of different cancer types from The Cancer Genome Atlas (TCGA), DriverRWH shows significantly better performance than state-of-the-art prioritization methods in terms of the area under the curve (AUC) scores and the cumulative number of known driver genes recovered among top-ranked candidate genes. DriverRWH recovers approximately 50% of known driver genes in the top 30 ranked candidate genes for more than half of the cancer types. In addition, DriverRWH is highly robust to perturbations in the mutation data and gene functional network data. Conclusion: DriverRWH is effective across various cancer types in prioritizing cancer driver genes and provides considerable improvement over other tools, with a better balance of precision and sensitivity. It can be a useful tool for detecting potential driver genes and for facilitating targeted cancer therapies.
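The abstract does not spell out how DriverRWH builds or weights its hypergraph, so the following is only a minimal sketch of a generic random walk with restart on a weighted hypergraph, not the authors' implementation. It assumes that genes are vertices, that each hyperedge is a set of co-mutated genes (for example, the genes mutated in one tumor sample), and that the walker first picks a hyperedge in proportion to its weight and then a gene uniformly within it. The gene names, hyperedge weights, and restart probability are all hypothetical.

```python
def hypergraph_transition(genes, hyperedges, weights):
    """Row-stochastic matrix P[v][u] for a walk on a weighted hypergraph.

    Assumes every gene belongs to at least one hyperedge and each
    hyperedge is a set (no repeated genes).
    """
    index = {g: i for i, g in enumerate(genes)}
    n = len(genes)
    # Weighted degree of a gene: total weight of the hyperedges containing it.
    degree = [0.0] * n
    for edge, w in zip(hyperedges, weights):
        for g in edge:
            degree[index[g]] += w
    P = [[0.0] * n for _ in range(n)]
    for edge, w in zip(hyperedges, weights):
        members = [index[g] for g in edge]
        for v in members:
            for u in members:
                # Pick this hyperedge from v (w / degree[v]),
                # then pick u uniformly inside it (1 / |edge|).
                P[v][u] += (w / degree[v]) / len(members)
    return P


def random_walk_with_restart(P, restart_vector, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- r * p0 + (1 - r) * P^T p until the L1 change is below tol."""
    n = len(P)
    p = list(restart_vector)
    for _ in range(max_iter):
        nxt = [
            restart_prob * restart_vector[u]
            + (1.0 - restart_prob) * sum(p[v] * P[v][u] for v in range(n))
            for u in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy data: four genes, three co-mutation hyperedges.
genes = ["A", "B", "C", "D"]
hyperedges = [{"A", "B"}, {"B", "C", "D"}, {"A", "C"}]
weights = [1.0, 2.0, 1.0]

P = hypergraph_transition(genes, hyperedges, weights)
uniform_restart = [1.0 / len(genes)] * len(genes)
scores = random_walk_with_restart(P, uniform_restart)

# Genes ranked by stationary score, highest first.
ranking = sorted(zip(genes, scores), key=lambda item: item[1], reverse=True)
print(ranking)
```

In a setting like the one the abstract describes, the restart vector and hyperedge weights would presumably be derived from the mutation and interaction network data rather than set uniformly as in this toy example.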

Contextual Classifications of Cancer Driver Genes

DOI: 10.1101/715508

Year: 2019

Author(s): Pramod Chandrashekar, Navid Ahmadinejad, Junwen Wang, Aleksandar Sekulic, Jan B. Egan, ...

Keyword(s): Computational Method, Cancer Type, Sequencing Data, Multiple Cancer, Driver Genes, Cancer Driver, Link Type, Mutational Hotspots, Cancer Types, Cancer Driver Genes

Abstract: Functions of cancer driver genes depend on cellular contexts that vary substantially across tissues and organs. Distinguishing oncogenes (OGs) and tumor suppressor genes (TSGs) for each cancer type is critical to identifying clinically actionable targets. However, current resources for context-aware classifications of cancer drivers are limited. In this study, we show that the direction and magnitude of somatic selection of missense and truncating mutations of a gene are suggestive of its contextual activities. By integrating these features with ratiometric and conservation measures, we developed a computational method to categorize OGs and TSGs using exome sequencing data. This new method, named genes under selection in tumors (GUST), shows an overall accuracy of 0.94 when tested on manually curated benchmarks. Application of GUST to 10,172 tumor exomes of 33 cancer types identified 98 OGs and 179 TSGs, >70% of which promote tumorigenesis in only one cancer type. In broad-spectrum drivers shared across multiple cancer types, we found heterogeneous mutational hotspots modifying distinct functional domains, implicating the synchrony of convergent and divergent disease mechanisms. We further discovered two novel OGs and 28 novel TSGs with high confidence. The GUST program is available at https://github.com/liliulab/gust. A database with pre-computed classifications is available at https://liliulab.shinyapps.io/gust

LOTUS: a Single- and Multitask Machine Learning Algorithm for the Prediction of Cancer Driver Genes

DOI: 10.1101/398537

Year: 2018

Cited By ~ 1

Author(s): Olivier Collier, Véronique Stoven, Jean-Philippe Vert

Keyword(s): Machine Learning, Biological Networks, Learning Strategy, Gene Prediction, Scoring Function, Cancer Genes, Driver Genes, Cancer Driver, Cancer Types, Cancer Driver Genes

Abstract: Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly driver genes specific to some cancer types. In this paper we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning-based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms three other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.

Author summary: Cancer development is driven by mutations and dysfunction of important, so-called cancer driver genes, which could be addressed by targeted therapies. While a number of such cancer genes have already been identified, it is believed that many more remain to be discovered. To help prioritize experimental investigations of candidate genes, several computational methods have been proposed to rank promising candidates based on their mutations in large cohorts of cancer cases, or on their interactions with known driver genes in biological networks. We propose LOTUS, a new computational approach to identify genes with high oncogenic potential. LOTUS implements a machine learning approach to learn an oncogenic potential score from known driver genes, and brings two novelties compared to existing methods. First, it makes it easy to combine heterogeneous information into the scoring function, which we illustrate by learning a scoring function from both known mutations in large cancer cohorts and interactions in biological networks. Second, using a multitask learning strategy, it can predict different driver genes for different cancer types, while sharing information between them to improve the prediction for every type. We provide experimental results showing that LOTUS significantly outperforms several state-of-the-art cancer gene prediction software tools.

Faculty Opinions recommendation of Evaluating the evaluation of cancer driver genes.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature

DOI: 10.3410/f.727060594.793535346

Year: 2017

Author(s): Ron Shamir

Keyword(s): Driver Genes, Cancer Driver, Cancer Driver Genes

driveR: a novel method for prioritizing cancer driver genes using somatic genomics data

BMC Bioinformatics

DOI: 10.1186/s12859-021-04203-7

Year: 2021

Vol 22 (1)

Author(s): Ege Ülgen, O. Uğur Sezerman

Keyword(s): Biological Knowledge, Driver Gene, Driver Genes, Cancer Driver, Prior Biological Knowledge, Wilcoxon Rank Sum Test, Cancer Genomes, Novel Method, Cancer Driver Genes, Batch Analysis

Abstract Background: Cancer develops due to “driver” alterations. Numerous approaches exist for predicting cancer drivers from cohort-scale genomics data. However, methods for personalized analysis of driver genes are underdeveloped. In this study, we developed a novel personalized/batch analysis approach for driver gene prioritization utilizing somatic genomics data, called driveR. Results: Combining genomics information and prior biological knowledge, driveR accurately prioritizes cancer driver genes via a multi-task learning model. Testing on 28 different datasets, this study demonstrates that driveR performs adequately, achieving a median AUC of 0.684 (range 0.651–0.861) on the 28 batch analysis test datasets, and a median AUC of 0.773 (range 0–1) on the 5157 personalized analysis test samples. Moreover, it outperforms existing approaches, achieving a significantly higher median AUC than all of MutSigCV (Wilcoxon rank-sum test p < 0.001), DriverNet (p < 0.001), OncodriveFML (p < 0.001) and MutPanning (p < 0.001) on batch analysis test datasets, and a significantly higher median AUC than DawnRank (p < 0.001) and PRODIGY (p < 0.001) on personalized analysis datasets. Conclusions: This study demonstrates that the proposed method is an accurate and easy-to-utilize approach for prioritizing driver genes in cancer genomes in personalized or batch analyses. driveR is available on CRAN: https://cran.r-project.org/package=driveR.
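As a hedged illustration of the evaluation metric quoted above (and not code from the driveR package), the sketch below computes an AUC as the probability that a randomly chosen known driver gene is scored above a randomly chosen non-driver, counting ties as one half, and then takes the median over several datasets. The function name `auc_score`, the scores, the labels, and the per-dataset values are all hypothetical.

```python
from statistics import median


def auc_score(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: truthy for known driver genes, falsy otherwise.
    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical prioritization scores for five genes; 1 marks a known driver.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
example_labels = [1, 0, 1, 0, 0]
example_auc = auc_score(example_scores, example_labels)  # 5 of 6 pairs correct

# Hypothetical per-dataset AUCs summarized by their median, as in a batch evaluation.
per_dataset_auc = [0.66, 0.70, 0.81]
median_auc = median(per_dataset_auc)
print(example_auc, median_auc)
```

The pairwise count is quadratic in the number of genes, which is fine for a sketch; a rank-based formulation gives the same value more efficiently on large gene lists.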
