Ranking cancer drivers via betweenness-based outlier detection and random walks

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan

Abstract Background Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes. Results We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient-specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets. Conclusions Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet-ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.
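The random-walk step described in the abstract can be illustrated with a generic random walk with restart on a bipartite graph. This is only a minimal sketch under stated assumptions: the abstract does not give the restart probability, edge weights, or restart distribution, so the `restart` value, the uniform restart over mutated genes, and the function name `random_walk_with_restart` are all hypothetical choices, not BetweenNet's actual implementation.

```python
from collections import defaultdict


def random_walk_with_restart(edges, restart=0.3, iters=100):
    """Rank mutated genes by a random walk with restart on a bipartite graph.

    edges: iterable of (mutated_gene, outlier_gene) pairs (hypothetical input).
    restart: restart probability (an assumed value, not taken from the paper).
    """
    # Undirected adjacency; tag each node with its side so names cannot clash.
    adj = defaultdict(set)
    for mutated_gene, outlier_gene in edges:
        adj[("mut", mutated_gene)].add(("out", outlier_gene))
        adj[("out", outlier_gene)].add(("mut", mutated_gene))
    nodes = sorted(adj)

    # Assumption: the walk restarts uniformly over the mutated genes.
    mutated = [n for n in nodes if n[0] == "mut"]
    p0 = {n: (1.0 / len(mutated) if n[0] == "mut" else 0.0) for n in nodes}

    p = dict(p0)
    for _ in range(iters):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node spreads its remaining mass evenly over its neighbours.
            share = (1.0 - restart) * p[n] / len(adj[n])
            for nb in adj[n]:
                nxt[nb] += share
        p = nxt

    # Higher visiting probability means higher priority; ties break by name.
    ranked = sorted(mutated, key=lambda n: (-p[n], n[1]))
    return [(n[1], p[n]) for n in ranked]


if __name__ == "__main__":
    toy_edges = [("TP53", "A"), ("TP53", "B"), ("TP53", "C"),
                 ("KRAS", "A"), ("EGFR", "D")]
    for gene, score in random_walk_with_restart(toy_edges):
        print(gene, round(score, 4))
```

In this toy setting a mutated gene connected to more outlier genes accumulates more probability mass, which is the intuition behind using the walk for prioritization.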

2020 ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan

Abstract Background: Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes. Results: We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient-specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets. Conclusions: Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet-ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.


2021 ◽  
Vol 16 ◽  
Author(s):  
Xianghua Peng ◽  
Fang Liu ◽  
Ping Liu ◽  
Xing Li ◽  
Xinguo Lu

Aim: In studying cancer initiation and progression, a great challenge is to identify the driver genes. Background: With advances in next-generation sequencing (NGS) technologies, identification of specific oncogenic genes has emerged through integrating multi-omics data. Although the existing computational models have identified many common driver genes, they rely on individual regulatory mechanisms or independent copy number variants, ignoring the dynamic function of genes in pathways and networks. Objective: The molecular metabolic pathway is a critical biological process in tumor initiation, progression and maintenance. Establishing the role of genes in pathways and networks helps to describe their functional roles under physiological and pathological conditions at multiple levels. Methods: We present a metabolic pathway-based driver gene identification method (pathDriver) to distinguish different cancer types/subtypes. In pathDriver, the metabolic pathway is combined with the protein-protein interaction network to construct the pathway network. Then the interaction frequency (IF) and inverse pathway frequency (IPF) are used to evaluate the collaborative impact factor of genes in the pathway network. Finally, the cancer-specific driver genes are identified by calculating the scores of edges connected to genes in the pathway network. Results: We applied it to 16 kinds of TCGA cancers for pan-cancer analysis. Conclusion: The driving pathway identified biologically significant known cancer genes and potential new candidate genes.
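The abstract names interaction frequency (IF) and inverse pathway frequency (IPF) but does not state their formulas. The sketch below borrows the familiar TF-IDF form purely as an assumption, to show how such a weighting could behave; the definitions of `interaction_freq` and `inverse_pathway_freq`, the function name `if_ipf`, and the input layout are all hypothetical and may differ from pathDriver.

```python
import math


def if_ipf(pathway_edges):
    """Score each (pathway, gene) pair with a TF-IDF-style weight.

    pathway_edges: dict mapping a pathway name to a list of
    (gene_a, gene_b) interactions in that pathway (hypothetical layout).
    """
    n_pathways = len(pathway_edges)

    # Count the number of pathways in which each gene appears.
    pathways_with_gene = {}
    for edges in pathway_edges.values():
        for gene in {g for edge in edges for g in edge}:
            pathways_with_gene[gene] = pathways_with_gene.get(gene, 0) + 1

    scores = {}
    for name, edges in pathway_edges.items():
        for gene in {g for edge in edges for g in edge}:
            # Assumed IF: share of the pathway's interactions involving the gene.
            interaction_freq = sum(gene in edge for edge in edges) / len(edges)
            # Assumed IPF: genes found in fewer pathways get a larger weight.
            inverse_pathway_freq = math.log(n_pathways / pathways_with_gene[gene])
            scores[(name, gene)] = interaction_freq * inverse_pathway_freq
    return scores


if __name__ == "__main__":
    toy = {"P1": [("A", "B"), ("A", "C")], "P2": [("A", "D")]}
    for key, value in sorted(if_ipf(toy).items()):
        print(key, round(value, 4))
```

Under these assumed definitions, a gene that occurs in every pathway receives a weight of zero, while a gene concentrated in one pathway's interactions scores highest there.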


2021 ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan ◽  
Ahmed Amine Taleb Bahmed

Abstract Motivation: A major challenge in cancer genomics is to distinguish the driver mutations that are causally linked to cancer from passenger mutations that do not contribute to cancer development. The majority of existing methods provide a single driver gene list for the entire cohort of patients. However, since mutation profiles of patients from the same cancer type show a high degree of heterogeneity, a more ideal approach is to identify patient-specific drivers. Results: We propose a novel method that integrates genomic data, biological pathways, and protein connectivity information for personalized identification of driver genes. The method is formulated on a personalized bipartite graph for each patient. Our approach provides a personalized ranking of the mutated genes of a patient based on the sum of weighted ‘pairwise pathway coverage’ scores across all the patients, where appropriate pairwise patient similarity scores are used as weights to normalize these coverage scores. We compare our method against three state-of-the-art patient-specific cancer gene prioritization methods. The comparisons are with respect to a novel evaluation method that takes into account the personalized nature of the problem. We show that our approach outperforms the existing alternatives for both the TCGA and the cell-line data. Additionally, we show that the KEGG/Reactome pathways enriched in our ranked genes and those that are enriched in cell lines’ reference sets overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods. Our findings can provide valuable information towards the development of personalized treatments and therapies. Availability: All the code and necessary datasets are available at https://github.com/abu-compbio/[email protected] or [email protected]


Author(s):  
Martin Pirkl ◽  
Niko Beerenwinkel

Abstract Motivation Cancer is one of the most prevalent diseases in the world. Tumors arise due to important genes changing their activity, e.g. when inhibited or over-expressed. But these gene perturbations are difficult to observe directly. Molecular profiles of tumors can provide indirect evidence of gene perturbations. However, inferring perturbation profiles from molecular alterations is challenging due to error-prone molecular measurements and incomplete coverage of all possible molecular causes of gene perturbations. Results We have developed a novel mathematical method to analyze cancer driver genes and their patient-specific perturbation profiles. We combine genetic aberrations with gene expression data in a causal network derived across patients to infer unobserved perturbations. We show that our method can predict perturbations in simulations, CRISPR perturbation screens and breast cancer samples from The Cancer Genome Atlas. Availability and implementation The method is available as the R-package nempi at https://github.com/cbg-ethz/nempi and http://bioconductor.org/packages/nempi. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Oriol Pich ◽  
Iker Reyes-Salazar ◽  
Abel Gonzalez-Perez ◽  
Nuria Lopez-Bigas

Abstract Mutations in genes that confer a selective advantage to hematopoietic stem cells (HSCs) in certain conditions drive clonal hematopoiesis (CH). While some CH drivers have been identified experimentally or through epidemiological studies, the compendium of all genes able to drive CH upon mutations in HSCs is far from complete. We propose that identifying signals of positive selection in blood somatic mutations may be an effective way to identify CH driver genes, similar to what is done to identify cancer genes. Using a reverse somatic variant calling approach, we repurposed whole-genome and whole-exome blood/tumor paired samples of more than 12,000 donors from two large cancer genomics cohorts to identify blood somatic mutations. The application of IntOGen, a robust driver discovery pipeline, to blood somatic mutations across both cohorts, and more than 24,000 targeted sequenced samples yielded a list of close to 70 genes with signals of positive selection in CH, available at http://www.intogen.org/ch. This approach recovers all known CH genes, and discovers novel candidates. Generating this compendium is an essential step to understand the molecular mechanisms of CH and to accurately detect individuals with CH to ascertain their risk to develop related diseases.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Felix Grassmann ◽  
Yudi Pawitan ◽  
Kamila Czene

Abstract Genes involved in cancer are under constant evolutionary pressure, potentially resulting in diverse molecular properties. In this study, we explore 23 omic features from publicly available databases to define the molecular profile of different classes of cancer genes. Cancer genes were grouped according to mutational landscape (germline and somatically mutated genes), role in cancer initiation (cancer driver genes) or cancer survival (survival genes), as well as being implicated by genome-wide association studies (GWAS genes). For each gene, we also computed feature scores based on all omic features, effectively summarizing how closely a gene resembles cancer genes of the respective class. In general, cancer genes are longer, have a lower GC content, have more isoforms with shorter exons, are expressed in more tissues and have more transcription factor binding sites than non-cancer genes. We found that germline genes more closely resemble single tissue GWAS genes while somatic genes are more similar to pleiotropic cancer GWAS genes. As a proof-of-principle, we utilized aggregated feature scores to prioritize genes in breast cancer GWAS loci and found that top ranking genes were enriched in cancer related pathways. In conclusion, we have identified multiple omic features associated with different classes of cancer genes, which can assist prioritization of genes in cancer gene discovery.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1289-D1301 ◽  
Author(s):  
Tao Wang ◽  
Shasha Ruan ◽  
Xiaolu Zhao ◽  
Xiaohui Shi ◽  
Huajing Teng ◽  
...  

Abstract The prevalence of neutral mutations in cancer cell populations impedes distinguishing cancer-causing driver mutations from passenger mutations. To systematically prioritize the oncogenic ability of somatic mutations and cancer genes, we constructed a useful platform, OncoVar (https://oncovar.org/), which employed published bioinformatics algorithms and incorporated known driver events to identify driver mutations and driver genes. We identified 20 162 cancer driver mutations, 814 driver genes and 2360 pathogenic pathways with high confidence by reanalyzing 10 769 exomes from 33 cancer types in The Cancer Genome Atlas (TCGA) and 1942 genomes from 18 cancer types in International Cancer Genome Consortium (ICGC). OncoVar provides four points of view, ‘Mutation’, ‘Gene’, ‘Pathway’ and ‘Cancer’, to help researchers visualize the relationships between cancers and driver variants. Importantly, identification of actionable driver alterations provides promising druggable targets and repurposing opportunities of combinational therapies. OncoVar provides a user-friendly interface for browsing, searching and downloading somatic driver mutations, driver genes and pathogenic pathways in various cancer types. This platform will facilitate the identification of cancer drivers across individual cancer cohorts and help rank mutations or genes for better decision-making among clinical oncologists, cancer researchers and the broad scientific community interested in cancer precision medicine.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Antonio Colaprico ◽  
Catharina Olsen ◽  
Matthew H. Bailey ◽  
Gabriel J. Odom ◽  
Thilde Terkelsen ◽  
...  

Abstract Cancer driver gene alterations influence cancer development, occurring in oncogenes, tumor suppressors, and dual role genes. Discovering dual role cancer genes is difficult because of their elusive context-dependent behavior. We define oncogenic mediators as genes controlling biological processes. With them, we classify cancer driver genes, unveiling their roles in cancer mechanisms. To this end, we present Moonlight, a tool that incorporates multiple -omics data to identify critical cancer driver genes. With Moonlight, we analyze 8000+ tumor samples from 18 cancer types, discovering 3310 oncogenic mediators, 151 having dual roles. By incorporating additional data (amplification, mutation, DNA methylation, chromatin accessibility), we reveal 1000+ cancer driver genes, corroborating known molecular mechanisms. Additionally, we confirm critical cancer driver genes by analysing cell-line datasets. We discover inactivation of tumor suppressors in intron regions and that tissue type and subtype indicate dual role status. These findings help explain tumor heterogeneity and could guide therapeutic decisions.


2021 ◽  
Vol 11 ◽  
Author(s):  
Chunyu Pan ◽  
Yuyan Zhu ◽  
Meng Yu ◽  
Yongkang Zhao ◽  
Changsheng Zhang ◽  
...  

Background: MYCN is an oncogenic transcription factor of the MYC family and plays an important role in the formation of tissues and organs during development before birth. Due to the difficulty in drugging MYCN directly, revealing the molecules in MYCN regulatory networks will help to identify effective therapeutic targets. Methods: We utilized network controllability theory, a recently developed powerful tool, to identify potential drug targets around MYCN based on the protein-protein interaction network of MYCN. First, we constructed a protein-protein interaction network of MYCN based on public databases. Second, network control analysis was applied to the network to identify driver genes and indispensable genes of the MYCN regulatory network. Finally, we developed a novel integrated approach to identify potential drug targets for regulating the function of the MYCN regulatory network. Results: We constructed an MYCN regulatory network that has 79 genes and 129 interactions. Based on network controllability theory, we analyzed driver genes that are capable of fully controlling the network. We found 10 indispensable genes whose alteration will significantly change the regulatory pathways of the MYCN network. We evaluated these genes through stability and correlation analysis and found that EGFR may be a potential drug target that is closely associated with MYCN. Conclusion: Together, our findings indicate that EGFR plays an important role in the regulatory network and pathways of MYCN and therefore may represent an attractive therapeutic target for cancer treatment.
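The network control analysis mentioned in the abstract can be sketched with the standard maximum-matching formulation of structural controllability, in which nodes left unmatched by a maximum matching of the directed network act as driver nodes. The abstract does not describe the authors' exact procedure, so this is an illustrative sketch only: the function names are hypothetical, and "indispensable" is taken here in one common sense (removing the node increases the number of driver nodes), which is an assumption about the paper's usage.

```python
def max_matching(nodes, edges):
    """Maximum matching of a directed graph via Kuhn's augmenting paths.

    Each directed edge (u, v) links the "out" copy of u to the "in" copy of v.
    Returns a dict mapping each matched target node to its source node.
    """
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_by = {}  # target node -> source node matched to it

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current source can be re-routed.
            if v not in matched_by or try_augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return matched_by


def driver_nodes(nodes, edges):
    """Nodes whose "in" copy is unmatched; at least one driver is always needed."""
    matched = max_matching(nodes, edges)
    unmatched = [v for v in nodes if v not in matched]
    return unmatched if unmatched else [nodes[0]]


def indispensable_nodes(nodes, edges):
    """Nodes whose removal increases the number of driver nodes (assumed definition)."""
    base = len(driver_nodes(nodes, edges))
    result = []
    for x in nodes:
        rest = [n for n in nodes if n != x]
        rest_edges = [(u, v) for u, v in edges if x not in (u, v)]
        if rest and len(driver_nodes(rest, rest_edges)) > base:
            result.append(x)
    return result


if __name__ == "__main__":
    toy_nodes = ["a", "b", "c", "d"]
    toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]  # toy chain, not MYCN data
    print(driver_nodes(toy_nodes, toy_edges))
    print(indispensable_nodes(toy_nodes, toy_edges))
```

The set of driver nodes returned by a maximum matching is generally not unique, but its size is, which is why the indispensability check compares counts rather than node identities.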


2016 ◽  
Author(s):  
Wenjing Teng ◽  
Yan Li ◽  
Chao Zhou

Objective: To develop a protein-protein interaction network of rectal cancer based on genes related to the disease, to predict the biological pathways underlying the molecular complexes in the network, and to analyze and summarize genetic markers related to the diagnosis and prognosis of rectal cancer. Methods: The gene expression profile was downloaded from the OMIM (Online Mendelian Inheritance in Man) database; the protein-protein interaction network of rectal cancer was established with Cytoscape; the molecular complexes in the network were detected with the Clusterviz plugin; and pathway enrichment of the molecular complexes was performed with DAVID online and Bingo (The Biological Networks Gene Ontology tool). Results and Discussion: A total of 127 rectal cancer genes were identified as differentially expressed in the OMIM database. The protein-protein interaction network of rectal cancer contained 966 nodes (proteins), 3377 edges (interactive relationships) and 7 molecular complexes (score>7.0). The regulatory effects of the genes and proteins were focused on cell cycle, transcription regulation and cellular protein metabolic process. The genes DDK1, sparcl1, wisp2, cux1, pabpc1, ptk2 and htra1 were significant nodes in the PPI network. The discovery of these featured genes, which are probably related to rectal cancer, is of great significance for studying its mechanism, distinguishing normal from cancer tissues, and exploring new treatments for rectal cancer.

