gcMECM: graph clustering of mutual exclusivity of cancer mutations

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ying Hu ◽  
Chunhua Yan ◽  
Qingrong Chen ◽  
Daoud Meerzaman

Abstract Background Next-generation sequencing platforms allow us to sequence millions of small fragments of DNA simultaneously, revolutionizing cancer research. Sequence analysis has revealed that cancer driver genes operate across multiple intricate pathways and networks with mutations often occurring in a mutually exclusive pattern. Currently, low-frequency mutations are understudied as cancer-relevant genes, especially in the context of networks. Results Here we describe a tool, gcMECM, that enables us to visualize the functionality of mutually exclusive genes in the subnetworks derived from mutation associations, gene–gene interactions, and graph clustering. These subnetworks have revealed crucial biological components in the canonical pathway, especially those mutated at low frequency. Examining the subnetwork, and not just the impact of a single gene, significantly increases the statistical power of clinical analysis and enables us to build models to better predict how and why cancer develops. Conclusions gcMECM uses a computationally efficient and scalable algorithm to identify subnetworks in a canonical pathway with mutually exclusive mutation patterns and distinct biological functions.
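The abstract does not say which statistic gcMECM uses to score mutation associations. As a purely illustrative sketch, mutual exclusivity between two genes can be assessed with a one-sided hypergeometric (Fisher-type) test on per-sample mutation indicators. The function name `mutual_exclusivity_pvalue` and the 0/1 input vectors are assumptions made for this example and are not taken from the gcMECM software.

```python
from math import comb


def mutual_exclusivity_pvalue(mut_a, mut_b):
    """Return P(co-occurrence <= observed) under independence.

    mut_a, mut_b: equal-length sequences of 0/1 flags, one per sample,
    marking whether gene A / gene B is mutated in that sample.
    A small value suggests the two genes are mutated together less often
    than expected by chance (a mutually exclusive pattern).
    Hypothetical helper; not the statistic used by gcMECM.
    """
    if len(mut_a) != len(mut_b):
        raise ValueError("both genes must be scored on the same samples")

    n = len(mut_a)                      # total samples
    a = sum(1 for x in mut_a if x)      # samples with gene A mutated
    b = sum(1 for y in mut_b if y)      # samples with gene B mutated
    both = sum(1 for x, y in zip(mut_a, mut_b) if x and y)

    # Hypergeometric lower tail: K = number of co-mutated samples.
    lowest = max(0, a + b - n)
    favourable = sum(
        comb(a, k) * comb(n - a, b - k) for k in range(lowest, both + 1)
    )
    return favourable / comb(n, b)


if __name__ == "__main__":
    gene_a = [1, 1, 1, 0, 0, 0]
    gene_b = [0, 0, 0, 1, 1, 1]
    print(mutual_exclusivity_pvalue(gene_a, gene_b))
```

Gene pairs with small p-values could then serve as edges of a mutation-association graph before clustering. That downstream step is likewise an assumption here, not a description of the published pipeline.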

2021 ◽  
Author(s):  
Nikolaos Lykoskoufis ◽  
Evarist Planet ◽  
Halit Ongen ◽  
Didier Trono ◽  
Emmanouil T Dermitzakis

Abstract Transposable elements (TEs) are interspersed repeats that contribute to more than half of the human genome, and TE-embedded regulatory sequences are increasingly recognized as major components of the human regulome. Perturbations of this system can contribute to tumorigenesis, but the impact of TEs on gene expression in cancer cells remains to be fully assessed. Here, we analyzed 275 normal colon and 276 colorectal cancer (CRC) samples from the SYSCOL colorectal cancer cohort and discovered 10,111 and 5,152 TE expression quantitative trait loci (eQTLs) in normal and tumor tissues, respectively. Amongst the latter, 376 were exclusive to CRC, likely driven by changes in methylation patterns. We identified that transcription factors are more enriched in tumor-specific TE-eQTLs than shared TE-eQTLs, indicating that TEs are more specifically regulated in tumor than normal. Using Bayesian Networks to assess the causal relationship between eQTL variants, TEs and genes, we identified that 1,758 TEs are mediators of genetic effect, altering the expression of 1,626 nearby genes significantly more in tumor compared to normal, of which 51 are cancer driver genes. We show that tumor-specific TE-eQTLs trigger the driver capability of TEs subsequently impacting expression of nearby genes. Collectively, our results highlight a global profile of a new class of cancer drivers, thereby enhancing our understanding of tumorigenesis and providing potential new candidate mechanisms for therapeutic target development.


2019 ◽  
Vol 12 (S7) ◽  
Author(s):  
Ying Hui ◽  
Pi-Jing Wei ◽  
Junfeng Xia ◽  
Yu-Tian Wang ◽  
Chun-Hou Zheng

Abstract Background Although huge volumes of genomic data are available, deciphering them and identifying driver events remains a challenge. Current network-based methods typically use the relationship between genomic events and consequent changes in gene expression to nominate putative driver genes, but relationships may also exist within the transcriptional network itself. Methods We developed MECoRank, a novel method that improves the recognition accuracy of driver genes. MECoRank uses a bipartite graph to propagate scores via an iterative process. After iteration, we obtain a ranked gene list for each patient sample. We then apply the Condorcet voting method to determine the most impactful drivers in a population. Results We applied MECoRank to three cancer datasets to reveal candidate driver genes that have a greater impact on gene expression. Experimental results show that our method not only identifies more validated driver genes than other methods, but also recognizes some impactful novel genes that the literature has shown to be important. Conclusions We propose a novel approach named MECoRank to prioritize driver genes based on their impact on expression in the molecular interaction network. This method assesses not only a mutation’s effect on the transcriptional network, but also the effect of differential expression within the transcriptional network. The results demonstrate that MECoRank performs better than competing approaches in identifying driver genes.
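The aggregation step described above, which combines per-patient ranked gene lists by Condorcet voting, can be sketched as follows. The abstract does not say which Condorcet variant MECoRank uses, so this example assumes a Copeland-style count of pairwise majority wins. The function name `condorcet_rank`, the tie-breaking by gene name, and the treatment of genes missing from a list are all assumptions made for illustration.

```python
from itertools import combinations


def condorcet_rank(rankings):
    """Aggregate ranked gene lists by pairwise majority (Copeland scoring).

    rankings: list of lists; each inner list orders genes from most to
    least impactful for one patient sample.
    Returns genes sorted by number of pairwise contests won.
    Assumption: a gene absent from a list ranks below every listed gene.
    """
    genes = sorted({g for ranking in rankings for g in ranking})
    positions = [{g: i for i, g in enumerate(r)} for r in rankings]
    wins = {g: 0 for g in genes}

    for g, h in combinations(genes, 2):
        g_votes = 0
        h_votes = 0
        for pos in positions:
            rank_g = pos.get(g, len(pos))
            rank_h = pos.get(h, len(pos))
            if rank_g < rank_h:
                g_votes += 1
            elif rank_h < rank_g:
                h_votes += 1
        # The gene preferred by more samples wins this pairwise contest.
        if g_votes > h_votes:
            wins[g] += 1
        elif h_votes > g_votes:
            wins[h] += 1

    # Most pairwise wins first; ties broken alphabetically for determinism.
    return sorted(genes, key=lambda g: (-wins[g], g))


if __name__ == "__main__":
    per_patient = [
        ["TP53", "KRAS", "PTEN"],
        ["TP53", "PTEN", "KRAS"],
        ["KRAS", "TP53", "PTEN"],
    ]
    print(condorcet_rank(per_patient))
```

In this toy input TP53 beats both other genes in pairwise majority votes, so it is the Condorcet winner and is ranked first.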


2021 ◽  
Vol 10 (9) ◽  
pp. 1827
Author(s):  
Camille Péneau ◽  
Jessica Zucman-Rossi ◽  
Jean-Charles Nault

Virus-related liver carcinogenesis is one of the main contributors to cancer-related death worldwide, mainly due to the impact of chronic hepatitis B and C infections. Three mechanisms have been proposed to explain the oncogenic properties of hepatitis B virus (HBV) infection: induction of chronic inflammation and cirrhosis, expression of HBV oncogenic proteins, and insertional mutagenesis into the genome of infected hepatocytes. Hepatitis B insertional mutagenesis modifies the function of cancer driver genes and could promote chromosomal instability. In contrast, hepatitis C virus promotes hepatocellular carcinoma (HCC) occurrence mainly through cirrhosis development, whereas the direct oncogenic role of the virus in humans remains debated. Finally, adeno-associated virus type 2 (AAV2), a defective DNA virus, has been associated with the occurrence of HCC harboring insertional mutagenesis of the virus. Since these tumors developed in a non-cirrhotic context and in the absence of a known etiological factor, AAV2 appears to be the direct cause of tumor development in these patients, via a mechanism of insertional mutagenesis altering oncogenes and tumor suppressor genes similar to those targeted by HBV. A better understanding of virus-related oncogenesis will help to develop new preventive strategies and therapies directed against specific alterations observed in virus-related HCC.


2021 ◽  
Author(s):  
Nikolaos M. R. Lykoskoufis ◽  
Evarist Planet ◽  
Halit Ongen ◽  
Didier Trono ◽  
Emmanouil T. Dermitzakis

Abstract Transposable elements (TEs) are interspersed repeats that contribute to more than half of the human genome, and TE-embedded regulatory sequences are increasingly recognized as major components of the human regulome. Perturbations of this system can contribute to tumorigenesis, but the impact of TEs on gene expression in cancer cells remains to be fully assessed. Here, we analyzed 275 normal colon and 276 colorectal cancer (CRC) samples from the SYSCOL colorectal cancer cohort and discovered 10,111 and 5,152 TE expression quantitative trait loci (eQTLs) in normal and tumor tissues, respectively. Amongst the latter, 376 were exclusive to CRC, likely driven by changes in methylation patterns. We identified that transcription factors are more enriched in tumor-specific TE-eQTLs than shared TE-eQTLs, indicating that TEs are more specifically regulated in tumor than normal. Using Bayesian Networks to assess the causal relationship between eQTL variants, TEs and genes, we identified that 1,758 TEs are mediators of genetic effect, altering the expression of 1,626 nearby genes significantly more in tumor compared to normal, of which 51 are cancer driver genes. We show that tumor-specific TE-eQTLs trigger the driver capability of TEs subsequently impacting expression of nearby genes. Collectively, our results highlight a global profile of a new class of cancer drivers, thereby enhancing our understanding of tumorigenesis and providing potential new candidate mechanisms for therapeutic target development.


2020 ◽  
Author(s):  
Vu VH Pham ◽  
Lin Liu ◽  
Cameron P Bracken ◽  
Gregory J Goodall ◽  
Jiuyong Li ◽  
...  

Abstract Motivation Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesise that there are driver gene groups that work in concert to regulate cancer and we develop a novel computational method to detect those driver gene groups. Results We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (1) Constructing the gene network, (2) Discovering critical nodes of the constructed network, and (3) Identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we firstly assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify coding and non-coding driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. Availability and implementation DriverGroup is available at https://github.com/pvvhoang/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i583-i591
Author(s):  
Vu V H Pham ◽  
Lin Liu ◽  
Cameron P Bracken ◽  
Gregory J Goodall ◽  
Jiuyong Li ◽  
...  

Abstract Motivation Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesize that there are driver gene groups that work in concert to regulate cancer, and we develop a novel computational method to detect those driver gene groups. Results We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (i) constructing the gene network, (ii) discovering critical nodes of the constructed network and (iii) identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we firstly assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. Availability and implementation DriverGroup is available at https://github.com/pvvhoang/DriverGroup Supplementary information Supplementary data are available at Bioinformatics online.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Malvika Sudhakar ◽  
Raghunathan Rengaswamy ◽  
Karthik Raman

Abstract An emergent area of cancer genomics is the identification of driver genes. Driver genes confer a selective growth advantage to the cell. While several driver genes have been discovered, many remain undiscovered, especially those mutated at a low frequency across samples. This study defines new features and builds a pan-cancer model, cTaG, to identify new driver genes. The features capture the functional impact of the mutations as well as their recurrence across samples, which helps build a model unbiased to genes with low frequency. The model classifies genes into the functional categories of driver genes, tumour suppressor genes (TSGs) and oncogenes (OGs), having distinct mutation type profiles. We overcome overfitting and show that certain mutation types, such as nonsense mutations, are more important for classification. Further, cTaG was employed to identify tissue-specific driver genes. Some known cancer driver genes predicted by cTaG as TSGs with high probability are ARID1A, TP53, and RB1. In addition to these known genes, potential driver genes predicted are CD36, ZNF750 and ARHGAP35 as TSGs and TAB3 as an oncogene. Overall, our approach surmounts the issue of low recall and bias towards genes with high mutation rates and predicts potential new driver genes for further experimental screening. cTaG is available at https://github.com/RamanLab/cTaG.


2020 ◽  
Author(s):  
Malvika Sudhakar ◽  
Raghunathan Rengaswamy ◽  
Karthik Raman

Abstract An emergent area of cancer genomics has been the identification of driver genes. Driver genes confer a selective growth advantage to the cell and push it towards tumorigenesis. Functionally, driver genes can be divided into two categories, tumour suppressor genes (TSGs) and oncogenes (OGs), which have distinct mutation type profiles. While several driver genes have been discovered, many remain undiscovered, especially those that are mutated at a low frequency across samples. The current methods are not sufficient to predict all driver genes because the underlying characteristics of these genes are not yet well understood. Thus, to predict novel genes, we need to define new features and models that are not biased and that identify genes that might otherwise be overshadowed by the mutation profiles of recurrent driver genes. In this study, we define new features and build a model to identify novel driver genes. We overcome overfitting and show that certain mutation types, such as nonsense mutations, are more important for classification. Some known cancer driver genes predicted by the model as TSGs with high probability are ARID1A, TP53, and RB1. In addition to these known genes, potential driver genes predicted are CD36, ZNF750 and ARHGAP35 as TSGs and TAB3 as an oncogene. Overall, our approach surmounts the issue of low recall and bias towards genes with high mutation rates and predicts potential novel driver genes for further experimental screening.


2020 ◽  
Author(s):  
Gulden Olgun ◽  
Oznur Tastan

Abstract The dysregulation of long non-coding RNA (lncRNA) expression has been implicated in cancer. Since most lncRNAs are not functionally well characterized, investigating the set of perturbed lncRNAs is challenging. Existing methods that inspect lncRNAs functionally rely on the co-expressed coding genes, which are far better characterized functionally. LncRNAs are known to act as transcriptional regulators; they may activate or repress neighboring coding genes on the genome. Based on this, in this work, we aim to analyze the deregulated lncRNAs in cancer by taking into account their ability to regulate nearby loci on the genome. We perform functional analysis on differentially expressed lncRNAs for 28 different cancers, considering their adjacent coding genes. We find that some deregulated lncRNAs are cancer-specific, but a substantial number of lncRNAs are shared across cancers. Next, we assess the similarities of the cancer types based on the functional enrichment of the deregulated lncRNA sets. We find that some cancers are very similar in the functions and biological processes related to the deregulated lncRNAs. We observe that some of the cancers for which we find similarity can be linked through primary and metastatic site relations. We investigate the similarity of enriched functional terms for the deregulated lncRNAs and the mRNAs. We further assess the similarity of the enriched functions to the functions and processes in which the known cancer driver genes take part. We believe that our methodology helps to understand the impact of lncRNAs in cancer functionally.


2019 ◽  
Author(s):  
Tuan Trieu ◽  
Ekta Khurana

Three-dimensional structures of the genome play an important role in regulating the expression of genes. Non-coding variants have been shown to alter 3D genome structures to activate oncogenes in cancer. However, there is currently no method to predict the effect of DNA variants on 3D structures. We propose a deep learning method, DeepMILO, to learn DNA sequence features of CTCF/cohesin-mediated loops and to predict the effect of variants on these loops. DeepMILO consists of a convolutional and a recurrent neural network, and it can learn features beyond the presence of CTCF motifs and their orientations. Application of DeepMILO on a cohort of 241 malignant lymphoma patients with whole-genome sequences revealed CTCF/cohesin-mediated loops disrupted in multiple patients. These disrupted loops contain known cancer driver genes and novel genes. Our results show mutations at loop boundaries are associated with upregulation of the cancer driver gene BCL2 and may point to a possible new mechanism for its dysregulation via alteration of 3D loop structures.

