Simulated Annealing Based Algorithm for Identifying Mutated Driver Pathways in Cancer

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Hai-Tao Li ◽  
Yu-Lang Zhang ◽  
Chun-Hou Zheng ◽  
Hong-Qiang Wang

With the development of next-generation DNA sequencing technologies, large-scale cancer genomics projects can help researchers identify the driver genes, driver mutations, and driver pathways that promote cancer proliferation in large numbers of cancer patients. One of the remaining challenges is to distinguish the functional mutations vital for cancer development from the nonfunctional, random “passenger mutations.” In this study, we introduce a modified method to solve the so-called maximum weight submatrix problem, which is used to identify mutated driver pathways in cancer. The problem is based on two combinatorial properties, namely coverage and exclusivity. In particular, we enhance an integrative model that combines gene mutation and expression data. The experimental results on simulated data show that, compared with the other methods, our method is more efficient. Finally, we apply the proposed method to two real biological datasets. The results show that our proposed method is also applicable in real practice.
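As a rough illustration of the idea in the abstract above, the following minimal sketch runs simulated annealing on a tiny mutation table. The abstract does not state its scoring function, so the sketch assumes the commonly used weight W(M) = 2|Γ(M)| − Σ_{g∈M} |Γ(g)|, where Γ(g) is the set of patients mutated in gene g and Γ(M) is the set of patients covered by gene set M. It does not include the expression-data integration the authors describe. The gene names, patient IDs, function names, and annealing parameters are all hypothetical.

```python
import math
import random

def weight(mutations, genes):
    """Assumed score: 2 * (patients covered) - (total mutations in the gene set).

    High coverage raises the score; overlapping (non-exclusive) mutations lower it.
    """
    covered = set().union(*(mutations[g] for g in genes))
    return 2 * len(covered) - sum(len(mutations[g]) for g in genes)

def anneal(mutations, k, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Search for a k-gene set with maximal weight by simulated annealing."""
    all_genes = sorted(mutations)
    if not 0 < k < len(all_genes):
        raise ValueError("k must be between 1 and the number of genes minus 1")
    rng = random.Random(seed)
    current = rng.sample(all_genes, k)
    cur_w = weight(mutations, current)
    best, best_w = list(current), cur_w
    t = t0
    for _ in range(steps):
        # Propose a neighbour: swap one gene in the set for one outside it.
        outside = [g for g in all_genes if g not in current]
        candidate = list(current)
        candidate[rng.randrange(k)] = rng.choice(outside)
        cand_w = weight(mutations, candidate)
        # Always accept improvements; accept worse sets with Boltzmann probability.
        if cand_w >= cur_w or rng.random() < math.exp((cand_w - cur_w) / t):
            current, cur_w = candidate, cand_w
            if cur_w > best_w:
                best, best_w = list(current), cur_w
        t *= cooling  # geometric cooling schedule (an arbitrary choice)
    return sorted(best), best_w

# Hypothetical toy data: gene -> set of patient IDs carrying a mutation in it.
toy = {
    "A": {1, 2},
    "B": {3, 4},
    "C": {5, 6},
    "D": {1, 2, 3},
    "E": {1},
}

print(anneal(toy, k=3))
```

In this toy table, genes A, B, and C together cover all six patients with no overlap, so they form the highest-scoring set of three under the assumed weight.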

PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242780
Author(s):  
Houriiyah Tegally ◽  
Kevin H. Kensler ◽  
Zahra Mungloo-Dilmohamud ◽  
Anisah W. Ghoorah ◽  
Timothy R. Rebbeck ◽  
...  

As the genomic profile across cancers varies from person to person, patient prognosis and treatment may differ based on the mutational signature of each tumour. Thus, it is critical to understand genomic drivers of cancer and identify potential mutational commonalities across tumors originating at diverse anatomical sites. Large-scale cancer genomics initiatives, such as TCGA, ICGC and GENIE have enabled the analysis of thousands of tumour genomes. Our goal was to identify new cancer-causing mutations that may be common across tumour sites using mutational and gene expression profiles. Genomic and transcriptomic data from breast, ovarian, and prostate cancers were aggregated and analysed using differential gene expression methods to identify the effect of specific mutations on the expression of multiple genes. Mutated genes associated with the most differentially expressed genes were considered to be novel candidates for driver mutations, and were validated through literature mining, pathway analysis and clinical data investigation. Our driver selection method successfully identified 116 probable novel cancer-causing genes, with 4 discovered in patients having no alterations in any known driver genes: MXRA5, OBSCN, RYR1, and TG. The candidate genes previously not officially classified as cancer-causing showed enrichment in cancer pathways and in cancer diseases. They also matched expectations pertaining to properties of cancer genes, for instance, showing larger gene and protein lengths, and having mutation patterns suggesting oncogenic or tumor suppressor properties. Our approach allows for the identification of novel putative driver genes that are common across cancer sites using an unbiased approach without any a priori knowledge on pathways or gene interactions and is therefore an agnostic approach to the identification of putative common driver genes acting at multiple cancer sites.


2021 ◽  
Author(s):  
Theresa A Harbig ◽  
Sabrina Nusrat ◽  
Tali Mazor ◽  
Qianwen Wang ◽  
Alexander Thomson ◽  
...  

Molecular profiling of patient tumors and liquid biopsies over time with next-generation sequencing technologies and new immuno-profile assays are becoming part of standard research and clinical practice. With the wealth of new longitudinal data, there is a critical need for visualizations for cancer researchers to explore and interpret temporal patterns not just in a single patient but across cohorts. To address this need we developed OncoThreads, a tool for the visualization of longitudinal clinical and cancer genomics and other molecular data in patient cohorts. The tool visualizes patient cohorts as temporal heatmaps and Sankey diagrams that support the interactive exploration and ranking of a wide range of clinical and molecular features. This allows analysts to discover temporal patterns in longitudinal data, such as the impact of mutations on response to a treatment, e.g. emergence of resistant clones. We demonstrate the functionality of OncoThreads using a cohort of 23 glioma patients sampled at 2-4 timepoints. OncoThreads is freely available at http://oncothreads.gehlenborglab.org and implemented in Javascript using the cBioPortal web API as a backend.


2019 ◽  
Vol 20 (19) ◽  
pp. 4711 ◽  
Author(s):  
Ilda Patrícia Ribeiro ◽  
Joana Barbosa Melo ◽  
Isabel Marques Carreira

The availability of cytogenetics and cytogenomics technologies improved the detection and identification of tumor molecular signatures as well as the understanding of cancer initiation and progression. The use of large-scale and high-throughput cytogenomics technologies has led to the rapid identification of several candidate cancer biomarkers associated with diagnosis, prognosis, and therapeutics. The advent of array comparative genomic hybridization and next-generation sequencing technologies has significantly improved the knowledge about cancer biology, highlighting driver genes to guide targeted therapy development, drug-resistance prediction, and pharmacogenetics. However, few of these candidate biomarkers have made the transition to the clinic with a clear benefit for the patients. Technological progress helped to demonstrate that cellular heterogeneity plays a significant role in tumor progression and resistance/sensitivity to cancer therapies, representing the major challenge of precision cancer therapy. A paradigm shift has been introduced in cancer genomics with the recent advent of single-cell sequencing, since it has many applications with a clear benefit to oncological patients, namely, detection of intra-tumoral heterogeneity, mapping clonal evolution, monitoring the development of therapy resistance, and detection of rare tumor cell populations. It now seems evident that no single biomarker can provide all the information necessary for early detection and for predicting the behavior and prognosis of tumors. The promise of precision medicine is based on the molecular profiling of tumors, which makes vital the continuous progress of high-throughput technologies and the multidisciplinary efforts to catalogue chromosomal rearrangements and genomic alterations of human cancers and to interpret the genotype-phenotype relationship correctly.


2021 ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan ◽  
Ahmed Amine Taleb Bahmed

Abstract Motivation: A major challenge in cancer genomics is to distinguish the driver mutations that are causally linked to cancer from passenger mutations that do not contribute to cancer development. The majority of existing methods provide a single driver gene list for the entire cohort of patients. However, since mutation profiles of patients from the same cancer type show a high degree of heterogeneity, a more ideal approach is to identify patient-specific drivers. Results: We propose a novel method that integrates genomic data, biological pathways, and protein connectivity information for personalized identification of driver genes. The method is formulated on a personalized bipartite graph for each patient. Our approach provides a personalized ranking of the mutated genes of a patient based on the sum of weighted ‘pairwise pathway coverage’ scores across all the patients, where appropriate pairwise patient similarity scores are used as weights to normalize these coverage scores. We compare our method against three state-of-the-art patient-specific cancer gene prioritization methods. The comparisons are with respect to a novel evaluation method that takes into account the personalized nature of the problem. We show that our approach outperforms the existing alternatives for both the TCGA and the cell-line data. Additionally, we show that the KEGG/Reactome pathways enriched in our ranked genes and those that are enriched in cell lines’ reference sets overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods. Our findings can provide valuable information towards the development of personalized treatments and therapies. Availability: All the code and necessary datasets are available at https://github.com/abu-compbio/[email protected] or [email protected]


2018 ◽  
Author(s):  
Collin Tokheim ◽  
Rachel Karchin

Summary: Large-scale cancer sequencing studies of patient cohorts have statistically implicated many genes driving cancer growth and progression, and their identification has yielded substantial translational impact. However, a remaining challenge is to increase the resolution of driver prediction from the gene level to the mutation level, because mutation-level predictions are more closely aligned with the goal of precision cancer medicine. Here we present CHASMplus, a computational method that is uniquely capable of identifying driver missense mutations, including those specific to a cancer type, as evidenced by significantly superior performance on diverse benchmarks. Applied to 8,657 tumor samples across 32 cancer types in The Cancer Genome Atlas, CHASMplus identifies over 4,000 unique driver missense mutations in 240 genes, supporting a prominent role for rare driver mutations. We show which TCGA cancer types are likely to yield discovery of new driver missense mutations by additional sequencing, which has important implications for public policy. Significance: Missense mutations are the most frequent mutation type in cancers and the most difficult to interpret. While many computational methods have been developed to predict whether genes are cancer drivers or whether missense mutations are generally deleterious or pathogenic, there has not previously been a method to score the oncogenic impact of a missense mutation specifically by cancer type, limiting adoption of computational missense mutation predictors in the clinic. Cancer patients are routinely sequenced with targeted panels of cancer driver genes, but such genes contain a mixture of driver and passenger missense mutations which differ by cancer type. A patient’s therapeutic response to drugs and optimal assignment to a clinical trial depend on both the specific mutation in the gene of interest and the cancer type. We present a new machine learning method honed for each TCGA cancer type, and a resource for fast lookup of the cancer-specific driver propensity of every possible missense mutation in the human exome.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shumaila Sayyab ◽  
Anders Lundmark ◽  
Malin Larsson ◽  
Markus Ringnér ◽  
Sara Nystedt ◽  
...  

Abstract The mechanisms driving clonal heterogeneity and evolution in relapsed pediatric acute lymphoblastic leukemia (ALL) are not fully understood. We performed whole genome sequencing of samples collected at diagnosis, relapse(s) and remission from 29 Nordic patients. Somatic point mutations and large-scale structural variants were called using individually matched remission samples as controls, and allelic expression of the mutations was assessed in ALL cells using RNA-sequencing. We observed an increased burden of somatic mutations at relapse, compared to diagnosis, and at second relapse compared to first relapse. In addition to 29 known ALL driver genes, of which nine genes carried recurrent protein-coding mutations in our sample set, we identified putative non-protein coding mutations in regulatory regions of seven additional genes that have not previously been described in ALL. Cluster analysis of hundreds of somatic mutations per sample revealed three distinct evolutionary trajectories during ALL progression from diagnosis to relapse. The evolutionary trajectories provide insight into the mutational mechanisms leading to relapse in ALL and could offer biomarkers for improved risk prediction in individual patients.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan

Abstract Background Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes. Results We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient-specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets. Conclusions Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.


2021 ◽  
Author(s):  
Parsoa Khorsand ◽  
Fereydoun Hormozdiari

Abstract Large-scale catalogs of common genetic variants (including indels and structural variants) are being created using data from second- and third-generation whole-genome sequencing technologies. However, the genotyping of these variants in newly sequenced samples is a nontrivial task that requires extensive computational resources. Furthermore, current approaches are mostly limited to specific types of variants and are generally prone to various errors and ambiguities when genotyping complex events. We propose an ultra-efficient approach for genotyping any type of structural variation that is not limited by the shortcomings and complexities of current mapping-based approaches. Our method, Nebula, utilizes the changes in the count of k-mers to predict the genotype of structural variants. We show that Nebula is not only an order of magnitude faster than mapping-based approaches for genotyping structural variants, but also has accuracy comparable to state-of-the-art approaches. Furthermore, Nebula is a generic framework not limited to any specific type of event. Nebula is publicly available at https://github.com/Parsoa/Nebula.
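To make the k-mer counting idea above concrete, here is a deliberately simplified sketch; it is not Nebula's actual model or API. It assumes the reference and alternate allele sequences around a variant are known, collects the k-mers unique to each allele, counts them in the reads, and calls a genotype from the share of alternate-allele support. The function names, the value of k, and the thresholds `low` and `high` are hypothetical, and reverse complements and sequencing errors are ignored.

```python
from collections import Counter

def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def genotype(reads, ref_allele, alt_allele, k=5, low=0.2, high=0.8):
    """Call 0/0, 0/1, or 1/1 from counts of allele-specific k-mers in the reads."""
    ref_set = set(kmers(ref_allele, k))
    alt_set = set(kmers(alt_allele, k))
    ref_only = ref_set - alt_set  # k-mers seen only when the reference allele is present
    alt_only = alt_set - ref_set  # k-mers seen only when the variant is present
    if not ref_only or not alt_only:
        return "./."  # alleles cannot be told apart at this k

    counts = Counter(km for read in reads for km in kmers(read, k))
    # Average count per signature k-mer, so alleles with more k-mers are not favoured.
    ref_depth = sum(counts[km] for km in ref_only) / len(ref_only)
    alt_depth = sum(counts[km] for km in alt_only) / len(alt_only)
    total = ref_depth + alt_depth
    if total == 0:
        return "./."  # no informative k-mers observed

    alt_fraction = alt_depth / total
    if alt_fraction < low:
        return "0/0"
    if alt_fraction > high:
        return "1/1"
    return "0/1"

# Hypothetical example: a 3-base deletion (GGG) relative to the reference allele.
REF = "AAACCCGGGTTT"
ALT = "AAACCCTTT"
print(genotype([REF, REF, ALT, ALT], REF, ALT))
```

Because the call depends only on k-mer counts, no read alignment is needed, which is the property the abstract credits for the speed advantage over mapping-based genotyping.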


Genetics ◽  
2003 ◽  
Vol 165 (4) ◽  
pp. 2269-2282
Author(s):  
D Mester ◽  
Y Ronin ◽  
D Minkov ◽  
E Nevo ◽  
A Korol

Abstract This article is devoted to the problem of ordering in linkage groups with many dozens or even hundreds of markers. The ordering problem belongs to the field of discrete optimization on a set of all possible orders, amounting to n!/2 for n loci; hence it is considered an NP-hard problem. Several authors attempted to employ the methods developed in the well-known traveling salesman problem (TSP) for multilocus ordering, using the assumption that for a set of linked loci the true order will be the one that minimizes the total length of the linkage group. A novel, fast, and reliable algorithm developed for the TSP and based on evolution-strategy discrete optimization was applied in this study for multilocus ordering on the basis of pairwise recombination frequencies. The quality of derived maps under various complications (dominant vs. codominant markers, marker misclassification, negative and positive interference, and missing data) was analyzed using simulated data with ∼50-400 markers. High performance of the employed algorithm allows systematic treatment of the problem of verification of the obtained multilocus orders on the basis of computing-intensive bootstrap and/or jackknife approaches for detecting and removing questionable marker scores, thereby stabilizing the resulting maps. Parallel calculation technology can easily be adopted for further acceleration of the proposed algorithm. Real data analysis (on maize chromosome 1 with 230 markers) is provided to illustrate the proposed methodology.
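The objective described above, choosing the marker order that minimizes total map length, can be illustrated on a toy case. The sketch below uses exhaustive search rather than the evolution-strategy algorithm of the article, so it is only feasible for a handful of markers, but it makes the n!/2 search space explicit by skipping mirror-image orders. The marker names, positions, and the additive distance table are hypothetical stand-ins for pairwise recombination frequencies.

```python
from itertools import permutations

def map_length(order, dist):
    """Total length of a linkage map: sum of distances between adjacent markers."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def best_order(markers, dist):
    """Exhaustively find the shortest order; returns (order, length, orders examined)."""
    best, best_len, examined = None, float("inf"), 0
    for perm in permutations(markers):
        if perm[0] > perm[-1]:
            continue  # a reversed order describes the same map, so count it once
        examined += 1
        length = map_length(perm, dist)
        if length < best_len:
            best, best_len = perm, length
    return best, best_len, examined

# Hypothetical marker positions (arbitrary map units) used to build a distance table.
positions = {"m1": 0, "m2": 10, "m3": 25, "m4": 30}
dist = {a: {b: abs(pa - pb) for b, pb in positions.items()}
        for a, pa in positions.items()}

order, length, examined = best_order(["m3", "m1", "m4", "m2"], dist)
print(order, length, examined)
```

For four markers the search examines 4!/2 = 12 distinct orders. The factorial growth of this count is why heuristic optimizers are needed for the hundreds of markers discussed in the abstract.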


2021 ◽  
pp. 1-10
Author(s):  
Yang Ma ◽  
Jingxia Zhao ◽  
Yun Du ◽  
Rui Wang ◽  
Xiaokun Ji ◽  
...  

Objective: The aim of the study was to investigate the mutation status of multiple driver genes by RT-qPCR and their significance in advanced lung adenocarcinoma using cytological specimens. Materials and Methods: 155 cytological specimens that had been diagnosed with lung adenocarcinoma in the Fourth Hospital of Hebei Medical University were selected from April to November 2019. The cytological specimens included serous cavity effusion and fine-needle aspiration biopsies. Among the cytological specimens, 108 cases were processed by the cell block method (CBM), and 47 cases were processed by the disposable membrane cell collector method (MCM) before DNA/RNA extraction. Ten driver genes (EGFR, ALK, ROS1, BRAF, KRAS, NRAS, HER2, RET, PIK3CA, and MET) were detected together in a single step by the amplification refractory mutation system and ABI 7500 RT-qPCR. Results: The purity of RNA (p = 0.005) and DNA (p = 0.001) extracted by the MCM was significantly higher than that extracted by the CBM. All 47 fresh cell specimens processed by the MCM succeeded in multigene detection, while 6 of the 108 specimens processed by the CBM failed. Among 149 specimens, the single-gene mutation rates of EGFR, ALK, ROS1, RET, HER2, MET, KRAS, NRAS, BRAF, and PIK3CA were 57.71%, 6.04%, 3.36%, 2.68%, 2.01%, 2.01%, 1.34%, 0.67%, 0% and 0%, respectively, and 6 cases included 2 coexisting mutations. We found that mutation status was correlated with gender (p = 0.047), but not with age (p = 0.141) or smoking status (p = 0.083). We found that the EGFR mutation status was correlated with gender (p = 0.003), age (p = 0.015) and smoking habits (p = 0.007), and ALK mutation status was correlated with age (p = 0.002). Conclusion: Compared with the CBM, the MCM can improve the efficiency of DNA/RNA extraction and PCR amplification by removing impurities and enriching tumor cells. We speculate that the successful detection rate of fresh cytological specimens was higher than that of paraffin-embedded specimens. EGFR, ALK, and ROS1 mutations were the main driver mutations in patients with advanced lung adenocarcinoma. We speculate that EGFR and ALK are each more prone to concomitant mutations. Targeted therapies for patients with coexisting mutations need further study.

