WITER: A powerful method for the estimation of cancer-driver genes using a weighted iterative regression accurately modelling background mutation rate

2018 ◽  
Author(s):  
Lin Jiang ◽  
Jingjing Zheng ◽  
Johnny Sheung Him Kwan ◽  
Sheng Dai ◽  
Cong Li ◽  
...  

Abstract Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Because the distribution of background mutations is difficult to model, existing statistical methods are often underpowered to discriminate driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER), to detect cancer-driver genes showing an excess of somatic mutations. By solving the problem of inaccurately modeling background mutations, this approach works even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in all tested cancers. Applying this approach, we estimated 178 driver genes in 26 different cancer types. In silico validation confirmed 90.5% of predicted genes as likely known drivers and 7 genes unique to individual cancers as likely new drivers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects, rescuing more genes missed by alternative tools.

2019 ◽  
Vol 47 (16) ◽  
pp. e96-e96 ◽  
Author(s):  
Lin Jiang ◽  
Jingjing Zheng ◽  
Johnny S H Kwan ◽  
Sheng Dai ◽  
Cong Li ◽  
...  

Abstract Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Because the distribution of background mutation counts is difficult to model, existing statistical methods are often underpowered to discriminate cancer-driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER, http://grass.cgs.hku.hk/limx/witer or KGGSeq, http://grass.cgs.hku.hk/limx/kggseq/), to detect cancer-driver genes showing an excess of somatic mutations. By fitting the distribution of background mutation counts properly, this approach works well even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in most tested cancers. Applying this approach, we estimated 229 driver genes in 26 different types of cancers. In silico validation confirmed 78% of predicted genes as likely known drivers and many other genes as very likely new drivers for the corresponding cancers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects and the rescue of more genes missed by alternative tools in moderate or small samples.
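To make the term "zero-truncated negative binomial" concrete, the following is a minimal sketch of that distribution only, not of WITER itself: it omits the regression on gene covariates, the weighting, and the iteration. The function names and the NB2 mean/dispersion parameterization (`mu`, `alpha`) are assumptions chosen for illustration and do not come from the paper or from KGGSeq.

```python
import math


def nb_log_pmf(y: int, mu: float, alpha: float) -> float:
    """Log-probability of count y under a negative binomial with mean mu and dispersion alpha."""
    r = 1.0 / alpha  # "size" parameter; variance = mu + alpha * mu**2
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def ztnb_pmf(y: int, mu: float, alpha: float) -> float:
    """P(Y = y | Y > 0): the NB pmf renormalised after removing the zero class."""
    if y < 1:
        return 0.0
    p_zero = math.exp(nb_log_pmf(0, mu, alpha))
    return math.exp(nb_log_pmf(y, mu, alpha)) / (1.0 - p_zero)


def ztnb_upper_tail(y_obs: int, mu: float, alpha: float) -> float:
    """P(Y >= y_obs | Y > 0), computed as 1 minus the lower cumulative sum."""
    lower = sum(ztnb_pmf(y, mu, alpha) for y in range(1, y_obs))
    return max(0.0, 1.0 - lower)
```

Under these assumptions, a call such as `ztnb_upper_tail(12, mu=2.0, alpha=0.5)` gives the probability of seeing 12 or more mutations in a gene whose background expectation is described by `mu` and `alpha`; a small value would correspond to the "excess of somatic mutations" the abstract refers to.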


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Xiaobao Dong ◽  
Dandan Huang ◽  
Xianfu Yi ◽  
Shijie Zhang ◽  
Zhao Wang ◽  
...  

Abstract Mutation-specific effects of cancer driver genes influence drug responses and the success of clinical trials. We reasoned that these effects could unbalance the distribution of each mutation across different cancer types; as a result, the cancer preference can be used to distinguish the effects of the causal mutation. Here, we developed a network-based framework to systematically measure cancer diversity for each driver mutation. We found that half of the driver genes harbor cancer type-specific and pancancer mutations simultaneously, suggesting pervasive functional heterogeneity among mutations, even within the same driver gene. We further demonstrated that the specificity of the mutations could influence patient drug responses. Moreover, we observed that diversity was generally increased in advanced tumors. Finally, we scanned for potentially novel cancer driver genes based on the diversity spectrum. Diversity spectrum analysis provides a new approach to define driver mutations and optimize off-label clinical trials.


2018 ◽  
Author(s):  
Paul Ashford ◽  
Camilla S.M. Pang ◽  
Aurelio A. Moya-García ◽  
Tolulope Adeyelu ◽  
Christine A. Orengo

Tumour sequencing identifies highly recurrent point mutations in cancer driver genes, but rare functional mutations are hard to distinguish from large numbers of passengers. We developed a novel computational platform applying a multi-modal approach to filter out passengers and more robustly identify putative driver genes. The primary filter identifies enrichment of cancer mutations in CATH functional families (CATH-FunFams) – structurally and functionally coherent sets of evolutionarily related domains. Using structural representatives from CATH-FunFams, we subsequently seek enrichment of mutations in 3D and show that these mutation clusters have a very significant tendency to lie close to known functional sites or conserved sites predicted using CATH-FunFams. Our third filter identifies enrichment of putative driver genes in functionally coherent protein network modules confirmed by literature analysis to be cancer associated. Our approach is complementary to other domain enrichment approaches exploiting Pfam families, but benefits from more functionally coherent groupings of domains. Using a set of mutations from 22 cancers, we detect 151 putative cancer drivers, of which 79 are not listed in cancer resources and include the recently validated cancer genes EPHA7, DCC netrin-1 receptor and zinc-finger protein ZNF479.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Paul Ashford ◽  
Camilla S. M. Pang ◽  
Aurelio A. Moya-García ◽  
Tolulope Adeyelu ◽  
Christine A. Orengo

2002 ◽  
Vol 32 (1) ◽  
pp. 247-265 ◽  
Author(s):  
Paul D. Allison ◽  
Richard P. Waterman

This paper demonstrates that the conditional negative binomial model for panel data, proposed by Hausman, Hall, and Griliches (1984), is not a true fixed-effects method. This method—which has been implemented in both Stata and LIMDEP—does not in fact control for all stable covariates. Three alternative methods are explored. A negative multinomial model yields the same estimator as the conditional Poisson estimator and hence does not provide any additional leverage for dealing with over-dispersion. On the other hand, a simulation study yields good results from applying an unconditional negative binomial regression estimator with dummy variables to represent the fixed effects. There is no evidence for any incidental parameters bias in the coefficients, and downward bias in the standard error estimates can be easily and effectively corrected using the deviance statistic. Finally, an approximate conditional method is found to perform at about the same level as the unconditional estimator.
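The abstract does not spell out the deviance-based correction, so the following is only a sketch of one common form of such an adjustment: inflating each standard error by the square root of the deviance divided by the residual degrees of freedom. The function names, the NB2 deviance parameterization with dispersion `alpha`, and the inputs are illustrative assumptions rather than the authors' code.

```python
import math


def nb2_deviance(y, mu, alpha):
    """Deviance of a negative binomial (NB2) fit with fitted means mu and dispersion alpha."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y * log(y / mu) is taken as 0 when y == 0 (its limit as y -> 0)
        first = yi * math.log(yi / mi) if yi > 0 else 0.0
        second = (yi + 1.0 / alpha) * math.log((1.0 + alpha * yi) / (1.0 + alpha * mi))
        total += 2.0 * (first - second)
    return total


def deviance_scaled_se(se, deviance, n_obs, n_params):
    """Inflate standard errors by sqrt(deviance / residual degrees of freedom)."""
    scale = math.sqrt(deviance / (n_obs - n_params))
    return [s * scale for s in se]
```

In this sketch, `n_params` would count the dummy variables for the fixed effects as well as the substantive coefficients, since the unconditional estimator fits all of them.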


2015 ◽  
Author(s):  
Chengliang Dong ◽  
Hui Yang ◽  
Zeyu He ◽  
Xiaoming Liu ◽  
Kai Wang

All cancers arise as a result of the acquisition of somatic mutations that drive the disease progression. A number of computational tools have been developed to identify driver genes for a specific cancer from a group of cancer samples. However, it remains a challenge to identify driver mutations/genes for an individual patient and design drug therapies. We developed iCAGES, a novel statistical framework to rapidly analyze patient-specific cancer genomic data, prioritize personalized cancer driver events and predict personalized therapies. iCAGES includes three consecutive layers: the first layer integrates contributions from coding, non-coding and structural variations to infer driver variants. For coding mutations, we developed a radial support vector machine using manually curated mutations to predict their driver potential. The second layer identifies driver genes, by using information from the first layer and integrating prior biological knowledge on gene-gene and gene-phenotype networks. The third layer prioritizes personalized drug treatment, by classifying potential driver genes into different categories and querying drug-gene databases. Compared to currently available tools, iCAGES achieves better performance by correctly classifying point coding driver mutations (AUC=0.97, 95% CI: 0.97-0.97, significantly better than the second best tool with P=0.01) and genes (AUC=0.93, 95% CI: 0.93-0.94, significantly better than MutSigCV with P<1×10⁻¹⁵). We also illustrated two examples where iCAGES correctly nominated two targeted drugs for two advanced cancer patients with exceptional response, based on their somatic mutation profiles. iCAGES leverages personal genomic information and prior biological knowledge, effectively identifies cancer driver genes and predicts treatment strategies. iCAGES is available at http://icages.usc.edu.


2019 ◽  
Author(s):  
Sara Althubaiti ◽  
Andreas Karwath ◽  
Ashraf Dallol ◽  
Adeeb Noor ◽  
Shadi Salem Alkhayyat ◽  
...  

Abstract Identifying and distinguishing cancer driver genes among thousands of candidate mutations remains a major challenge. Accurate identification of driver genes and driver mutations is critical for advancing cancer research and personalizing treatment based on accurate stratification of patients. Due to inter-tumor genetic heterogeneity, many driver mutations within a gene occur at low frequencies, which makes it challenging to distinguish them from non-driver mutations. We have developed a novel method for identifying cancer driver genes. Our approach utilizes multiple complementary types of information, specifically cellular phenotypes, cellular locations, functions, and whole body physiological phenotypes as features. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their roles in different types of cancer. In addition to confirming known driver genes, we identify several novel candidate driver genes. We demonstrate the utility of our method by validating its predictions in nasopharyngeal cancer and colorectal cancer using whole exome and whole genome sequencing.


2019 ◽  
Vol 3 (1) ◽  
pp. 91-104
Author(s):  
Diva Arum Mustika ◽  
Rani Nooraeni ◽  
Indonesian Journal of Statistics and Its Applications IJSA

Diphtheria is an infectious disease caused by the bacterium Corynebacterium diphtheriae. Indonesia has the most diphtheria cases in Southeast Asia and ranks third in the world. In 2016, diphtheria cases increased by 65 percent and were declared an Extraordinary Event (KLB) in Indonesia, even though the number of cases had decreased from 2013 to 2015. The province with the highest number of diphtheria cases in Indonesia in 2016 was East Java. Diphtheria is concentrated in, and spreads through, certain districts/cities in East Java Province, which indicates spatial effects in its spread. Because the data on the number of diphtheria cases were overdispersed and showed signs of spatial effects, the main method used in this study was Geographically Weighted Negative Binomial Regression (GWNBR). This method is compared with two alternatives, Poisson regression and negative binomial regression, to find the best model. Based on the AIC value of each model, the best method for modeling the number of diphtheria cases is GWNBR. The GWNBR results show that there is indeed a spatial influence on the number of diphtheria cases and risk factors in East Java Province in 2016. The percentage of DPT-HB3/DPT-HB-Hib3 immunization coverage is not significant in all observation areas, while the percentage of drug and vaccine availability is significant across the entire observation area.
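The AIC-based model choice described above can be illustrated with a small sketch. It compares only intercept-only Poisson and negative binomial fits and does not implement the geographic weighting of GWNBR. The counts are made up for illustration and are not data from the study, and the grid search over the dispersion `alpha` is a crude stand-in for a proper maximum-likelihood optimizer.

```python
import math


def poisson_loglik(y, mu):
    """Log-likelihood of counts y under a Poisson model with common mean mu."""
    return sum(yi * math.log(mu) - mu - math.lgamma(yi + 1) for yi in y)


def nb_loglik(y, mu, alpha):
    """Log-likelihood under a negative binomial with mean mu and dispersion alpha."""
    r = 1.0 / alpha
    return sum(
        math.lgamma(yi + r) - math.lgamma(r) - math.lgamma(yi + 1)
        + r * math.log(r / (r + mu)) + yi * math.log(mu / (r + mu))
        for yi in y
    )


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * loglik


# Made-up, overdispersed counts for illustration only (not data from the study).
counts = [0, 0, 1, 0, 2, 0, 15, 0, 3, 0, 0, 22, 1, 0, 7, 0]
mu_hat = sum(counts) / len(counts)  # intercept-only mean estimate for both models

# Crude grid search for the NB dispersion; a real fit would use a proper optimiser.
alpha_hat = max((0.05 * k for k in range(1, 401)),
                key=lambda a: nb_loglik(counts, mu_hat, a))

aic_poisson = aic(poisson_loglik(counts, mu_hat), n_params=1)
aic_nb = aic(nb_loglik(counts, mu_hat, alpha_hat), n_params=2)
best = min([("Poisson", aic_poisson), ("Negative binomial", aic_nb)],
           key=lambda pair: pair[1])
print(best[0])
```

The negative binomial model pays an AIC penalty for its extra dispersion parameter, so it is preferred only when the gain in log-likelihood outweighs that penalty, which is the situation overdispersed counts tend to produce.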


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e13144-e13144
Author(s):  
Elisa Frullanti ◽  
Maria Palmieri ◽  
Margherita Baldassarri ◽  
Francesca Fava ◽  
Alessandra Fabbiani ◽  
...  

e13144 Background: More than 50% of solid cancers sooner or later escape control of standard treatments. Detection and analysis of cell free circulating DNA (cfDNA) now offer the possibility to detect key mutations of cancer driver genes which may play a major role in the therapy escaping mechanism. We sought to identify clones of solid tumors escaping standard treatments in order to assess personalized treatment at PD. Methods: A cohort of patients with 10 different solid tumors progressing after standard therapy were selected. CfDNA analysis was performed using PAXgene blood ccfDNA tubes (QIAGEN), MagMAX cell-free total nucleic acid isolation kit, and ION PROTON platform (ThermoFisher Scientific). Results: Next generation sequencing analysis of 52 cancer-driver genes of cfDNA samples of 39 patients allowed for picking up clones plausibly involved in the PD mechanism in 60% of cases. A mean of 1.3 mutated genes (range 1-3) for each tumor was found. Point mutations in TP53, PIK3CA, and CNV in FGFR3 were the most commonly observed, with a rate of 41%, 16%, and 13%, respectively. Increased copy number variations of FGF receptors were identified in patients with non-small cell lung, pancreatic, and gastric cancer, and cholangiocarcinoma. Other clones had mutations in ESR1 (breast), CTNNB1 (uterus), KRAS and CCND2 (pancreas), EGFR and BRAF (lung). Interestingly, retinoblastomas resistant to Melphalan showed expanding mutated clones in PTEN or SMAD4. Increased levels of cfDNA were observed in the plasma of all patients. Conclusions: The results presented here show that irrespective of the primary tumor mutational burden and subsequent complex clonal evolution, a simplified mutational load is present at PD. One or few “sniper” clones drive progression and the molecular profile has a weak correlation with the primary tumor. 
Single driver mutations in TP53 remain the main target of a not yet developed specific therapy in most tumors, such as breast, ovarian, uterine, lung and gastric cancers and glioblastoma. Among the actionable mutations, PIK3CA mutations were found not only in breast cancers but also in uterine carcinoma, Sezary syndrome and glioblastoma, highlighting the need for specific trials in these tumors.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Sara Althubaiti ◽  
Andreas Karwath ◽  
Ashraf Dallol ◽  
Adeeb Noor ◽  
Shadi Salem Alkhayyat ◽  
...  

Abstract Identifying and distinguishing cancer driver genes among thousands of candidate mutations remains a major challenge. Accurate identification of driver genes and driver mutations is critical for advancing cancer research and personalizing treatment based on accurate stratification of patients. Due to inter-tumor genetic heterogeneity, many driver mutations within a gene occur at low frequencies, which makes it challenging to distinguish them from non-driver mutations. We have developed a novel method for identifying cancer driver genes. Our approach utilizes multiple complementary types of information, specifically cellular phenotypes, cellular locations, functions, and whole body physiological phenotypes as features. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their roles in different types of cancer. In addition to confirming known driver genes, we identify several novel candidate driver genes. We demonstrate the utility of our method by validating its predictions in nasopharyngeal cancer and colorectal cancer using whole exome and whole genome sequencing.

