Optimized data representation and convolutional neural network model for predicting tumor purity

2019 ◽  
Author(s):  
Gerald J. Sun ◽  
David F. Jenkins ◽  
Pablo E. Cingolani ◽  
Jonathan R. Dry ◽  
Zhongwu Lai

Abstract: Here we present a machine learning model, Deep Purity (DePuty), that leverages convolutional neural networks to accurately predict tumor purity from next-generation sequencing data from clinical samples without matched normals. As input, our model utilizes SNP-based copy number and minor allele frequency data formulated as a scatterplot image. With a representation matching that used by expert human annotators, we best an existing algorithm using only ~100 manually curated samples. Our simple, data-efficient approach can serve as a straightforward alternative to traditional, more complex statistical methods for building performant purity prediction models that enable downstream bioinformatic analysis of tumor variants and absolute copy number alterations relevant to cancer genomics.
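The idea of formulating per-SNP data as a scatterplot image can be illustrated with a minimal sketch. This is not the DePuty implementation: the function name `rasterize_scatter`, the grid size, the axis ranges, and the log scaling are all assumptions chosen for illustration. It bins (log2 copy ratio, minor allele frequency) pairs into a fixed grid of the kind a convolutional network could take as input.

```python
import math

def rasterize_scatter(log2_ratios, mafs, bins=32,
                      ratio_range=(-2.0, 2.0), maf_range=(0.0, 0.5)):
    """Bin per-SNP (log2 copy ratio, minor allele frequency) points into a
    bins x bins grid. Rows index allele frequency, columns index copy ratio.
    All defaults are hypothetical, not taken from the paper."""
    def to_bin(value, lo, hi):
        # Clamp out-of-range points into the edge bins instead of dropping them.
        value = min(max(value, lo), hi)
        idx = int((value - lo) / (hi - lo) * bins)
        return min(idx, bins - 1)

    grid = [[0] * bins for _ in range(bins)]
    for ratio, maf in zip(log2_ratios, mafs):
        grid[to_bin(maf, *maf_range)][to_bin(ratio, *ratio_range)] += 1

    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * bins for _ in range(bins)]
    # Log-scale the counts so dense clusters do not wash out sparse ones,
    # then normalize intensities to the range [0, 1].
    scale = math.log1p(peak)
    return [[math.log1p(count) / scale for count in row] for row in grid]

# Illustrative values only, not real sample data.
image = rasterize_scatter([0.0, 0.0, 1.0], [0.5, 0.5, 0.25], bins=4)
```

A fixed grid keeps every sample the same shape regardless of how many SNPs it contains, which is what a convolutional model needs.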

2017 ◽  
Vol 35 (15_suppl) ◽  
pp. e23108-e23108
Author(s):  
Stephen Lyle ◽  
Julie Y Tse ◽  
Ruobai Sun ◽  
Xiu Huang ◽  
Meaghan Russell ◽  
...  

e23108 Background: Immunotherapy is rapidly emerging as one of the most promising therapeutic options in clinical oncology. However, not all patients will respond to immune-oncology drugs and diagnostic assays are urgently needed to identify biomarkers that predict response to these therapies. We developed an assay that utilizes next generation sequencing data to simultaneously determine different types of somatic changes in tumors. The assay was designed, in part, to facilitate identification of cancer patients most likely to respond to immunotherapies by detecting 1) CD274 (PD-L1) copy number alterations, 2) viral infections (HPV16/18 and EBV), 3) microsatellite instability (MSI), and 4) tumor mutation burden. Methods: We developed a 435-gene panel assay (CancerPlex) with the goal of identifying all the genomic changes that inform treatment decisions with the highest possible accuracy. We thoroughly evaluated the performance of the assay using reference samples that represent all classes of genetic variation, including SNPs, indels, copy number changes, rearrangements, HPV/EBV infection and MSI. FFPE clinical samples were also evaluated to assess the ability of the assay to detect genetic variation in complex, heterogeneous tumors. Results: The assay has excellent performance, with a mean sensitivity of 99.4% and specificity of 99.9% for detecting somatic mutations with an allele fraction as low as 10%. The assay identified the MSI status of colorectal tumors with the same sensitivity as immunohistochemistry but with greater sensitivity than PCR. We also showed that calculating tumor mutation burden using the 435-gene panel predicts response to pembrolizumab as effectively as using whole exome sequencing. Among 892 patients across all tumor types, 6.8% were identified as candidates for immunotherapy based upon high tumor mutation burden and/or MSI status. 
Conclusions: The capacity of the 435-gene panel to determine all of the critical genetic changes, tumor mutation burden, MSI status, CD274 (PD-L1) CNVs, and HPV/EBV status has important ramifications for patient treatment strategies, including identification of patients who are more likely to benefit from immune checkpoint inhibitor therapies.
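The sensitivity and specificity figures reported above follow the standard definitions, which can be sketched as below. The counts in the usage line are illustrative placeholders, not data from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true variants that are detected.
    specificity = TN / (TN + FP): fraction of non-variant sites correctly
    called negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference call")
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative counts only (assumed, not from the abstract).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
```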


2020 ◽  
Author(s):  
Getiria Onsongo ◽  
Ham Ching Lam ◽  
Matthew Bower ◽  
Bharat Thyagarajan

Abstract. Objective: Detection of small copy number variations (CNVs) in clinically relevant genes is routinely being used to aid diagnosis. We recently developed a tool, CNV-RF, capable of detecting small clinically relevant CNVs. CNV-RF was designed for small gene panels and did not scale well to large gene panels. On large gene panels, CNV-RF routinely failed due to memory limitations. When successful, it took about 2 days to complete a single analysis, making it impractical for routinely analyzing large gene panels. We need a reliable tool capable of detecting CNVs in the clinic that scales well to large gene panels. Results: We have developed Hadoop-CNV-RF, a scalable implementation of CNV-RF. Hadoop-CNV-RF is a freely available tool capable of rapidly analyzing large gene panels. It takes advantage of Hadoop, a big data framework developed to analyze large amounts of data. Preliminary results show it reduces analysis time from about 2 days to less than 4 hours and can seamlessly scale to large gene panels. Hadoop-CNV-RF has been clinically validated for targeted capture data and is currently being used in a CLIA molecular diagnostics laboratory. The tool and its usage instructions are publicly available at https://github.com/getiria-onsongo/hadoop-cnvrf-public


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Kyoungyul Lee ◽  
Hyun Jeong Kim ◽  
Min Hye Jang ◽  
Sejoon Lee ◽  
Soomin Ahn ◽  
...  

Abstract: Chromosomal instability (CIN) is known to be associated with prognosis and treatment response in breast cancer. This study was conducted to determine whether copy number gain of centromere 17 (CEP17) reflects CIN, and to evaluate the prognostic and predictive value of CIN in breast cancer. CIN status was determined by summing copy number gains of four centromeric probes (CEP1, CEP8, CEP11, and CEP16) based on fluorescence in situ hybridization, and CIN scores were calculated using next-generation sequencing data. High CIN was associated with adverse clinicopathological parameters of breast cancer. Among them, positive HER2 status, high Ki-67 index, and CEP17 copy number gain were found to be independent predictors of high CIN. High CIN was associated with poor clinical outcome of the patients in the whole group, as well as in luminal/HER2-negative and HER2-positive subtypes. CEP17 copy number was significantly higher in the high-CIN-score group than in the low-CIN-score group. A positive linear correlation between the mean CEP17 copy number and the CIN score was found. In conclusion, CEP17 copy number was confirmed as a useful predictor for CIN in breast cancer, and high CIN was revealed as an indicator of poor prognosis in breast cancer.


2012 ◽  
Vol 40 (9) ◽  
pp. e69-e69 ◽  
Author(s):  
Günter Klambauer ◽  
Karin Schwarzbauer ◽  
Andreas Mayr ◽  
Djork-Arné Clevert ◽  
Andreas Mitterecker ◽  
...  

2016 ◽  
Vol 54 (4) ◽  
pp. 980-987 ◽  
Author(s):  
Sarah Mollerup ◽  
Jens Friis-Nielsen ◽  
Lasse Vinner ◽  
Thomas Arn Hansen ◽  
Stine Raith Richter ◽  
...  

Propionibacterium acnes is the most abundant bacterium on human skin, particularly in sebaceous areas. P. acnes is suggested to be an opportunistic pathogen involved in the development of diverse medical conditions but is also a proven contaminant of human clinical samples and surgical wounds. Its significance as a pathogen is consequently a matter of debate. In the present study, we investigated the presence of P. acnes DNA in 250 next-generation sequencing data sets generated from 180 samples of 20 different sample types, mostly of cancerous origin. The samples were subjected to either microbial enrichment, involving nuclease treatment to reduce the amount of host nucleic acids, or shotgun sequencing. We detected high proportions of P. acnes DNA in enriched samples, particularly skin tissue-derived and other tissue samples, with the levels being higher in enriched samples than in shotgun-sequenced samples. P. acnes reads were detected in most samples analyzed, though the proportions in most shotgun-sequenced samples were low. Our results show that P. acnes can be detected in practically all sample types when molecular methods, such as next-generation sequencing, are employed. The possibility of contamination from the patient or other sources, including laboratory reagents or environment, should therefore always be considered carefully when P. acnes is detected in clinical samples. We advocate that detection of P. acnes always be accompanied by experiments validating the association between this bacterium and any clinical condition.

