Achieving robust somatic mutation detection with deep learning models derived from reference data sets of a cancer sample

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Sayed Mohammad Ebrahim Sahraeian ◽  
Li Tai Fang ◽  
Konstantinos Karagiannis ◽  
Malcolm Moos ◽  
Sean Smith ◽  
...  

Abstract Background: Accurate detection of somatic mutations is challenging but critical in understanding cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated performance advantages on in silico data. Results: In this study, we use the first comprehensive and well-characterized somatic reference data sets from the SEQC2 consortium to investigate best practices for using a deep learning framework in cancer mutation detection. Using the high-confidence somatic mutations established for a cancer cell line by the consortium, we identify the best strategy for building robust models on multiple data sets derived from samples representing real scenarios; for example, a model trained on a combination of real and spike-in mutations had the highest average performance. Conclusions: The strategy identified in our study achieved high robustness across multiple sequencing technologies for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages, with significant superiority over conventional detection approaches in general, as well as in challenging situations such as low coverage, low variant allele frequency, DNA damage, and difficult genomic regions.

2019 ◽  
Author(s):  
Sayed Mohammad Ebrahim Sahraeian ◽  
Li Tai Fang ◽  
Marghoob Mohiyuddin ◽  
Huixiao Hong ◽  
Wenming Xiao

Abstract Accurate detection of somatic mutations is challenging but critical to the understanding of cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated performance advantages on in silico data. In this study, we used the first comprehensive and well-characterized somatic reference samples from the SEQC-II consortium to investigate best practices for utilizing a deep learning framework in cancer mutation detection. Using the high-confidence somatic mutations established for these reference samples by the consortium, we identified strategies for building robust models on multiple datasets derived from samples representing real scenarios. The proposed strategies achieved high robustness across multiple sequencing technologies (WGS, WES, and AmpliSeq targeted sequencing) for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages (ranging from 10× to 2000×). NeuSomatic significantly outperformed conventional detection approaches in general, as well as in challenging situations such as low coverage, low mutation frequency, DNA damage, and difficult genomic regions.


2021 ◽  
Author(s):  
Sylvain Schmitt ◽  
Thibault Leroy ◽  
Myriam Heuertz ◽  
Niklas Tysklind

Mutation, the source of genetic diversity, is the raw material of evolution; however, it remains an understudied process in plants. Using simulations, we demonstrate that generic variant callers, commonly used to detect mutations in plants, are outperformed by methods developed for cancer research. Reanalysis of published data identified up to 7x more somatic mutations than initially reported, advocating the use of cancer research callers to boost mutation research in plants.


2012 ◽  
Vol 132 (12) ◽  
pp. 2858-2866 ◽  
Author(s):  
Jinyin Zhao ◽  
Feifei Xie ◽  
Wei Zhong ◽  
Weili Wu ◽  
Shoufang Qu ◽  
...  

2020 ◽  
Author(s):  
Reenu Anne Joy ◽  
Sukrishna Kamalasanan Thelakkattusserry ◽  
Narendranath Vikkath ◽  
Renjitha Bhaskaran ◽  
Damodaran Vasudevan ◽  
...  

Abstract Background: High resolution melting curve analysis is a cost-effective rapid screening method for detection of somatic gene mutation. The performance characteristics of this technique have been explored previously; however, analytical parameters such as the limit of detection of mutant allele fraction and the total concentration of DNA have not been addressed. The current study focuses on comparing the mutation detection efficiency of High-Resolution Melt Analysis (HRM) with Sanger sequencing in somatic mutations of the EGFR gene in non-small cell lung cancer. Methods: The minor allele fraction of somatic mutations was titrated against total DNA concentration using Sanger sequencing and HRM to determine the limit of detection. The mutant and wildtype allele fractions were validated by multiplex allele-specific real-time PCR. Somatic mutation detection efficiency, for exons 19 & 21 of the EGFR gene, was compared in 116 formalin fixed paraffin embedded tumor tissues, after screening 275 tumor tissues by Sanger sequencing. Results: The limit of detection of minor allele fraction of exon 19 mutation was 1% with sequencing and 0.25% with HRM, whereas for exon 21 mutation, 0.25% MAF was detected using both methods. Multiplex allele-specific real-time PCR revealed that the wildtype DNA did not impede the amplification of the mutant allele in mixed DNA assays. All mutation-positive samples detected by Sanger sequencing were also detected by HRM. About 28% of cases in exon 19 and 40% in exon 21, detected as mutated in HRM, were not detected by sequencing. Overall, sensitivity and specificity of HRM were found to be 100% and 67% respectively, and the negative predictive value was 100%, while the positive predictive value was 80%. Conclusion: The comparative series study suggests that HRM is a modest initial screening test for somatic mutation detection of EGFR, which must further be confirmed by Sanger sequencing. With the modification of the annealing temperature of the initial PCR, the limit of detection of Sanger sequencing can be improved.


2015 ◽  
pp. 321-341
Author(s):  
Catherine E. Cottrell ◽  
Andrew J. Bredemeyer ◽  
Hussam Al-Kateb

BMC Cancer ◽  
2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Reenu Anne Joy ◽  
Sukrishna Kamalasanan Thelakkattusserry ◽  
Narendranath Vikkath ◽  
Renjitha Bhaskaran ◽  
Sajitha Krishnan ◽  
...  

Abstract Background: High resolution melting curve analysis is a cost-effective rapid screening method for detection of somatic gene mutation. The performance characteristics of this technique have been explored previously; however, analytical parameters such as the limit of detection of mutant allele fraction and the total concentration of DNA have not been addressed. The current study focuses on comparing the mutation detection efficiency of High-Resolution Melt Analysis (HRM) with Sanger sequencing in somatic mutations of the EGFR gene in non-small cell lung cancer. Methods: The minor allele fraction of somatic mutations was titrated against total DNA concentration using Sanger sequencing and HRM to determine the limit of detection. The mutant and wildtype allele fractions were validated by multiplex allele-specific real-time PCR. Somatic mutation detection efficiency, for exons 19 & 21 of the EGFR gene, was compared in 116 formalin fixed paraffin embedded tumor tissues, after screening 275 tumor tissues by Sanger sequencing. Results: The limit of detection of minor allele fraction of exon 19 mutation was 1% with sequencing and 0.25% with HRM, whereas for exon 21 mutation, 0.25% MAF was detected using both methods. Multiplex allele-specific real-time PCR revealed that the wildtype DNA did not impede the amplification of the mutant allele in mixed DNA assays. All mutation-positive samples detected by Sanger sequencing were also detected by HRM. About 28% of cases in exon 19 and 40% in exon 21, detected as mutated in HRM, were not detected by sequencing. Overall, sensitivity and specificity of HRM were found to be 100% and 67% respectively, and the negative predictive value was 100%, while the positive predictive value was 80%. Conclusion: The comparative series study suggests that HRM is a modest initial screening test for somatic mutation detection of EGFR, which must further be confirmed by Sanger sequencing. With the modification of the annealing temperature of the initial PCR, the limit of detection of Sanger sequencing can be improved.
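The abstract above reports sensitivity, specificity, and predictive values for HRM measured against Sanger sequencing. As a minimal sketch of how those four figures relate to a 2×2 confusion table, the Python below computes them from raw counts. The counts are hypothetical, chosen only because they reproduce the reported percentages; they are not the study's data, and the names `ConfusionCounts` and `diagnostic_metrics` are illustrative. It assumes Sanger sequencing is the reference method and HRM is the test under evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of test calls (e.g. HRM) against a reference method (e.g. Sanger)."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    tn: int  # test negative, reference negative
    fn: int  # test negative, reference positive


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four standard diagnostic accuracy measures as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of reference positives found
        "specificity": c.tn / (c.tn + c.fp),  # share of reference negatives cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of test positives confirmed
        "npv": c.tn / (c.tn + c.fn),          # share of test negatives confirmed
    }


# Hypothetical counts (NOT from the study), picked to match the reported rates.
counts = ConfusionCounts(tp=80, fp=20, tn=40, fn=0)
for name, value in diagnostic_metrics(counts).items():
    print(f"{name}: {value:.0%}")
```

With zero false negatives, sensitivity and negative predictive value are both 100%, which is why a test with this profile can serve as an initial screen even though its positive calls still need confirmation.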


2021 ◽  
Vol 23 (1) ◽  
pp. 29-37
Author(s):  
Scott C. Smith ◽  
Midhat S. Farooqi ◽  
Melissa A. Gener ◽  
Kevin Ginn ◽  
Julie M. Joyce ◽  
...  
