A simple method to estimate the in-house limit of detection for genetic mutations with low allele frequencies in whole-exome sequencing analysis by next-generation sequencing

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Takumi Miura ◽  
Satoshi Yasuda ◽  
Yoji Sato

Abstract Background: Next-generation sequencing (NGS) has profoundly changed the approach to genetic/genomic research. Particularly, the clinical utility of NGS in detecting mutations associated with disease risk has contributed to the development of effective therapeutic strategies. Recently, comprehensive analysis of somatic genetic mutations by NGS has also been used as a new approach for controlling the quality of cell substrates for manufacturing biopharmaceuticals. However, the quality evaluation of cell substrates by NGS largely depends on the limit of detection (LOD) for rare somatic mutations. The purpose of this study was to develop a simple method for evaluating the ability of whole-exome sequencing (WES) by NGS to detect mutations with low allele frequency. To estimate the LOD of WES for low-frequency somatic mutations, we repeatedly and independently performed WES of a reference genomic DNA using the same NGS platform and assay design. LOD was defined as the allele frequency with a relative standard deviation (RSD) value of 30% and was estimated by a moving average curve of the relation between RSD and allele frequency. Results: Allele frequencies of 20 mutations in the reference material that had been pre-validated by droplet digital PCR (ddPCR) were obtained from 5, 15, 30, or 40 giga base pairs (Gbp) of sequencing data per run. There was a significant association between the allele frequencies measured by WES and those pre-validated by ddPCR, whose p-value decreased as the sequencing data size increased. By this method, the LOD of allele frequency in WES with sequencing data of 15 Gbp or more was estimated to be between 5 and 10%. Conclusions: For properly interpreting the WES data of somatic genetic mutations, it is necessary to have a cutoff threshold for low allele frequencies. The in-house LOD estimated by the simple method shown in this study provides a rationale for setting the cutoff.
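The LOD definition above (the allele frequency at which the RSD reaches 30%, read off a moving average curve) can be illustrated with a minimal Python sketch. The abstract does not give the window size, the smoothing variant, or how the crossing point is read from the curve, so the centered window, the linear interpolation, and all function names here are assumptions made for illustration rather than the authors' procedure.

```python
from statistics import mean, stdev


def relative_sd(values):
    """RSD (%) = 100 * sample SD / mean of replicate allele-frequency values.

    Needs at least two replicate measurements.
    """
    return 100.0 * stdev(values) / mean(values)


def moving_average(ys, window=3):
    """Centered moving average; the window shrinks at both ends (assumed variant)."""
    half = window // 2
    return [mean(ys[max(0, i - half): i + half + 1]) for i in range(len(ys))]


def estimate_lod(replicates, threshold=30.0, window=3):
    """Estimate the LOD from replicate WES runs.

    replicates: dict mapping a mutation id to its allele frequencies (%)
                measured in independent runs.
    Returns the allele frequency at which the smoothed RSD curve crosses
    the threshold, or None if the curve never crosses it.
    """
    # One (mean allele frequency, RSD) point per mutation, sorted by frequency.
    points = sorted((mean(v), relative_sd(v)) for v in replicates.values())
    afs = [af for af, _ in points]
    smooth = moving_average([rsd for _, rsd in points], window)

    # Scan from high to low allele frequency and report the first place
    # where the smoothed RSD rises above the threshold.
    for i in range(len(afs) - 1, 0, -1):
        low_rsd, high_rsd = smooth[i - 1], smooth[i]
        if high_rsd <= threshold < low_rsd:
            # Linear interpolation between neighbouring points (assumption).
            frac = (low_rsd - threshold) / (low_rsd - high_rsd)
            return afs[i - 1] + frac * (afs[i] - afs[i - 1])
    return None
```

Calling `estimate_lod` with the per-mutation allele frequencies from the repeated runs at one sequencing data size would give one LOD estimate for that data size; repeating it for each data size allows the estimates to be compared.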

2016 ◽  
Vol 59 (2) ◽  
pp. 54-58 ◽  
Author(s):  
Martin Beránek ◽  
Igor Sirák ◽  
Milan Vošmik ◽  
Jiří Petera ◽  
Monika Drastíková ◽  
...  

The aims of the study were: i) to compare circulating tumor DNA (ctDNA) yields obtained by different manual extraction procedures, ii) to evaluate the addition of various carrier molecules into the plasma to improve ctDNA extraction recovery, and iii) to use next generation sequencing (NGS) technology to analyze KRAS, BRAF, and NRAS somatic mutations in ctDNA from patients with metastatic colorectal cancer. Venous blood was obtained from patients who suffered from metastatic colorectal carcinoma. For plasma ctDNA extraction, the following carriers were tested: carrier RNA, polyadenylic acid, glycogen, linear acrylamide, yeast tRNA, salmon sperm DNA, and herring sperm DNA. Each extract was characterized by quantitative real-time PCR and next generation sequencing. The addition of polyadenylic acid had a significant positive effect on the amount of ctDNA eluted. The sequencing data revealed five cases of ctDNA mutated in KRAS and one patient with a BRAF mutation. An agreement of 86% was found between tumor tissues and ctDNA. Testing somatic mutations in ctDNA seems to be a promising tool to monitor dynamically changing genotypes of tumor cells circulating in the body. The optimized process of ctDNA extraction should help to obtain more reliable sequencing data in patients with metastatic colorectal cancer.


2017 ◽  
Author(s):  
Junho Kim ◽  
Dachan Kim ◽  
Jae Seok Lim ◽  
Ju Heon Maeng ◽  
Hyeonju Son ◽  
...  

ABSTRACT Accurate genome-wide detection of somatic mutations with low variant allele frequency (VAF, <1%) has proven difficult, for which generalized, scalable methods are lacking. Herein, we describe a new computational method, called RePlow, that we developed to detect low-VAF somatic mutations based on simple, library-level replicates for next-generation sequencing on any platform. Through joint analysis of replicates, RePlow is able to remove prevailing background errors in next-generation sequencing analysis, facilitating remarkable improvement in the detection accuracy for low-VAF somatic mutations (up to ∼99% reduction in false positives). The method was validated in independent cancer panel and brain tissue sequencing data. Our study suggests a new paradigm with which to exploit an overwhelming abundance of sequencing data for accurate variant detection.


2011 ◽  
Vol 12 (1) ◽  
Author(s):  
Su Yeon Kim ◽  
Kirk E Lohmueller ◽  
Anders Albrechtsen ◽  
Yingrui Li ◽  
Thorfinn Korneliussen ◽  
...  

2013 ◽  
Vol 22 (14) ◽  
pp. 3766-3779 ◽  
Author(s):  
Mathieu Gautier ◽  
Julien Foucaud ◽  
Karim Gharbi ◽  
Timothée Cézard ◽  
Maxime Galan ◽  
...  

2012 ◽  
Vol 13 (1) ◽  
pp. 8 ◽  
Author(s):  
Danny Challis ◽  
Jin Yu ◽  
Uday S Evani ◽  
Andrew R Jackson ◽  
Sameer Paithankar ◽  
...  

Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 945-945
Author(s):  
Caitlin O'Neill ◽  
Zarko Manojlovic ◽  
Ah-Reum Jeong ◽  
Yili Xu ◽  
Yuxin Jin ◽  
...  

Introduction: Idiopathic erythrocytosis (IE) is characterized by a persistently elevated hemoglobin, equivocal erythropoietin (EPO) levels, absence of Janus kinase 2 (JAK2) mutations suggestive of polycythemia vera (PV) and no secondary cause. One study used a targeted 21-gene next-generation sequencing panel and identified novel variants in known erythrocytosis-related genes as well as novel genes associated with the oxygen-sensing pathway. However, expanded sequencing of blood and matched tissue samples in a large ethnically diverse group of IE patients has not been performed. Methods: All patients signed informed consent to participate in an observational study approved by the Institutional Review Board; they provided blood and buccal mucosa samples at study entry and at 24-month follow-up. Patients were enrolled if JAK2 testing and a complete workup for secondary causes were negative. They were required to have hemoglobin levels greater than 16 g/dL on two occasions or greater than 15 g/dL if undergoing phlebotomy. Our initial sequencing of 20 IE patients was performed utilizing high-resolution whole-exome sequencing of circulating blood samples (disease) at a mean coverage of 390x and matched normal (buccal) samples at a mean coverage of 300x. To stratify samples by genetic ancestry, we performed a population stratification principal component analysis (PCA) and STRUCTURE using Ancestry Informative Markers derived from the 1K Genome Phase1_v3 Exome database. The primary in-silico analysis was performed on the baseline samples from treatment-naïve patients. The whole-exome data was generated in accordance with GATK's best practices, with the same filters applied as described by the Exome Aggregation Consortium. 
The additional downstream in-silico paired analysis was performed using the MutSig2.0 (Mutation Significance) algorithm to determine significant mutations and GISTIC (The Genomic Identification of Significant Targets in Cancer) to identify the significant copy number events, IPA (Ingenuity Pathway Analysis) to determine pathways along with other computational . Results: Median age at baseline was 52 years (range 35-71). Six patients (30%) were female and 14 (70%) were male. Median values and ranges for laboratory parameters at baseline were as follows: WBC 6.6 × 10⁹/L (5-9.7), Hgb 17 g/dL (15.5-19.8), Plt 218 × 10⁹/L (86-374), and EPO level 9.8 IU/L (2-14.3). Three patients had a personal history of malignancy, including 2 with lymphoma. Two patients had a family history of myeloid malignancy (chronic myeloid leukemia and PV). Our ancestry analysis of the initial 20 patients with IE identified 6 patients with high European percent ancestry (EUR), 1 patient with high Asian percent ancestry (EAS) and 13 patients with high percent Ad Mixed ancestry (AMR). In our cohort, 60% (12/20) of patients had also been diagnosed with a liver disorder (11 with fatty liver, 1 with cirrhosis), a proportion that was not significantly different across populations. We identified, on average, 42 non-silent somatic mutations (not present in the buccal samples) in whole blood across our cohort with no statistical difference (p=0.671) in mutation burden between ancestry groups or between patients with and without liver disease. Age, gender, and ethnicity were not associated with mutation burden. Utilizing the MutSig algorithm, we identified a novel candidate gene, CHAF1A, with a high mutation prevalence of 30% in patients with IE. Further analysis of the mutation landscape identified somatic non-silent mutations in 25 known oncogenes which were present in at least 10% of patients. Our mutation signatures in IE identified a significant association with failure of double-stranded DNA repair. 
Only one patient had a mutation in TET2. Further analysis of copy number indicated copy number loss in genes such as SETD3 and GSH associated with chromatin assembly, which may suggest alterations in chromatin assembly and changes in the epigenome. Our analysis also identified a high number of 9p and 13q gains in patients with IE. Conclusion: In this study, we utilized high-resolution next generation sequencing in association with comprehensive clinical annotation to determine potential molecular drivers of IE in a multi-ethnic population. We identified somatic mutations in a subset of patients which may represent clonal hematopoiesis. Long-term follow-up of outcomes in this cohort may clarify the significance of these mutations in the pathogenesis of IE. Disclosures: No relevant conflicts of interest to declare.


BioTechniques ◽  
2020 ◽  
Vol 69 (6) ◽  
pp. 420-426
Author(s):  
Natallia Kalinava ◽  
Abraham Apfel ◽  
Robert Cartmell ◽  
Sujaya Srinivasan ◽  
Ming-Shan Chien ◽  
...  

Although next-generation sequencing assays are routinely carried out using samples from cancer trials, the sequencing data are not always of the required quality. There is a need to evaluate the performance of tissue collection sites and provide feedback about the quality of next-generation sequencing data. This study used a modeling approach based on whole exome sequencing quality control (QC) metrics to evaluate the relative performance of sites participating in the Bristol Myers Squibb Immuno-Oncology clinical trials sample collection. We identified several sample swap events. Overall, most sites performed well and few showed poor performance. These findings can increase awareness of sample failure and improve the quality of samples.


2018 ◽  
Author(s):  
Jonas Meisner ◽  
Anders Albrechtsen

ABSTRACT We here present two methods for inferring population structure and admixture proportions in low-depth next-generation sequencing data. Inference of population structure is essential in both population genetics and association studies and is often performed using principal component analysis or clustering-based approaches. Next-generation sequencing methods provide large amounts of genetic data but are associated with statistical uncertainty, especially for low-depth sequencing data. Models can account for this uncertainty by working directly on genotype likelihoods of the unobserved genotypes. We propose a method for inferring population structure through principal component analysis in an iterative approach of estimating individual allele frequencies, where we demonstrate improved accuracy in samples with low and variable sequencing depth for both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.
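To make "working directly on genotype likelihoods" concrete, the sketch below computes a posterior expected genotype (dosage) from three genotype likelihoods and an allele frequency. It is not PCAngsd code: the function name, the Hardy-Weinberg prior, and the use of a single allele frequency per site are assumptions chosen for illustration, and the iterative re-estimation of individual allele frequencies described in the abstract is not reproduced here.

```python
def expected_dosage(genotype_likelihoods, allele_freq):
    """Posterior expected count (0 to 2) of the alternative allele.

    genotype_likelihoods: (L0, L1, L2), the likelihoods of the read data
        given 0, 1, or 2 copies of the alternative allele.
    allele_freq: frequency of the alternative allele, in [0, 1].
    Assumes a Hardy-Weinberg prior on the unobserved genotype.
    """
    f = allele_freq
    prior = ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)
    # Unnormalised posterior: likelihood times prior for each genotype.
    weights = [lik * p for lik, p in zip(genotype_likelihoods, prior)]
    total = sum(weights)
    if total == 0:
        raise ValueError("likelihoods and prior give zero total probability")
    return (weights[1] + 2 * weights[2]) / total
```

With uninformative likelihoods (for example, a site with no reads) the result falls back to twice the allele frequency, whereas a confidently called genotype dominates the prior. This is why likelihood-based dosages behave more gracefully than hard genotype calls at low depth.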

