Targeted genome fragmentation with CRISPR/Cas9 improves hybridization capture, reduces PCR bias, and enables efficient high-accuracy sequencing of small targets

2017 ◽  
Author(s):  
Daniela Nachmanson ◽  
Shenyi Lian ◽  
Elizabeth K. Schmidt ◽  
Michael J. Hipp ◽  
Kathryn T. Baker ◽  
...  

ABSTRACT Current next-generation sequencing techniques suffer from inefficient target enrichment and frequent errors. To address these issues, we have developed a targeted genome fragmentation approach based on CRISPR/Cas9 digestion. By designing all fragments to similar lengths, regions of interest can be size-selected prior to library preparation, increasing hybridization capture efficiency. Additionally, fragments of homogeneous length reduce PCR bias and maximize read usability. We combine this novel target enrichment approach with ultra-accurate Duplex Sequencing. The result, termed CRISPR-DS, is a robust targeted sequencing technique that overcomes the inherent challenges of small target enrichment and enables the detection of ultra-low frequency mutations with small DNA inputs.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Gundula Povysil ◽  
Monika Heinzl ◽  
Renato Salazar ◽  
Nicholas Stoler ◽  
Anton Nekrutenko ◽  
...  

Abstract Duplex sequencing is currently the most reliable method to identify ultra-low frequency DNA variants. It groups sequence reads derived from the same DNA molecule into families that carry information on the forward and reverse strand. However, only a small proportion of reads are assembled into duplex consensus sequences (DCS), and reads with potentially valuable information are discarded at different steps of the bioinformatics pipeline, especially reads without a family. We developed a bioinformatics toolset that analyses the tag and family composition in order to understand data loss and to implement modifications that maximize the data output for variant calling. Specifically, our tools show that tags contain polymerase chain reaction and sequencing errors that contribute to data loss and lower DCS yields. Our tools also identified chimeras, which likely reflect barcode collisions. Finally, we developed a tool that re-examines variant calls from raw reads and provides summary data that categorize the confidence level of a variant call in a tier-based system. With this tool, we can include reads without a family and check the reliability of the call, which substantially increases the sequencing depth for variant calling, a particularly important advantage for low-input samples or low-coverage regions.
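The family grouping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' toolset: the function names, the `min_family_size` and `min_agreement` parameters, and the `(tag, strand, sequence)` input format are all assumptions made for illustration, and reads are assumed to be already oriented to the same reference strand, of equal length, and labelled with a normalized tag.

```python
from collections import Counter, defaultdict


def consensus(seqs, min_agreement=0.7):
    """Per-position majority vote; emit 'N' where agreement is too low."""
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_agreement else "N")
    return "".join(out)


def duplex_consensus(reads, min_family_size=3):
    """Group reads into families and build duplex consensus sequences.

    reads: iterable of (tag, strand, sequence) with strand 'ab' or 'ba'.
    Thresholds are hypothetical defaults, not values from the abstract.
    """
    # 1. A family is all reads sharing the same tag and strand.
    families = defaultdict(list)
    for tag, strand, seq in reads:
        families[(tag, strand)].append(seq)

    # 2. Single-strand consensus; families below the size cutoff are lost.
    sscs = {
        key: consensus(seqs)
        for key, seqs in families.items()
        if len(seqs) >= min_family_size
    }

    # 3. Duplex consensus needs both strands; disagreements become 'N'.
    dcs = {}
    for (tag, strand), ab_seq in sscs.items():
        if strand != "ab":
            continue
        ba_seq = sscs.get((tag, "ba"))
        if ba_seq is not None:
            dcs[tag] = "".join(
                a if a == b else "N" for a, b in zip(ab_seq, ba_seq)
            )
    return families, sscs, dcs


reads = (
    [("AAGT", "ab", "ACGT")] * 3
    + [("AAGT", "ba", "ACGT")] * 3
    + [("AAGT", "ba", "ACTT")]   # one read with a likely PCR/sequencing error
    + [("CCTA", "ab", "ACGT")]   # a read without a family
)
families, sscs, dcs = duplex_consensus(reads)
print(dcs)
```

In this toy input the singleton read with tag `CCTA` never reaches a consensus, which is the kind of data loss the abstract describes for reads without a family.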



2021 ◽  
Vol 9 (6) ◽  
pp. 651
Author(s):  
Yan Yan ◽  
Hongyan Xing

To improve the detection of small floating targets in sea clutter, we build on the complete ensemble empirical mode decomposition (CEEMD) algorithm. The high-frequency and low-frequency parts are determined by the energy proportion of the intrinsic mode functions (IMFs); the high-frequency part is denoised by wavelet packet transform (WPT), and the denoised high-frequency IMFs together with the low-frequency IMFs reconstruct the pure sea clutter signal. Based on the chaotic characteristics of sea clutter, we proposed an adaptive training timesteps strategy: the training timesteps of the network were determined by the width of the embedding window, and a chaotic long short-term memory (LSTM) network detector was designed. The denoised sea clutter signals were predicted by the chaotic LSTM network, and small target signals were detected from the prediction errors. The experimental results showed that the CEEMD-WPT algorithm was consistent with the target distribution characteristics of sea clutter, and the denoising performance was improved by 33.6% on average. The proposed chaotic LSTM network, which determines the training step length according to the width of the embedding window, is a new detection method that can accurately detect small targets submerged in the background of sea clutter.



2021 ◽  
pp. jclinpath-2021-207421
Author(s):  
Frido K Bruehl ◽  
Erika E Doxtader ◽  
Yu-Wei Cheng ◽  
Daniel H Farkas ◽  
Carol Farver ◽  
...  

Aim: Various approaches have been reported for distinguishing separate primary lung adenocarcinomas from intrapulmonary metastases in patients with two lung nodules. The aim of this study was to determine whether histological assessment is reliable and accurate in distinguishing separate primary lung adenocarcinomas from intrapulmonary metastases using routine molecular findings as an adjunct. Methods: We studied resected tumour pairs from 32 patients with lung adenocarcinomas in different lobes. In 15 of 32 tumour pairs, next-generation sequencing (NGS) for common driver mutations was performed on both nodules. The remainder of tumour pairs underwent limited NGS, or EGFR genotyping. Tumour pairs with different drivers (or one driver/one wild-type) were classified as molecularly unrelated, while those with identical low-frequency drivers were classified as related. Three pathologists independently and blinded to the molecular results categorised tumour pairs as related or unrelated based on histological assessment. Results: Of 32 pairs, 15 were classified as related by histological assessment, and 17 as unrelated. Of 15 classified as related by histology, 6 were classified as related by molecular analysis, 4 were unrelated and 5 were indeterminate. Of 17 classified as unrelated by histology, 14 were classified as unrelated by molecular analysis, none was related and 3 were indeterminate. Histological assessment of relatedness was inaccurate in 4/32 (12.5%) tumour pairs. Conclusions: A small but significant subset of two-nodule adenocarcinoma pairs is inaccurately judged as related by histological assessment, and can be proven to be unrelated by molecular analysis (driver gene mutations), leading to significant downstaging.



Blood ◽  
2020 ◽  
Vol 136 (Supplement 1) ◽  
pp. 31-32
Author(s):  
Jacob Higgins ◽  
Fang Yin Lo ◽  
Michael J. Hipp ◽  
Charles C. Valentine ◽  
Lindsey N. Williams ◽  
...  

Sensitive and specific detection of measurable residual disease (MRD) after treatment in pediatric acute myeloid leukemia (AML) is prognostic of relapse and is important for clinical decision making. Mutation-based methods are increasingly being used, but are hampered by the limited number of common driver gene mutations to target as clone markers. Additional targets would greatly increase MRD detection power. However, even in cases with many AML-defining mutations, it is the limited accuracy of current molecular methods which establishes the lower bounds of sensitivity. Here we describe an ultrasensitive approach for disease monitoring with personalized hybrid capture panels targeting hundreds of somatic mutations identified by whole genome sequencing (WGS), and using extremely accurate Duplex Sequencing (DS) in longitudinal samples. In a pilot cohort of 13 patients we demonstrate detection sensitivities several orders of magnitude beyond currently available single locus testing or less accurate sequencing. With multi-target panels, overall power for MRD detection is cumulative across sites. For example, if a patient has MRD at a true frequency of 1/30,000, sequencing a single mutant site to 10,000x molecular depth would be unlikely to detect MRD. However, sequencing 10 sites each to 10,000x would effectively total 100,000x informative site depth, increasing power to >95%. Standard sequencing assays, though, are insufficiently accurate to achieve this theoretical limit of detection (LOD). DS enables accurate detection of individual variants to <10⁻⁵ with an error rate <10⁻⁷ and, thus, can achieve MRD sensitivities below one-in-one-million. Marrow aspirates were collected from 13 uniformly treated pediatric AML patients at time of diagnosis (TOD), during treatment (end of induction, EOI), in remission (end of therapy, EOT), and at relapse. 9/13 patients relapsed. DNA from TOD was analyzed by WGS. 
Germline variants were excluded and somatic single nucleotide variants (SNVs) were targeted by a custom probe panel designed for each patient. An average of 170 SNVs were targeted per patient (range 53-200). More than 90% of the SNVs were noncoding. Longitudinal samples were then analyzed with DS, which compares sequences from both strands of each DNA molecule to eliminate technical noise and reveal biological mutation signal with extreme accuracy and sensitivity. A median of 82% of WGS SNVs were validated by DS in the TOD DNA, and the vast majority of those were also present at relapse. Relapsers had more SNVs at diagnosis than non-relapsers. EOT samples were sequenced to an average Duplex molecular depth of 29,400x, with a maximum of 61,283x. The figure below shows time course plots tracking SNVs at diagnosis, EOT and relapse for 2 patients. Among mutations validated in TOD samples, a median of only 8 (5%) were detected per EOT sample (range 0-66 mutations). MRD was detected in 8/9 relapsers. Targeting 1 or even 10 SNVs would therefore have missed MRD in the majority of these patients. Among relapsers, median EOT SNV VAF was 0.069%. The lowest single VAF detected per EOT sample ranged from 0.036% to 0.002%. The presence of an SNV at diagnosis and relapse implies that it must truly be present at EOT, whether or not it is detected. Therefore, if a small minority of leukemic mutations are detected at EOT, the true overall MRD frequency is much lower than the LOD at any single site. In the only relapser where MRD was not detected, targeted SNVs were present at diagnosis and relapse, so additional sequencing depth at EOT would eventually reveal ultra-low frequency mutations. Among patients that did not relapse by the end of the study, median VAF at EOI (the latest time point DNA available) was 0.0258%. Therefore, non-relapsers have a lower median VAF at EOI than relapsers do even later at EOT, potentially indicating very early on that treatment is more successful. 
This study shows excellent performance of DS-based assays for detecting MRD with patient-specific panels. We have demonstrated that among large panels of validated somatic SNVs present at time of diagnosis, a median of 5% are identified at EOT in eventual relapsers. DS detected MRD in 8/9 patients, and at a median VAF well below the limit of detection of any other sequencing technology. Comprehensive personalized hybrid selection panels coupled with DS represent a powerful option for MRD monitoring in pediatric AML and potentially other cancers. [Figure] Disclosures: Higgins: TwinStrand Biosciences: Current Employment. Lo: TwinStrand Biosciences: Current Employment. Hipp: TwinStrand Biosciences: Current Employment. Valentine: TwinStrand Biosciences: Current Employment. Williams: TwinStrand Biosciences: Current Employment. Radich: TwinStrand Biosciences: Research Funding. Salk: TwinStrand Biosciences: Current Employment.
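The multi-site power example in this abstract (a true MRD frequency of 1/30,000, one site versus ten sites at 10,000x each) can be checked with a short calculation. The sketch below is an illustration under simplifying assumptions that are not stated in the abstract: every targeted site carries the mutation at the same frequency, molecules are sampled independently, detection means observing at least one mutant molecule, and there are no false positives. The function name `detection_power` is hypothetical.

```python
def detection_power(true_frequency: float, depth_per_site: int, n_sites: int = 1) -> float:
    """Probability of sampling at least one mutant molecule.

    Assumes a simple binomial model: each of the depth_per_site * n_sites
    informative molecules is mutant independently with probability
    true_frequency (an idealization, not the authors' statistical model).
    """
    total_depth = depth_per_site * n_sites
    return 1.0 - (1.0 - true_frequency) ** total_depth


single_site = detection_power(1 / 30_000, 10_000, n_sites=1)
ten_sites = detection_power(1 / 30_000, 10_000, n_sites=10)

print(f"1 site at 10,000x:   {single_site:.1%}")
print(f"10 sites at 10,000x: {ten_sites:.1%}")
```

Under these assumptions a single site gives a power well below one half, while ten sites exceed the 95% figure quoted in the abstract, which is consistent with the statement that power is cumulative across sites.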



2019 ◽  
Author(s):  
Xinyue You ◽  
Suresh Thiruppathi ◽  
Weiying Liu ◽  
Yiyi Cao ◽  
Mikihiko Naito ◽  
...  

ABSTRACT To improve the accuracy and the cost-efficiency of next-generation sequencing in ultralow-frequency mutation detection, we developed Paired-End and Complementary Consensus Sequencing (PECC-Seq), a PCR-free duplex consensus sequencing approach. PECC-Seq employed shear points as endogenous barcodes to identify consensus sequences from the overlap in the shortened, complementary DNA strands-derived paired-end reads for sequencing error correction. With the high accuracy of PECC-Seq, we identified the characteristic base substitution errors introduced by the end-repair process of mechanical fragmentation-based library preparations, which were prominent at the terminal 6 bp of the library fragments in the 5’-NpCpA-3’ or 5’-NpCpT-3’ trinucleotide context. As demonstrated at the human genome scale (TK6 cells), after removing these potential end-repair artifacts from the terminal 6 bp, PECC-Seq could reduce the sequencing error frequency to mid-10⁻⁷ with a relatively low sequencing depth. For TA base pairs, the background error rate could be suppressed to mid-10⁻⁸. In mutagen-treated TK6, slight increases in mutagen treatment-related mutant frequencies could be detected, indicating the potential of PECC-Seq in detecting genome-wide ultra-rare mutations. In addition, our finding on the patterns of end-repair artifacts may provide new insights in further reducing technical errors not only for PECC-Seq, but also for other next-generation sequencing techniques.
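The removal of candidate calls from the terminal 6 bp of each fragment, as described in this abstract, might look like the following Python sketch. It is a hypothetical illustration, not the PECC-Seq pipeline: the `Call` record, its 0-based half-open coordinates, and the function names are assumptions, and the trinucleotide-context check is deliberately left out because the abstract does not say which base of the context is substituted.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    """Hypothetical record for a candidate variant and its fragment."""
    pos: int         # 0-based reference position of the candidate variant
    frag_start: int  # 0-based start of the supporting fragment (inclusive)
    frag_end: int    # end of the supporting fragment (exclusive)


def in_terminal_region(call: Call, trim: int = 6) -> bool:
    """True if the variant lies within `trim` bases of either fragment end."""
    from_start = call.pos - call.frag_start  # 0 for the first base
    from_end = call.frag_end - call.pos      # 1 for the last base
    return from_start < trim or from_end <= trim


def drop_terminal_calls(calls: List[Call], trim: int = 6) -> List[Call]:
    """Keep only calls outside the terminal region of their fragment."""
    return [c for c in calls if not in_terminal_region(c, trim)]


calls = [
    Call(pos=102, frag_start=100, frag_end=200),  # 3rd base of the fragment
    Call(pos=150, frag_start=100, frag_end=200),  # interior
    Call(pos=197, frag_start=100, frag_end=200),  # 3rd base from the end
]
kept = drop_terminal_calls(calls)
print(kept)
```

Only the interior call survives in this example; the default `trim=6` mirrors the 6 bp window named in the abstract.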



2020 ◽  
Author(s):  
Anja Furtwängler ◽  
Judith Neukamm ◽  
Lisa Böhme ◽  
Ella Reiter ◽  
Melanie Vollstedt ◽  
...  

Abstract In ancient DNA research, the degraded nature of the samples generally results in poor yields of highly fragmented DNA, and targeted DNA enrichment is thus required to maximize research outcomes. The three commonly used methods – (1) array-based hybridization capture and in-solution capture using either (2) RNA or (3) DNA baits – have different characteristics that may influence the capture efficiency, specificity, and reproducibility. Here, we compared their performance in enriching pathogen DNA of Mycobacterium leprae and Treponema pallidum of 11 ancient and 19 modern samples. We find that in-solution approaches are the most effective method in ancient and modern samples of both pathogens, and RNA baits usually perform better than DNA baits. Method summary We compared three targeted DNA enrichment strategies used in ancient DNA research for the specific enrichment of pathogen DNA regarding their efficiency, specificity, and reproducibility for ancient and modern Mycobacterium leprae and Treponema pallidum samples. Array-based capture and in-solution capture with RNA and DNA baits were all tested in three independent replicates.





2016 ◽  
Vol 16 (3) ◽  
pp. 357-372 ◽  
Author(s):  
Leomar Y. Ballester ◽  
Rajyalakshmi Luthra ◽  
Rashmi Kanagal-Shamanna ◽  
Rajesh R. Singh


BioTechniques ◽  
2020 ◽  
Vol 69 (6) ◽  
pp. 455-459
Author(s):  
Anja Furtwängler ◽  
Judith Neukamm ◽  
Lisa Böhme ◽  
Ella Reiter ◽  
Melanie Vollstedt ◽  
...  

In ancient DNA research, the degraded nature of the samples generally results in poor yields of highly fragmented DNA; targeted DNA enrichment is thus required to maximize research outcomes. The three commonly used methods – array-based hybridization capture and in-solution capture using either RNA or DNA baits – have different characteristics that may influence the capture efficiency, specificity and reproducibility. Here we compare their performance in enriching pathogen DNA of Mycobacterium leprae and Treponema pallidum from 11 ancient and 19 modern samples. We find that in-solution approaches are the most effective method in ancient and modern samples of both pathogens and that RNA baits usually perform better than DNA baits.


