Optimization of Enzymatic Fragmentation is Crucial To Maximize Genome Coverage: A Comparison of Library Preparation Methods for Illumina Sequencing

Author(s):  
Teodora Ribarska ◽  
Pål Marius Bjørnstad ◽  
Arvind Y.M. Sundaram ◽  
Gregor D. Gilfillan

Abstract

Background: Novel commercial kits for whole genome library preparation for next-generation sequencing on Illumina platforms promise shorter workflows, lower inputs and cost savings. Time savings are achieved by employing enzymatic DNA fragmentation and by combining end-repair and tailing reactions. Fewer cleanup steps also allow greater DNA input flexibility (1 ng to 1 µg), PCR-free options from 100 ng DNA, and a lower price compared to the well-established sonication- and tagmentation-based DNA library preparation kits.

Results: We compared the performance of four enzymatic fragmentation-based DNA library preparation kits (from New England Biolabs, Roche, Swift Biosciences and Quantabio) to a tagmentation-based kit (Illumina) using low input DNA amounts (10 ng) and PCR-free reactions with 100 ng DNA. With four technical replicates of each input amount and kit, we compared the kits' fragmentation sequence bias as well as performance parameters such as sequence coverage and the clinically relevant detection of single nucleotide and indel variants. While all kits produced high quality sequence data and demonstrated similar performance, several enzymatic fragmentation methods produced library insert sizes which deviated from those intended. Libraries with longer insert lengths performed better in terms of coverage, SNV and indel detection. Lower performance of shorter-insert libraries could be explained by loss of sequence coverage to overlapping paired-end reads, exacerbated by the preferential sequencing of shorter fragments on Illumina sequencers. We also observed that libraries prepared with minimal or no PCR performed best with regard to indel detection.

Conclusions: The enzymatic fragmentation-based DNA library preparation kits from NEB, Roche, Swift and Quantabio are good alternatives to the tagmentation-based Nextera DNA Flex kit from Illumina, offering reproducible results using flexible DNA inputs, quick workflows and lower prices. Libraries with insert DNA fragments longer than the cumulative sum of both read lengths avoid read overlap and thus produce more informative data, which leads to strongly improved genome coverage and consequently also increased sensitivity and precision of SNP and indel detection. In order to best utilize such enzymatic fragmentation reagents, researchers should be prepared to invest time to optimize fragmentation conditions for their particular samples.
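The read-overlap argument in the conclusions can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the 150 bp read length are assumptions chosen for the example, not values taken from the study, and it ignores adapter trimming and base-quality effects.

```python
def overlap_bases(insert_len: int, read_len: int) -> int:
    """Insert bases that are sequenced by BOTH reads of a pair."""
    # A read cannot cover more insert bases than the insert contains.
    covered_per_read = min(insert_len, read_len)
    # Anything the two reads cover beyond the insert length is double-counted.
    return max(0, 2 * covered_per_read - insert_len)


def unique_bases(insert_len: int, read_len: int) -> int:
    """Distinct insert bases covered by the read pair."""
    return min(insert_len, 2 * read_len)


if __name__ == "__main__":
    READ_LEN = 150  # hypothetical read length (2 x 150 bp run), not from the paper
    for insert_len in (100, 200, 300, 400):
        print(
            insert_len,
            overlap_bases(insert_len, READ_LEN),
            unique_bases(insert_len, READ_LEN),
        )
```

Under these assumptions, overlap disappears once the insert is at least twice the read length, which matches the abstract's statement about inserts longer than the sum of both read lengths.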

Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 332 ◽  
Author(s):  
Krishnamoorthy Srikanth ◽  
Jong-Eun Park ◽  
Dajeong Lim ◽  
Jihye Cha ◽  
Sang-Rae Cho ◽  
...  

Until recently, genome-scale phasing was limited by the short read lengths of sequence data. Although long-read sequencing can overcome this limitation, long reads require extensive error correction. The emergence of technologies such as 10X Genomics linked-read sequencing and Hi-C, which use short-read sequencers along with library preparation protocols that facilitate long-read assemblies, has greatly reduced the complexity of genome-scale phasing. Moreover, it is possible to accurately assemble the phased genome of an individual sample using these methods. Therefore, in this study, we compared three phasing strategies, namely 10X-LG, 10X-HapCut2, and HiC-HapCut2, which combine two sample preparation methods with the Long Ranger pipeline of 10X Genomics and the HapCut2 software, and assessed their performance and accuracy. We found that 10X-LG had the best phasing performance among the methods analyzed: it had the highest phasing rate (89.6%), the longest adjusted N50 (1.24 Mb), and the lowest switch error rate (0.07%). Moreover, the phasing accuracy and yield of 10X-LG stayed over 90% for distances up to 4 Mb and 550 Kb respectively, which were considerably higher than those of 10X-HapCut2 and HiC-HapCut2. The results of this study will serve as a good reference for future benchmarking studies and also for reference-based imputation in Hanwoo.
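For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of the plain N50 of a set of phase-block lengths. It is not the adjusted N50 reported in the study, whose exact adjustment is not described in the abstract; the function name and the example lengths are hypothetical.

```python
def n50(block_lengths):
    """Plain N50: the length L such that blocks of length >= L
    together contain at least half of the total block length."""
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    ordered = sorted(block_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first block that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical phase-block lengths in base pairs.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```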


Author(s):  
John R Tyson ◽  
Phillip James ◽  
David Stoddart ◽  
Natalie Sparks ◽  
Arthur Wickenhagen ◽  
...  

Abstract Genome sequencing has been widely deployed to study the evolution of SARS-CoV-2, with more than 90,000 genome sequences uploaded to the GISAID database. We published a method for SARS-CoV-2 genome sequencing (https://www.protocols.io/view/ncov-2019-sequencing-protocol-bbmuik6w) online on January 22, 2020. This approach has rapidly become the most popular method for sequencing SARS-CoV-2 due to its simplicity and cost-effectiveness. Here we present improvements to the original protocol: i) an updated primer scheme with 22 additional primers to improve genome coverage, ii) a streamlined library preparation workflow which improves demultiplexing rate for up to 96 samples and reduces hands-on time by several hours, and iii) cost savings which bring the reagent cost down to £10 per sample, making it practical for individual labs to sequence thousands of SARS-CoV-2 genomes to support national and international genomic epidemiology efforts.


BMC Genomics ◽  
2014 ◽  
Vol 15 (1) ◽  
pp. 912 ◽  
Author(s):  
Adriana Alberti ◽  
Caroline Belser ◽  
Stéfan Engelen ◽  
Laurie Bertrand ◽  
Céline Orvain ◽  
...  

2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Momchilo Vuyisich ◽  
Ayesha Arefin ◽  
Karen Davenport ◽  
Shihai Feng ◽  
Cheryl Gleasner ◽  
...  

Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used.


2021 ◽  
Author(s):  
Ryan O Schenck ◽  
Gabriel Brosula ◽  
Jeffrey West ◽  
Simon Leedham ◽  
Darryl Shibata ◽  
...  

Gattaca provides the first base-pair resolution artificial genomes for tracking somatic mutations within agent-based modeling. Through the incorporation of human reference genomes, mutational context, and sequence coverage/error information, Gattaca is able to realistically provide sequence data from in-silico evolution studies that are comparable with data from human somatic evolution studies. This user-friendly method, incorporated into each in-silico cell, allows us to fully capture somatic mutation spectra and evolution.


protocols.io ◽  
2021 ◽  
Author(s):  
Elias Dahdouh ◽  
Fernando Lázaro Perona ◽  
María Rodríguez Tejedor ◽  
Rubén Cáceres Sánchez ◽  
Iván Bloise Sánchez ◽  
...  

Genes ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 949 ◽  
Author(s):  
Sureshnee Pillay ◽  
Jennifer Giandhari ◽  
Houriiyah Tegally ◽  
Eduan Wilkinson ◽  
Benjamin Chimukangara ◽  
...  

The COVID-19 pandemic has spread very fast around the world. A few days after the first detected case in South Africa, an infection started in a large hospital outbreak in Durban, KwaZulu-Natal (KZN). Phylogenetic analysis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes can be used to trace the path of transmission within a hospital. It can also identify the source of the outbreak and provide lessons to improve infection prevention and control strategies. This manuscript outlines the obstacles encountered in order to genotype SARS-CoV-2 in near-real time during an urgent outbreak investigation. This included problems with the length of the original genotyping protocol, unavailability of reagents, and sample degradation and storage. Despite this, three different library preparation methods for Illumina sequencing were set up, and the hands-on library preparation time was decreased from twelve to three hours, which enabled the outbreak investigation to be completed in just a few weeks. Furthermore, the new protocols increased the success rate of sequencing whole viral genomes. A simple bioinformatics workflow for the assembly of high-quality genomes in near-real time was also fine-tuned. In order to allow other laboratories to learn from our experience, all of the library preparation and bioinformatics protocols are publicly available at protocols.io and distributed to other laboratories of the Network for Genomics Surveillance in South Africa (NGS-SA) consortium.


2010 ◽  
Vol 76 (12) ◽  
pp. 3863-3868 ◽  
Author(s):  
J. Kirk Harris ◽  
Jason W. Sahl ◽  
Todd A. Castoe ◽  
Brandie D. Wagner ◽  
David D. Pollock ◽  
...  

ABSTRACT Constructing mixtures of tagged or bar-coded DNAs for sequencing is an important requirement for the efficient use of next-generation sequencers in applications where limited sequence data are required per sample. There are many applications in which next-generation sequencing can be used effectively to sequence large mixed samples; an example is the characterization of microbial communities where ≤1,000 sequences per sample are adequate to address research questions. Thus, it is possible to examine hundreds to thousands of samples per run on massively parallel next-generation sequencers. However, the cost savings from efficient utilization of sequence capacity are realized only if the production and management costs associated with construction of multiplex pools are also scalable. One critical step in multiplex pool construction is the normalization process, whereby equimolar amounts of each amplicon are mixed. Here we compare three approaches (spectroscopy, size-restricted spectroscopy, and quantitative binding) for normalization of large, multiplex amplicon pools for performance and efficiency. We found that the quantitative binding approach was superior and represents an efficient scalable process for construction of very large, multiplex pools with hundreds and perhaps thousands of individual amplicons included. We demonstrate the increased sequence diversity identified with higher throughput. Massively parallel sequencing can dramatically accelerate microbial ecology studies by allowing appropriate replication of sequence acquisition to account for temporal and spatial variations. Further, population studies to examine genetic variation, which require even lower levels of sequencing, should be possible where thousands of individual bar-coded amplicons are examined in parallel.
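The equimolar mixing step described above comes down to simple arithmetic when amplicon concentrations have been measured. The sketch below is a generic illustration, not the authors' procedure: the function names, the example amplicons, and the 660 g/mol average mass per base pair of double-stranded DNA are assumptions introduced here.

```python
AVG_BP_MASS = 660.0  # assumed average g/mol per base pair of double-stranded DNA


def molarity_nm(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a mass concentration (ng/µL) to molarity (nM) for dsDNA."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pooling_volumes_ul(amplicons: dict, target_fmol: float) -> dict:
    """Volume (µL) of each amplicon that contributes target_fmol to the pool.

    amplicons maps a sample name to (concentration in ng/µL, length in bp).
    1 nM equals 1 fmol/µL, so volume = amount / molarity.
    """
    return {
        name: target_fmol / molarity_nm(conc, length)
        for name, (conc, length) in amplicons.items()
    }


if __name__ == "__main__":
    # Hypothetical measurements for two bar-coded amplicons.
    samples = {"sample_A": (6.6, 1000), "sample_B": (13.2, 1000)}
    for name, volume in pooling_volumes_ul(samples, target_fmol=50).items():
        print(f"{name}: {volume:.2f} µL")
```

A sample at twice the concentration contributes half the volume, so each amplicon ends up equally represented in the pool.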

