The impact of DNA polymerase and number of rounds of amplification in PCR on 16S rRNA gene sequence data

2019 ◽  
Author(s):  
Marc A Sze ◽  
Patrick D Schloss

Abstract: PCR amplification of 16S rRNA genes is a critical yet underappreciated step in the generation of sequence data to describe the taxonomic composition of microbial communities. Numerous factors in the design of PCR can impact the sequencing error rate, the abundance of chimeric sequences, and the degree to which the fragments in the product represent their abundance in the original sample (i.e., bias). We compared the performance of high-fidelity polymerases and varying numbers of rounds of amplification when amplifying a mock community and human stool samples. Although it was impossible to derive specific recommendations, we did observe general trends. Namely, using a polymerase with the highest possible fidelity and minimizing the number of rounds of PCR reduced the sequencing error rate, fraction of chimeric sequences, and bias. Evidence of bias at the sequence level was subtle and could not be ascribed to the fragments’ fraction of bases that were guanines or cytosines. When analyzing mock community data, the amount that the community deviated from the expected composition increased with rounds of PCR. This bias was inconsistent for human stool samples. Overall, the results underscore the difficulty of comparing sequence data that are generated by different PCR protocols. However, the results indicate that the variation in human stool samples is generally larger than that introduced by the choice of polymerase or number of rounds of PCR.

Importance: A steep decline in sequencing costs drove an explosion in studies characterizing microbial communities from diverse environments. Although a significant amount of effort has gone into understanding the error profiles of DNA sequencers, little has been done to understand the downstream effects of the PCR amplification protocol. We quantified the effects of the choice of polymerase and number of PCR cycles on the quality of downstream data. We found that these choices can have a profound impact on the way that a microbial community is represented in the sequence data. The effects are relatively small compared to the variation in human stool samples; however, care should be taken to use polymerases with the highest possible fidelity and to minimize the number of rounds of PCR. These results also underscore that it is not possible to directly compare sequence data generated under different PCR conditions.

mSphere ◽  
2019 ◽  
Vol 4 (3) ◽  
Author(s):  
Marc A. Sze ◽  
Patrick D. Schloss

ABSTRACT: PCR amplification of 16S rRNA genes is a critical yet underappreciated step in the generation of sequence data to describe the taxonomic composition of microbial communities. Numerous factors in the design of PCR can impact the sequencing error rate, the abundance of chimeric sequences, and the degree to which the fragments in the product represent their abundance in the original sample (i.e., bias). We compared the performance of high-fidelity polymerases and various numbers of rounds of amplification when amplifying a mock community and human stool samples. Although it was impossible to derive specific recommendations, we did observe general trends. Namely, using a polymerase with the highest possible fidelity and minimizing the number of rounds of PCR reduced the sequencing error rate, fraction of chimeric sequences, and bias. Evidence of bias at the sequence level was subtle and could not be ascribed to the fragments’ fraction of bases that were guanines or cytosines. When analyzing mock community data, the amount that the community deviated from the expected composition increased with the number of rounds of PCR. This bias was inconsistent for human stool samples. Overall, the results underscore the difficulty of comparing sequence data that are generated by different PCR protocols. However, the results indicate that the variation in human stool samples is generally larger than that introduced by the choice of polymerase or number of rounds of PCR.

IMPORTANCE: A steep decline in sequencing costs drove an explosion in studies characterizing microbial communities from diverse environments. Although a significant amount of effort has gone into understanding the error profiles of DNA sequencers, little has been done to understand the downstream effects of the PCR amplification protocol. We quantified the effects of the choice of polymerase and number of PCR cycles on the quality of downstream data. We found that these choices can have a profound impact on the way that a microbial community is represented in the sequence data. The effects are relatively small compared to the variation in human stool samples; however, care should be taken to use polymerases with the highest possible fidelity and to minimize the number of rounds of PCR. These results also underscore that it is not possible to directly compare sequence data generated under different PCR conditions.


2018 ◽  
Author(s):  
Alexandra Perras ◽  
Kaisa Koskinen ◽  
Maximilian Mora ◽  
Michael Beck ◽  
Lisa Wink ◽  
...  

Abstract: The gut microbiome is strongly interwoven with human health. Conventional gut microbiome analysis generally involves 16S rRNA gene-targeting next-generation sequencing (NGS) of stool microbial communities, and correlation of results with clinical parameters. However, some microorganisms may not be alive at the time of sampling, and thus their impact on human health is potentially less significant. As conventional NGS methods do not differentiate between viable and dead microbial components, retrieved results provide only limited information. Propidium monoazide (PMA) is frequently used in food safety monitoring and other disciplines to discriminate living from dead cells. PMA binds to free DNA and masks it for subsequent procedures. In this article we show the impact of PMA on the results of 16S rRNA gene-targeting NGS from human stool samples and validate the optimal applicable concentration to achieve a reliable detection of the living microbial communities. Fresh stool samples were treated with a concentration series of zero to 300 μM PMA, and were subsequently subjected to amplicon-based NGS. The results indicate that a substantial proportion of the human microbial community is not intact at the time of sampling. PMA treatment significantly reduced the diversity and richness of the sample depending on the concentration and impacted the relative abundance of certain important microorganisms (e.g., Akkermansia, Bacteroides). Overall, we found that a concentration of 100 μM PMA was sufficient to quench signals from disrupted microbial cells. The optimized protocol proposed here can be easily implemented in classical microbiome analyses, and helps to retrieve an improved and less blurry picture of the microbial community composition by excluding signals from background DNA.


Author(s):  
Patrick D Schloss ◽  
Sarah L Westcott ◽  
Matthew L Jenior ◽  
Sarah K Highlander

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality, but short sequences. These platforms have allowed researchers to significantly improve the design of their experiments. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The synthetic mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 2.16% to 0.32%. Unfortunately, this error rate was still 16-times higher than the error rate that has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the longer reads frequently provided better classification, the wider adoption of this approach for 16S rRNA gene sequencing is likely limited by its high sequencing error and low yield of sequencing data relative to the other available platforms.
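The abstract above relies on a mock community of known composition to measure the sequencing error rate. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes each read has already been aligned to its true reference sequence and has the same length, so only substitutions are counted. The function name `per_base_error_rate` and the toy sequences are hypothetical.

```python
def per_base_error_rate(reads, reference):
    """Fraction of bases across all reads that differ from the reference.

    Assumption (not from the source): reads are pre-aligned to the
    reference and have equal length, so insertions and deletions
    are ignored and only substitutions are counted.
    """
    mismatches = 0
    total_bases = 0
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("read and reference must have equal length")
        # Count positions where the read disagrees with the known sequence.
        mismatches += sum(a != b for a, b in zip(read, reference))
        total_bases += len(reference)
    if total_bases == 0:
        raise ValueError("no reads supplied")
    return mismatches / total_bases


# Toy example: one substitution in eight sequenced bases.
toy_rate = per_base_error_rate(["ACGT", "ACGA"], "ACGT")
```

On the toy input the rate is 1/8. Read against the figures in the abstract, a curated rate of 0.32% that is 16 times higher than the short-read platforms implies a short-read rate of roughly 0.02%.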


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 81-82
Author(s):  
Joaquim Casellas ◽  
Melani Martín de Hijas-Villalba ◽  
Marta Vázquez-Gómez ◽  
Samir Id Lahoucine

Abstract: Current European regulations for autochthonous livestock breeds put a special emphasis on pedigree completeness, which requires laboratory paternity testing by genetic markers in most cases. This entails significant economic expenditure for breed societies and precludes other investments in breeding programs, such as genomic evaluation. Within this context, we developed paternity testing through low-coverage whole-genome data in order to reuse these data for genomic evaluation at no cost. Simulations relied on diploid genomes composed of 30 chromosomes (100 cM each) with 3,000,000 SNP per chromosome. Each population evolved for 1,000 non-overlapping generations with effective size 100, mutation rate 10⁻⁴, and recombination by Kosambi’s function. Only those populations with 1,000,000 ± 10% polymorphic SNP per chromosome in generation 1,000 were retained for further analyses, and expanded to the required number of parents and offspring. Individuals were sequenced at 0.01, 0.05, 0.1, 0.5 and 1X depth, with 100, 500, 1,000 or 10,000 base-pair reads and by assuming a random sequencing error rate per SNP between 10⁻² and 10⁻⁵. Assuming known allele frequencies in the population and sequencing error rate, 0.05X depth sufficed to corroborate the true father (85.0%) and to discard other candidates (96.3%). Those percentages increased up to 99.6% and 99.9% with 0.1X depth, respectively (read length = 10,000 bp; smaller read lengths slightly improved the results because they increase the number of sequenced SNP). Results were highly sensitive to biases in allele frequencies and robust to inaccuracies regarding sequencing error rate. Low-coverage whole-genome sequencing data could be subsequently integrated into genomic BLUP equations by appropriately constructing the genomic relationship matrix. This approach increased the correlation between simulated and predicted breeding values by 1.21% (h² = 0.25; 100 parents and 900 offspring; 0.1X depth by 10,000 bp reads). 
Although small, this increase opens the door to genomic evaluation in local livestock breeds.
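To give a sense of how much marker information survives at such low depth, here is a rough back-of-the-envelope sketch. It is not the authors' simulation: it assumes the number of reads covering any site is Poisson-distributed with mean equal to the sequencing depth, so a site is covered at least once with probability 1 - exp(-depth). The function name `expected_covered_snps` is hypothetical; the SNP and chromosome counts are the ones quoted in the abstract.

```python
import math


def expected_covered_snps(depth, snps_per_chromosome, n_chromosomes):
    """Expected number of SNPs covered by at least one read.

    Assumption (not from the source): per-site coverage follows a
    Poisson distribution with mean `depth`, independent across sites.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    # P(at least one read) = 1 - P(zero reads) under the Poisson model.
    p_covered = 1.0 - math.exp(-depth)
    return p_covered * snps_per_chromosome * n_chromosomes


# Values quoted in the abstract: 0.05X depth, about 1,000,000
# polymorphic SNP per chromosome, 30 chromosomes.
covered = expected_covered_snps(0.05, 1_000_000, 30)
```

Under this simplified model, about 4.9% of sites are covered at 0.05X, which still leaves on the order of 1.46 million of the 30 million polymorphic SNP observed at least once per individual.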


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Lukas Hafner ◽  
Maxime Pichon ◽  
Christophe Burucoa ◽  
Sophie H. A. Nusser ◽  
Alexandra Moura ◽  
...  

Abstract: The Listeria genus comprises two pathogenic species, L. monocytogenes (Lm) and L. ivanovii, and non-pathogenic species. All can thrive as saprophytes, whereas only pathogenic species cause systemic infections. Identifying Listeria species’ respective biotopes is critical to understand the ecological contribution of Listeria virulence. In order to investigate the prevalence and abundance of Listeria species in various sources, we retrieved and analyzed 16S rRNA datasets from the MG-RAST metagenomic database. 26% of datasets contain Listeria sensu stricto sequences, and Lm is the most prevalent species, most abundant in soil and host-associated environments, including 5% of human stools. Lm is also detected in 10% of human stool samples from an independent cohort of 900 healthy asymptomatic donors. A specific microbiota signature is associated with Lm faecal carriage, both in humans and experimentally inoculated mice, in which it precedes Lm faecal carriage. These results indicate that Lm faecal carriage is common and depends on the gut microbiota, and suggest that Lm faecal carriage is a crucial yet overlooked consequence of its virulence.


2020 ◽  
Author(s):  
Li Hou ◽  
Yadong Wang

Abstract: Background: In recent years, advances in sequencing technology have led to the wide use of long reads in many studies, including transcriptomics studies. Long reads offer clear advantages over short reads, but aligning long reads also differs from aligning short reads. Many tools can now process long RNA-Seq reads, but some problems remain to be solved. Results: We developed Deep-Long, a fast and accurate tool for processing long RNA-Seq reads. Deep-Long copes well with the difficulties that arise from complicated gene structures and sequencing errors, and it performs especially well on alternative splicing and small exons. When the sequencing error rate is low, Deep-Long rapidly produces more accurate results. As the sequencing error rate rises, Deep-Long takes more time but remains faster and more accurate than most other tools. Conclusions: Deep-Long is a useful tool for aligning long RNA-Seq reads to a genome, and it can find more exons and splice junctions.


Author(s):  
V. N. Agi ◽  
C. A. Azike

Background: The microbial ecosystem in the human intestine is complex and plays a major role in health and nutrition. Culture-based techniques have been used over the years to study the gut microbiota, but studies suggest that a large percentage of the bacteria found in the gut cannot be cultivated using conventional methods of bacterial isolation. Aim: To increase understanding in this area, we characterized the bacterial diversity (both cultivated and non-cultivated bacteria) in the gut of diarrhoeic individuals using 16S rRNA gene (rDNA) sequences. Methodology: PCR amplification, sequencing and phylogenetic analysis of the 16S ribosomal DNA (rDNA) sequences were done on 10 diarrhoeic stool samples. Results: After quality filtering and chimeric sequence removal, clustering of 72,313 sequences from all 10 diarrhoeic stool samples generated 2,767 Operational Taxonomic Units (OTUs), of which 2,073 were new and unassigned. Representative sequences of the bacterial OTU clusters were used to construct a bacterial phylogenetic tree, which revealed a wide variety of bacteria, including Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Tenericutes, Cyanobacteria and others that could not be detected using culture-based techniques. The evolutionary relationship of the most abundant organisms and their contributions from each sample showed the phylum Firmicutes to be the most abundant, and therefore the largest contributor to the samples, followed by Bacteroidetes. Fewer contributions were made by the other phyla: Proteobacteria, Actinobacteria, Tenericutes and Cyanobacteria. Conclusion: This study identified culturable and unculturable bacteria in the gut of diarrhoeic people in Rivers State and showed the biodiversity and interrelatedness of these microorganisms using molecular methods. Therefore, 16S rRNA techniques for the detection and identification of predominant bacteria create new opportunities for non-cultivation studies of the human intestinal microflora, proper diagnosis of infectious diseases, and new methods of treating diseases.


Author(s):  
Patrick D Schloss ◽  
Matthew L Jenior ◽  
Charles C. Koumpouras ◽  
Sarah L Westcott ◽  
Sarah K Highlander

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality, but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.


2002 ◽  
Vol 83 (9) ◽  
pp. 2215-2223 ◽  
Author(s):  
A. C. Potgieter ◽  
A. D. Steele ◽  
A. A. van Dijk

Cloning full-length large (>3 kb) dsRNA genome segments from small amounts of dsRNA has thus far remained problematic. Here, a single-primer amplification sequence-independent dsRNA cloning procedure was perfected for large genes and tailored for routine use to clone complete genome sets or individual genes. Nine complete viral genome sets were amplified by PCR, namely those of two human rotaviruses, two African horsesickness viruses (AHSV), two equine encephalosis viruses (EEV), one bluetongue virus (BTV), one reovirus and bacteriophage Φ12. Of these amplified genomes, six complete genome sets were cloned for viruses with genes ranging in size from 0·8 to 6·8 kb. Rotavirus dsRNA was extracted directly from stool samples. Co-expressed EEV VP3 and VP7 assembled into core-like particles that have typical orbivirus capsomeres. This work presents the first EEV sequence data and establishes that EEV genes have the same conserved termini (5′ GUU and UAC 3′) and coding assignment as AHSV and BTV. To clone complete genome sets, one-tube reactions were developed for oligo-ligation, cDNA synthesis and PCR amplification. The method is simple and efficient compared to other methods. Complete genomes can be cloned from as little as 1 ng dsRNA and a considerably reduced number of PCR cycles (22–30 cycles compared to 30–35 of other methods). This progress with cloning large dsRNA genes is important for recombinant vaccine development and determination of the role of terminal sequences for replication and gene expression.

