Fast and accurate average genome size and 16S rRNA gene average copy number computation in metagenomic data

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Emiliano Pereira-Flores ◽  
Frank Oliver Glöckner ◽  
Antonio Fernandez-Guerra
2021 ◽  
Author(s):  
Yingnan Gao ◽  
Martin Wu

Background: The 16S rRNA gene has been widely used in microbial diversity studies to determine community composition and structure. 16S rRNA gene copy number (16S GCN) varies among microbial species, and this variation biases the relative cell abundance estimated from 16S rRNA read counts. To correct these biases, methods (e.g., PICRUSt2) have been developed to predict 16S GCN. 16S GCN predictions come with inherent uncertainty, which is often ignored in downstream analyses. However, a recent study suggests that the uncertainty can be so great that copy number correction is not justified in practice. Despite its significant implications for 16S rRNA based microbial diversity studies, the uncertainty associated with 16S GCN predictions has not been well characterized, and its impact on microbial diversity studies needs to be investigated. Results: Here we develop RasperGade16S, a novel method and software to better model and capture the inherent uncertainty in 16S rRNA GCN prediction. RasperGade16S implements a maximum likelihood framework of a pulsed evolution model and explicitly accounts for intraspecific GCN variation and heterogeneous GCN evolution rates among species. Using cross validation, we show that our method provides robust confidence estimates for the GCN predictions and outperforms PICRUSt2 in both precision and recall. We have predicted GCN for 592605 OTUs in the SILVA database and tested 113842 bacterial communities that represent an exhaustive and diverse list of engineered and natural environments. We found that the prediction uncertainty is small enough for 99% of the communities that 16S GCN correction should improve their compositional and functional profiles estimated using 16S rRNA reads. On the other hand, we found that GCN variation has limited impact on beta-diversity analyses such as PCoA, PERMANOVA and the random forest test.
Conclusion: We have developed a method to accurately account for uncertainty in 16S rRNA GCN predictions and the downstream analyses. For almost all 16S rRNA surveyed bacterial communities, correction of 16S GCN should improve the results when estimating their compositional and functional profiles. However, such correction is not necessary for beta-diversity analyses.
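The copy number correction discussed in this abstract can be illustrated with a minimal sketch. It is not RasperGade16S or PICRUSt2 code: the function name, the taxon labels, and the read counts and copy numbers are hypothetical, and the sketch takes the predicted GCN values as given without modelling their uncertainty. It only shows the basic arithmetic of dividing read counts by copy number and renormalizing.

```python
def correct_for_gcn(read_counts, gcn):
    """Turn 16S read counts into estimated relative cell abundances.

    read_counts: dict mapping taxon -> number of 16S reads (hypothetical data)
    gcn:         dict mapping taxon -> predicted 16S gene copy number
    """
    # Dividing reads by copy number gives a quantity proportional to cell count.
    cell_equivalents = {t: n / gcn[t] for t, n in read_counts.items()}
    total = sum(cell_equivalents.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    # Renormalize so the corrected abundances sum to 1.
    return {t: v / total for t, v in cell_equivalents.items()}


# Hypothetical two-taxon community: taxon_A carries 7 copies, taxon_B carries 1.
reads = {"taxon_A": 700, "taxon_B": 300}
copies = {"taxon_A": 7, "taxon_B": 1}
corrected = correct_for_gcn(reads, copies)
```

In this made-up example taxon_A accounts for 70% of the reads but only 25% of the estimated cells, which is the kind of bias the correction is meant to remove.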


2016 ◽  
Author(s):  
Tatsuhiko Hoshino ◽  
Fumio Inagaki

Abstract: Next-generation sequencing (NGS) is a powerful tool for analyzing environmental DNA and provides a comprehensive molecular view of microbial communities. To obtain the copy number of particular sequences in the NGS library, however, additional quantitative analysis such as quantitative PCR (qPCR) or digital PCR (dPCR) is required. Furthermore, the number of sequences in a sequence library does not always reflect the original copy number of a target gene because of biases caused by PCR amplification, making it difficult to convert the proportion of particular sequences in the NGS library to the copy number using the mass of input DNA. To address this issue, we applied a stochastic labeling approach with random-tag sequences and developed an NGS-based quantification protocol, which enables simultaneous sequencing and quantification of the targeted DNA. This quantitative sequencing (qSeq) is initiated from single-primer extension (SPE) using a primer with a random tag adjacent to the 5' end of the target-specific sequence. During SPE, each DNA molecule is stochastically labeled with the random tag. Subsequently, first-round PCR is conducted, specifically targeting the SPE product, followed by second-round PCR to index for NGS. The number of random tags is determined only during the SPE step and is therefore not affected by the two rounds of PCR that may introduce amplification biases. In the case of 16S rRNA genes, after NGS sequencing and taxonomic classification, the absolute number of 16S rRNA genes of each target phylotype can be estimated using Poisson statistics by counting the random tags incorporated at the end of the sequence. To test the feasibility of this approach, the 16S rRNA gene of Sulfolobus tokodaii was subjected to qSeq, which resulted in accurate quantification of 5.0 × 10^3 to 5.0 × 10^4 copies of the 16S rRNA gene.
Furthermore, qSeq was applied to mock microbial communities and environmental samples, and the results were comparable to those obtained using digital PCR and relative abundance based on a standard sequence library. We demonstrated that the qSeq protocol proposed here is advantageous for providing less-biased absolute copy numbers of each target DNA with NGS sequencing at one time. With this new experimental scheme in microbial ecology, microbial community compositions can be explored in a more quantitative manner, thus expanding our knowledge of microbial ecosystems in natural environments.
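The abstract does not state the exact Poisson formula used in qSeq, so the following is only a sketch of one common form of this kind of estimate. It assumes each molecule receives a tag drawn uniformly at random from a pool of 4^L possible tags, so that the expected number of distinct tags k observed for N molecules is roughly M(1 − e^(−N/M)) with M = 4^L, which inverts to N ≈ −M ln(1 − k/M). The function name and the 8-nt tag length are hypothetical choices for illustration.

```python
import math


def estimate_molecules(distinct_tags, tag_length):
    """Estimate the number of labeled molecules from distinct random tags.

    Assumes tags are drawn uniformly from 4**tag_length possibilities
    (an illustrative assumption, not a parameter taken from the abstract).
    """
    pool = 4 ** tag_length
    if not 0 <= distinct_tags < pool:
        raise ValueError("distinct_tags must be in [0, pool)")
    # Invert k = M * (1 - exp(-N / M)) for N; log1p keeps precision for small k / M.
    return -pool * math.log1p(-distinct_tags / pool)


# Hypothetical example: 5000 distinct tags observed with an 8-nt random tag.
estimate = estimate_molecules(5000, 8)
```

The estimate is slightly larger than the raw count of distinct tags because some molecules are expected to share a tag by chance; the correction grows as the observed count approaches the size of the tag pool.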


2000 ◽  
Vol 46 (11) ◽  
pp. 1082-1086 ◽  
Author(s):  
Joanne B Messick ◽  
Geoffrey Smith ◽  
Linda Berent ◽  
Sandra Cooper

The genome size of Eperythrozoon suis, an unculturable haemotropic mycoplasma, was estimated using pulsed-field gel electrophoresis (PFGE). Gamma irradiation was used to introduce one (on average) double-strand break in the E. suis Illinois chromosome. Restriction enzymes that cut infrequently were also used to analyze genome size. The size estimate for the full-length genome was 745 kilobases (kb), whereas the size estimates based on the summation of restriction fragments ranged from 730 to 770 kb. The 16S rRNA gene was located on the 120-kb MluI fragment, 128-kb NruI fragment, 25-kb SacII fragment, and 217-kb SalI fragment by Southern blotting. Key words: Eperythrozoon suis, 16S rRNA, Mycoplasma pneumoniae group, pulsed-field gel electrophoresis, genome size.


PLoS ONE ◽  
2012 ◽  
Vol 7 (4) ◽  
pp. e35647 ◽  
Author(s):  
Josselin Bodilis ◽  
Sandrine Nsigue-Meilo ◽  
Ludovic Besaury ◽  
Laurent Quillet

2021 ◽  
Author(s):  
Soohyun Maeng ◽  
Yuna Park ◽  
Tuvshinzaya Damdintogtokh ◽  
Hyejin Oh ◽  
Minji Bang ◽  
...  

Abstract: Two novel Gram-staining-negative bacterial strains, BT553T and BT552T, were isolated from soil collected in Gyeonggi province, Korea. Phylogenetic analysis using 16S rRNA gene sequences revealed that strains BT553T and BT552T both belong to a distinct lineage in the genus Sphingomonas (family Sphingomonadaceae, order Sphingomonadales, class Alphaproteobacteria). Strain BT553T was closely related to Sphingomonas melonis DAPP-PG 224T (98.1% 16S rRNA gene similarity) and Sphingomonas aquatilis JSS7T (98.1%). Strain BT553T was closely related to Sphingomonas melonis DAPP-PG 224T (98.2%) and Sphingomonas aquatilis JSS7T (98.1%). The genome size of strain BT553T was 3,941,714 bp. Bacterial growth was observed at 10°C–30°C (optimum 25°C), at pH 5.0–9.0 (optimum pH 7.0) on R2A agar, and in the presence of up to 2% NaCl. The genome size of strain BT552T was 4,035,561 bp. Bacterial growth was observed at 10°C–30°C (optimum 25°C), at pH 5.0–9.0 (optimum pH 7.0) on R2A agar, and in the presence of up to 2% NaCl. The major cellular fatty acids of strains BT553T and BT552T were Summed Feature 3 (16:1 ω 6c / 16:1 ω 7c), Summed Feature 8 (18:1 ω 7c / 18:1 ω 6c), and 14:0. In addition, their predominant respiratory quinone was Q-10. The major polar lipids of strain BT553T were identified as diphosphatidylglycerol (DPG), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), phosphatidylcholine (PC), and sphingoglycolipid (SL). The major polar lipids of strain BT552T were identified as diphosphatidylglycerol (DPG), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), phosphatidylcholine (PC), phospholipid (PL), and sphingoglycolipid (SL). Based on the biochemical, chemotaxonomic, and phylogenetic analyses, strains BT553T and BT552T represent novel bacterial species within the genus Sphingomonas. The type strain of Sphingomonas negativus is BT553T (= KCTC 82095T = NBRC XXXXT) and the type strain of Sphingomonas gyeonggiense is BT552T (= KCTC 82094T = NBRC XXXXT).


2020 ◽  
Vol 70 (3) ◽  
pp. 1630-1638 ◽  
Author(s):  
Alexandra Pitt ◽  
Ulrike Koll ◽  
Johanna Schmidt ◽  
Martin W. Hahn

Strain 33A1-SZDPT was isolated from a small creek located in Puch, Austria. Strain SP-Ram-0.45-NSY-1T was obtained from a small pond located in Schönramer Moor, Germany. 16S rRNA gene sequence similarities between the type strain of Silvanigrella aquatica, currently the only member of the family Silvanigrellaceae, and strains 33A1-SZDPT and SP-Ram-0.45-NSY-1T of 94.1 and 99.1 %, respectively, suggested affiliation of the two strains with this family. Phylogenetic reconstructions with 16S rRNA gene sequences and phylogenomic analyses with amino acid sequences obtained from 103 single-copy genes suggested that the strains represent a new genus and a new species in the case of strain 33A1-SZDPT (=JCM 32978T=DSM 107810T), and a new species within the genus Silvanigrella in the case of strain SP-Ram-0.45-NSY-1T (=JCM 32975T=DSM 107809T). Cells of strain 33A1-SZDPT were motile, pleomorphic, purple-pigmented on agar plates, putatively due to violacein, and showed variable pigmentation in liquid media. They grew chemoorganotrophically and aerobically and tolerated salt concentrations up to 1.2 % NaCl (v/w). The genome size of strain 33A1-SZDPT was 3.4 Mbp and the G+C content was 32.2 mol%. For this new genus and new species, we propose the name Fluviispira multicolorata gen. nov., sp. nov. Cells of strain SP-Ram-0.45-NSY-1T were motile, pleomorphic, red-pigmented and grew chemoorganotrophically and aerobically. They tolerated salt concentrations up to 1.1 % NaCl (v/w). The genome size of strain SP-Ram-0.45-NSY-1T was 3.9 Mbp and the G+C content was 29.3 mol%. For the new species within the genus Silvanigrella, we propose the name Silvanigrella paludirubra sp. nov.

