NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data

Author(s):  
Héctor Rodríguez-Pérez ◽  
Laura Ciuffreda ◽  
Carlos Flores

Summary: NanoCLUST is an analysis pipeline for the classification of amplicon-based full-length 16S rRNA nanopore reads. It is characterized by an unsupervised read clustering step, based on Uniform Manifold Approximation and Projection (UMAP), followed by the construction of a polished read and subsequent BLAST classification. Here, we demonstrate that NanoCLUST performs better than other state-of-the-art software in the characterization of two commercial mock communities, enabling accurate bacterial identification and abundance profile estimation at species-level resolution. Availability and implementation: Source code, test data and documentation of NanoCLUST are freely available at https://github.com/genomicsITER/NanoCLUST under the MIT License. Supplementary information: Supplementary data are available at Bioinformatics online.
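As a rough illustration of the clustering step described above, reads must first be turned into numeric vectors before a projection such as UMAP can be applied. The abstract does not say which features are used, so the normalized k-mer frequency profile below, the default k = 5, and the function name `kmer_profile` are all assumptions made for this sketch. It is not the pipeline's own code.

```python
from collections import Counter
from itertools import product


def kmer_profile(read: str, k: int = 5) -> list:
    """Return a normalized k-mer frequency vector for one read (A/C/G/T only).

    Assumption: k-mer frequencies are used as the per-read features; the
    abstract does not state the actual feature choice.
    """
    read = read.upper()
    # Count every overlapping window of length k in the read.
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    # Fixed lexicographic ordering of all 4**k k-mers keeps vectors comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing N or other symbols are not in `kmers` and are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Each read then maps to a vector of length 4**k. A matrix of such vectors could be passed to a dimensionality-reduction and clustering library of the reader's choice.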


2018 ◽  
Author(s):  
Arghavan Bahadorinejad ◽  
Ivan Ivanov ◽  
Johanna W Lampe ◽  
Meredith AJ Hullar ◽  
Robert S Chapkin ◽  
...  

Abstract: We propose a Bayesian method for the classification of 16S rRNA metagenomic profiles of bacterial abundance, by introducing a Poisson-Dirichlet-Multinomial hierarchical model for the sequencing data, constructing a prior distribution from sample data, calculating the posterior distribution in closed form, and deriving an Optimal Bayesian Classifier (OBC). The proposed algorithm is compared to state-of-the-art classification methods for 16S rRNA metagenomic data, including Random Forests and the phylogeny-based Metaphyl algorithm, for varying sample size, classification difficulty, and dimensionality (number of OTUs), using both synthetic and real metagenomic data sets. The results demonstrate that the proposed OBC method, with either noninformative or constructed priors, is competitive with or superior to the other methods. In particular, when the ratio of sample size to dimensionality is small, the proposed method can vastly outperform the others. Author summary: Recent studies have highlighted the interplay between host genetics, gut microbes, and colorectal tumor initiation/progression. The characterization of microbial communities using metagenomic profiling has therefore received renewed interest. In this paper, we propose a method for classification, i.e., prediction of different outcomes, based on 16S rRNA metagenomic data. The proposed method employs a Bayesian approach, which is suitable for data sets with a small ratio of the number of available instances to the dimensionality. Results using both synthetic and real metagenomic data show that the proposed method can outperform other state-of-the-art metagenomic classification algorithms.
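To make the idea of a closed-form posterior concrete, the sketch below uses a deliberately simplified Dirichlet-multinomial model rather than the paper's full Poisson-Dirichlet-Multinomial hierarchy. It assumes each class has a single vector of OTU proportions with a symmetric Dirichlet prior, that training counts within a class can be pooled, and that class priors are equal. The function names and the `prior_alpha` parameter are hypothetical.

```python
import math


def log_predictive(x, alpha_post):
    """Log Dirichlet-multinomial probability of counts x given Dirichlet parameters."""
    n = sum(x)
    a_total = sum(alpha_post)
    # Multinomial coefficient: log(n!) - sum_j log(x_j!).
    log_p = math.lgamma(n + 1) - sum(math.lgamma(xj + 1) for xj in x)
    # Ratio of Dirichlet normalizers after integrating out the OTU proportions.
    log_p += math.lgamma(a_total) - math.lgamma(a_total + n)
    log_p += sum(math.lgamma(a + xj) - math.lgamma(a) for a, xj in zip(alpha_post, x))
    return log_p


def classify(x, train_counts_by_class, prior_alpha=1.0):
    """Return the best-scoring class label and all per-class log scores.

    Assumption: equal class priors, so the predictive likelihood alone decides.
    """
    scores = {}
    for label, samples in train_counts_by_class.items():
        # Conjugate update: prior pseudo-counts plus pooled training counts per OTU.
        pooled = [sum(col) for col in zip(*samples)]
        alpha_post = [prior_alpha + c for c in pooled]
        scores[label] = log_predictive(x, alpha_post)
    return max(scores, key=scores.get), scores
```

Because the Dirichlet prior is conjugate to the multinomial, no sampling is needed: the score of a new profile is a sum of log-gamma terms. This is one reason such models remain usable when there are few samples relative to the number of OTUs.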


2021 ◽  
Author(s):  
Andrew E. Schriefer ◽  
Brajendra Kumar ◽  
Avihai Zolty ◽  
Preetam R ◽  
Adam Didier ◽  
...  

The M-CAMP™ (Microbiome Computational Analysis for Multiomic Profiling) Cloud Platform was designed to provide users with an easy-to-use web interface to access best-in-class microbiome analysis tools. This interface allows bench scientists to conduct bioinformatic analysis on their samples and then download publication-ready graphics and reports. The core pipeline of the platform is the 16S-seq taxonomic classification algorithm, which provides species-level classification of Illumina 16S sequencing data. This algorithm uses a novel approach combining alignment and k-mer-based taxonomic classification methodologies to produce a highly accurate and comprehensive profile. Additionally, a comprehensive proprietary database combining reference sequences from multiple sources was curated and contains 18056 unique V3-V4 sequences covering 11527 species. The M-CAMP™ 16S taxonomic classification algorithm was validated on 52 sequencing samples from both public and in-house standard sample mixtures with known fractions. Compared to current popular public classification algorithms, our classification algorithm provides the most accurate species-level classification of 16S rRNA sequencing data.


mBio ◽  
2012 ◽  
Vol 3 (5) ◽  
Author(s):  
Dea Shahinas ◽  
Michael Silverman ◽  
Taylor Sittler ◽  
Charles Chiu ◽  
Peter Kim ◽  
...  

ABSTRACT Fecal microbiome transplantation by low-volume enema is an effective, safe, and inexpensive alternative to antibiotic therapy for patients with chronic relapsing Clostridium difficile infection (CDI). We explored the microbial diversity of pre- and posttransplant stool specimens from CDI patients (n = 6) using deep sequencing of the 16S rRNA gene. While interindividual variability in microbiota change occurs with fecal transplantation and vancomycin exposure, in this pilot study we note that clinical cure of CDI is associated with an increase in diversity and richness. Genus- and species-level analysis may reveal a cocktail of microorganisms or products thereof that will ultimately be used as a probiotic to treat CDI. IMPORTANCE Antibiotic-associated diarrhea (AAD) due to Clostridium difficile is a widespread phenomenon in hospitals today. Despite the use of antibiotics, up to 30% of patients are unable to clear the infection and suffer recurrent bouts of diarrheal disease. As a result, clinicians have resorted to fecal microbiome transplantation (FT). Donor stool for this type of therapy is typically obtained from a spouse or close relative and thoroughly tested for various pathogenic microorganisms prior to infusion. Anecdotal reports suggest a very high success rate of FT in patients who fail antibiotic treatment (>90%). We used deep-sequencing technology to explore the human microbial diversity in patients with Clostridium difficile infection (CDI) disease after FT. Genus- and species-level analysis revealed a cocktail of microorganisms in the Bacteroidetes and Firmicutes phyla that may ultimately be used as a probiotic to treat CDI.
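The abstract reports changes in diversity and richness without naming the indices used. As a minimal sketch, assuming richness means the number of observed taxa and diversity is measured with the Shannon index, both can be computed from a vector of per-taxon read counts as follows. The function names are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over observed taxa."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

Under these definitions, a sample dominated by one taxon scores near zero, while a sample whose reads are spread evenly over many taxa scores higher on both measures.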


2018 ◽  
Author(s):  
Anna Cusco ◽  
Carlotta Catozzi ◽  
Joaquim Vines ◽  
Armand Sanchez ◽  
Olga Francino

Background: Profiling the microbiome of low-biomass samples is challenging for metagenomics since these samples are prone to contain DNA from other sources, such as the host or the environment. The usual approach is sequencing specific hypervariable regions of the 16S rRNA gene, which fails to assign taxonomy at the genus and species level. Here, we aim to assess long-amplicon PCR-based approaches for assigning taxonomy at the genus and species level. We use Nanopore sequencing with two different markers: full-length 16S rRNA (~1,500 bp) and the whole rrn operon (16S rRNA gene - ITS - 23S rRNA gene; 4,500 bp). Methods: We sequenced a clinical isolate of Staphylococcus pseudintermedius, two mock communities (HM-783D, BEI Resources; D6306, ZymoBIOMICS) and two pools of low-biomass samples (dog skin). Nanopore sequencing was performed on MinION (Oxford Nanopore Technologies) using the 1D PCR barcoding kit. Sequences were pre-processed, and data were analyzed using the WIMP workflow on EPI2ME (ONT) or Minimap2 software with the rrn database. Results: Full-length 16S rRNA and the rrn operon retrieved the microbiota composition from the bacterial isolate, the mock communities and the complex skin samples, even at the genus and species level. For the Staphylococcus pseudintermedius isolate, when using EPI2ME, the amplicons were assigned to the correct bacterial species in ~98% of the cases with the rrn operon as the marker, and in ~68% of the cases with the 16S rRNA gene. In both skin microbiota samples, we detected many species with an environmental origin. In chin skin, we found different Pseudomonas species in high abundance, whereas in dorsal skin there were more taxa with lower abundances. Conclusions: Both full-length 16S rRNA and the rrn operon retrieved the microbiota composition of simple and complex microbial communities, even from low-biomass samples such as dog skin. For an increased resolution at the species level, the rrn operon would be the best choice.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yoshiyuki Matsuo ◽  
Shinnosuke Komiya ◽  
Yoshiaki Yasumizu ◽  
Yuki Yasuoka ◽  
Katsura Mizushima ◽  
...  

Background: Species-level genetic characterization of complex bacterial communities has important clinical applications in both diagnosis and treatment. Amplicon sequencing of the 16S ribosomal RNA (rRNA) gene has proven to be a powerful strategy for the taxonomic classification of bacteria. This study aims to improve the method for full-length 16S rRNA gene analysis using the nanopore long-read sequencer MinION™. We compared it to the conventional short-read sequencing method in both a mock bacterial community and human fecal samples. Results: We modified our existing protocol for full-length 16S rRNA gene amplicon sequencing by MinION™. A new strategy for library construction with an optimized primer set overcame PCR-associated bias and enabled taxonomic classification across a broad range of bacterial species. We compared the performance of full-length and short-read 16S rRNA gene amplicon sequencing for the characterization of human gut microbiota with a complex bacterial composition. The relative abundance of dominant bacterial genera was highly similar between full-length and short-read sequencing. At the species level, MinION™ long-read sequencing had better resolution for discriminating between members of particular taxa such as Bifidobacterium, allowing an accurate representation of the sample bacterial composition. Conclusions: Our present microbiome study, comparing the discriminatory power of full-length and short-read sequencing, clearly illustrated the analytical advantage of sequencing the full-length 16S rRNA gene.


Author(s):  
Bo Zhang ◽  
Matthew Brock ◽  
Carlos Arana ◽  
Chaitanya Dende ◽  
Nicolai Stanislas van Oers ◽  
...  

Bead-beating within a DNA extraction protocol is critical for complete microbial cell lysis and accurate assessment of the abundance and composition of the microbiome. While the impact of bead-beating on the recovery of OTUs at the phylum and class level has been studied, its influence on species-level microbiome recovery is not clear. Recent advances in sequencing technology have allowed species-level resolution of the microbiome using full-length 16S rRNA gene sequencing instead of smaller amplicons that only capture a few hypervariable regions of the gene. We sequenced the v3-v4 hypervariable region as well as the full-length 16S rRNA gene in mouse and human stool samples and discovered major clusters of gut bacteria that exhibit different levels of sensitivity to bead-beating treatment. Full-length 16S rRNA gene sequencing unraveled vast species diversity in the mouse and human gut microbiome and enabled characterization of several OTUs that were unclassified in the amplicon data. Many species of major gut commensals such as Bacteroides, Lactobacillus, Blautia, Clostridium, Escherichia, Roseburia, Helicobacter, and Ruminococcus were identified. Interestingly, v3-v4 amplicon data classified about 50% of Ruminococcus reads as Ruminococcus gnavus, which showed maximum abundance in a sample beaten for 9 min. However, the remaining 50% of reads could not be assigned to any species. Full-length 16S rRNA gene sequencing data showed that the majority of the unclassified reads were Ruminococcus albus, which, unlike R. gnavus, showed maximum recovery in the unbeaten sample. Furthermore, we found that Blautia hominis and Streptococcus parasanguinis differed in their sensitivity to bead-beating treatment from the rest of the species in these genera. Thus, the present study demonstrates species-level variations in sensitivity to bead-beating treatment that could only be resolved with full-length 16S rRNA sequencing.
This study identifies species of common gut commensals and potential pathogens that require minimum (0-1 min) or extensive (4-9 min) bead-beating for their maximal recovery.


F1000Research ◽  
2019 ◽  
Vol 7 ◽  
pp. 1755 ◽  
Author(s):  
Anna Cuscó ◽  
Carlotta Catozzi ◽  
Joaquim Viñes ◽  
Armand Sanchez ◽  
Olga Francino

Background: Profiling the microbiome of low-biomass samples is challenging for metagenomics since these samples are prone to contain DNA from other sources (e.g. host or environment). The usual approach is sequencing short regions of the 16S rRNA gene, which fails to assign taxonomy at the genus and species level. To achieve an increased taxonomic resolution, we aim to develop long-amplicon PCR-based approaches using Nanopore sequencing. We assessed two different genetic markers: the full-length 16S rRNA (~1,500 bp) and the 16S-ITS-23S region from the rrn operon (4,300 bp). Methods: We sequenced a clinical isolate of Staphylococcus pseudintermedius, two mock communities and two pools of low-biomass samples (dog skin). Nanopore sequencing was performed on MinION™ using the 1D PCR barcoding kit. Sequences were pre-processed, and data were analyzed using EPI2ME or Minimap2 with the rrn database. Consensus sequences of the 16S-ITS-23S genetic marker were obtained using canu. Results: The full-length 16S rRNA and the 16S-ITS-23S region of the rrn operon were used to retrieve the microbiota composition of the samples at the genus and species level. For the Staphylococcus pseudintermedius isolate, when using EPI2ME, the amplicons were assigned to the correct bacterial species in ~98% of the cases with the 16S-ITS-23S genetic marker, and in ~68% of the cases with the 16S rRNA gene. Using mock communities, we found that the full-length 16S rRNA gene better represented the abundances of a microbial community, whereas 16S-ITS-23S obtained better resolution at the species level. Finally, we characterized low-biomass skin microbiota samples and detected species with an environmental origin. Conclusions: Both full-length 16S rRNA and the 16S-ITS-23S of the rrn operon retrieved the microbiota composition of simple and complex microbial communities, even from low-biomass samples such as dog skin. For an increased resolution at the species level, targeting the 16S-ITS-23S of the rrn operon would be the best choice.


Gigabyte ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Xiaohuan Sun ◽  
Yue-Hua Hu ◽  
Jingjing Wang ◽  
Chao Fang ◽  
Jiguang Li ◽  
...  

Metabarcoding is a widely used method for fast characterization of microbial communities in complex environmental samples. However, the selection of sequencing platform can have a noticeable effect on the estimated community composition. Here, we evaluated the metabarcoding performance of a DNBSEQ-G400 sequencer developed by MGI Tech using 16S and internal transcribed spacer (ITS) markers to investigate bacterial and fungal mock communities, as well as the ITS2 marker to investigate the fungal community of 1144 soil samples, with additional technical replicates. We show that highly accurate sequencing of bacterial and fungal communities is achievable using DNBSEQ-G400. Measures of diversity and correlation from soil metabarcoding showed that the results correlated highly between different machines of the same model, as well as between different sequencing modes (single-end 400 bp and paired-end 200 bp). Moderate but significant differences were observed between results produced with different sequencing platforms (DNBSEQ-G400 and MiSeq); however, the largest differences can be caused by selecting different primer pairs for PCR amplification of taxonomic markers. These differences suggest that care is needed when jointly analyzing metabarcoding data from different experiments. This study demonstrated the high performance and accuracy of DNBSEQ-G400 for short-read metabarcoding of microbial communities. Our study also produced datasets to allow further investigation of microbial diversity.

