Metagenome-validated Parallel Amplicon Sequencing and Text Mining-based Annotations for Simultaneous Profiling of Bacteria and Fungi: Vaginal Microbiome and Mycobiota in Healthy Women

Author(s):  
Seppo Virtanen ◽  
Schahzad Saqib ◽  
Tinja Kanerva ◽  
Pekka Nieminen ◽  
Ilkka Kalliala ◽  
...  

Abstract Background: Amplicon sequencing of kingdom-specific tags such as the 16S rRNA gene for bacteria and the internal transcribed spacer (ITS) region for fungi is widely used for investigating microbial populations. So far, most human studies have focused on bacteria, while studies on host-associated fungi in health and disease have only recently started to accumulate. To enable cost-effective parallel analysis of bacterial and fungal communities in human and environmental samples, we developed a method where 16S rRNA gene and ITS-1 amplicons were pooled together for a single Illumina MiSeq or HiSeq run and analysed after primer-based segregation. Taxonomic assignments were performed with BLAST in combination with an iterative text-extraction-based filtration approach, which uses extensive literature records from public databases to select the most probable hits, which were further validated by shotgun metagenomic sequencing. Results: Using 50 vaginal samples, we show that the combined run provides comparable results on bacterial composition and diversity to conventional 16S rRNA gene amplicon sequencing. The text-extraction-based taxonomic assignment tool provided ecosystem-specific annotations that were confirmed by Metagenomic Phylogenetic Analysis (MetaPhlAn). The metagenome analysis revealed distinct functional differences between the bacterial community types, while fungi were undetected despite being identified in all samples based on ITS amplicons. Co-abundance analysis of bacteria and fungi did not show strong between-kingdom correlations within the vaginal ecosystem of healthy women. Conclusion: Combined amplicon sequencing for bacteria and fungi provides a simple and cost-effective method for simultaneous analysis of microbiota and mycobiota within the same samples. 
The text-extraction-based annotation tool facilitates the characterization and interpretation of defined microbial communities from the rapidly accumulating sequencing data and metadata readily available through public databases.
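
The primer-based segregation step mentioned in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the primers, so the two sequences below are commonly used 16S (341F) and ITS (ITS1F) forward primers chosen as stand-ins, and the function names are hypothetical. Reads are assumed to be uppercase and to start with the primer.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Stand-in primers (assumption): not taken from the study.
PRIMERS = {
    "16S": "CCTACGGGNGGCWGCAG",
    "ITS1": "CTTGGTCATTTAGAGGAAGTAA",
}


def starts_with_primer(read, primer):
    """Return True if the read begins with the (possibly degenerate) primer."""
    if len(read) < len(primer):
        return False
    return all(base in IUPAC[code] for base, code in zip(read, primer))


def segregate(reads, primers=PRIMERS):
    """Bin pooled reads by the first primer that matches their 5' end."""
    bins = {name: [] for name in primers}
    bins["unassigned"] = []
    for read in reads:
        for name, primer in primers.items():
            if starts_with_primer(read, primer):
                bins[name].append(read)
                break
        else:
            # No primer matched; keep the read aside rather than guessing.
            bins["unassigned"].append(read)
    return bins
```

A real pipeline would typically also tolerate mismatches, check reverse primers, and trim the primer before downstream analysis; this sketch only shows the exact-prefix binning idea.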

2018 ◽  
Author(s):  
Chiranjit Mukherjee ◽  
Clifford J. Beall ◽  
Ann L. Griffen ◽  
Eugene J. Leys

Abstract Background: Sequencing of the 16S rRNA gene has been the standard for studying the composition of microbial communities. While it allows identification of bacteria at the level of species, it does not usually provide sufficient information to resolve at the sub-species level. Species-level resolution is not adequate for studies of transmission or stability, or for exploring subspecies variation in disease association. Current approaches using whole metagenome shotgun sequencing require very high coverage that can be cost-prohibitive and computationally challenging for diverse communities. Thus there is a need for high-resolution, yet cost-effective, high-throughput methods for characterizing microbial communities. Results: Significant improvement in resolution for amplicon-based bacterial community analysis was achieved by combining amplicon sequencing of a high-diversity marker gene, the ribosomal operon ISR, with a probabilistic error modeling algorithm, DADA2. The resolving power of this new approach was compared to that of both standard and high-resolution 16S-based approaches using a set of longitudinal subgingival plaque samples. The ISR strategy achieved a 5.2-fold increase in community richness compared to reference-based 16S rRNA gene analysis, and showed 100% accuracy in predicting the correct source of a clinical sample. Individuals’ microbial communities were highly personalized, and although they exhibited some drift in membership and levels over time, that difference was always smaller than the differences between any two subjects, even after one year. The construction of an ISR database from publicly available genomic sequences allowed us to explore genomic variation within species, resulting in the identification of multiple variants of the ISR for most species. Conclusions: The ISR approach resulted in significantly improved resolution of communities, and revealed a highly personalized, stable human oral microbiota. 
Multiple ISR types were observed for all species examined, demonstrating a high level of subspecies variation in the oral microbiota. The approach is high-throughput, high-resolution yet cost-effective, allowing subspecies-level community fingerprinting at a cost comparable to that of 16S rRNA gene amplicon sequencing. It will be useful for a range of applications that require high-resolution identification of organisms, including microbial tracking, community fingerprinting, and potentially for identification of virulence-associated strains.


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Janis R. Bedarf ◽  
Naiara Beraza ◽  
Hassan Khazneh ◽  
Ezgi Özkurt ◽  
David Baker ◽  
...  

Abstract Background Recent studies suggested the existence of (poly-)microbial infections in human brains. These have been described either as putative pathogens linked to the neuro-inflammatory changes seen in Parkinson’s disease (PD) and Alzheimer’s disease (AD) or as a “brain microbiome” in the context of healthy patients’ brain samples. Methods Using 16S rRNA gene sequencing, we tested the hypothesis that there is a bacterial brain microbiome. We evaluated brain samples from healthy human subjects and individuals suffering from PD (olfactory bulb and pre-frontal cortex), as well as murine brains. In line with state-of-the-art recommendations, we included several negative and positive controls in our analysis and estimated total bacterial biomass by 16S rRNA gene qPCR. Results Amplicon sequencing did detect bacterial signals in both human and murine samples, but estimated bacterial biomass was extremely low in all samples. Stringent reanalyses implied bacterial signals being explained by a combination of exogenous DNA contamination (54.8%) and false positive amplification of host DNA (34.2%, off-target amplicons). Several seemingly brain-enriched microbes in our dataset turned out to be false-positive signals upon closer examination. We identified off-target amplification as a major confounding factor in low-bacterial/high-host-DNA scenarios. These amplified human or mouse DNA sequences were clustered and falsely assigned to bacterial taxa in the majority of tested amplicon sequencing pipelines. Off-target amplicons seemed to be related to the tissue’s sterility and could also be found in independent brain 16S rRNA gene sequences. Conclusions Taxonomic signals obtained from (extremely) low biomass samples by 16S rRNA gene sequencing must be scrutinized closely to exclude the possibility of off-target amplifications, amplicons that can only appear enriched in biological samples, but are sometimes assigned to bacterial taxa. 
Sequences must be explicitly matched against any possible background genomes present in large quantities (i.e., the host genome). Using close scrutiny in our approach, we find no evidence supporting the hypothetical presence of either a brain microbiome or a bacterial infection in PD brains.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Francesco Durazzi ◽  
Claudia Sala ◽  
Gastone Castellani ◽  
Gerardo Manfreda ◽  
Daniel Remondini ◽  
...  

Abstract In this paper we compared taxonomic results obtained by metataxonomics (16S rRNA gene sequencing) and metagenomics (whole shotgun metagenomic sequencing) to investigate their reliability for bacteria profiling, studying the chicken gut as a model system. The experimental conditions included two compartments of gastrointestinal tracts and two sampling times. We compared the relative abundance distributions obtained with the two sequencing strategies and then tested their capability to distinguish the experimental conditions. The results showed that 16S rRNA gene sequencing detects only part of the gut microbiota community revealed by shotgun sequencing. Specifically, when a sufficient number of reads is available, shotgun sequencing has more power to identify less abundant taxa than 16S sequencing. Finally, we showed that the less abundant genera detected only by shotgun sequencing are biologically meaningful, being able to discriminate between the experimental conditions as much as the more abundant genera detected by both sequencing strategies.


2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Sandra Reitmeier ◽  
Thomas C. A. Hitch ◽  
Nicole Treichel ◽  
Nikolaos Fikas ◽  
Bela Hausmann ◽  
...  

Abstract 16S rRNA gene amplicon sequencing is a popular approach for studying microbiomes. However, some basic concepts have still not been investigated comprehensively. We studied the occurrence of spurious sequences using defined microbial communities based on data either from the literature or generated in three sequencing facilities and analyzed via both operational taxonomic units (OTUs) and amplicon sequence variants (ASVs) approaches. OTU clustering and singleton removal, a commonly used approach, delivered approximately 50% (mock communities) to 80% (gnotobiotic mice) spurious taxa. The fraction of spurious taxa was generally lower based on ASV analysis, but varied depending on the gene region targeted and the barcoding system used. A relative abundance of 0.25% was found as an effective threshold below which the analysis of spurious taxa can be prevented to a large extent in both OTU- and ASV-based analysis approaches. Using this cutoff improved the reproducibility of analysis, i.e., variation in richness estimates was reduced by 38% compared with singleton filtering using six human fecal samples across seven sequencing runs. Beta-diversity analysis of human fecal communities was markedly affected by both the filtering strategy and the type of phylogenetic distances used for comparison, highlighting the importance of carefully analyzing data before drawing conclusions on microbiome changes. In summary, handling of artifact sequences during bioinformatic processing of 16S rRNA gene amplicon data requires careful attention to avoid the generation of misleading findings. We propose the concept of effective richness to facilitate the comparison of alpha-diversity across studies.
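
The 0.25% relative-abundance cutoff described in the abstract above can be sketched as a simple per-sample filter. This is a minimal illustration, not the authors' implementation: the function name, the dictionary-based input, and the example counts are all assumptions, and the cutoff is applied within a single sample.

```python
def filter_by_relative_abundance(counts, threshold=0.0025):
    """Keep taxa whose share of the sample's reads is at least `threshold`.

    `counts` maps a taxon (OTU or ASV identifier) to its read count in one
    sample. The default threshold of 0.0025 corresponds to 0.25%.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n for taxon, n in counts.items() if n / total >= threshold}


# Hypothetical sample with 10,000 reads in total.
sample = {"taxon_A": 9000, "taxon_B": 980, "taxon_C": 20}
kept = filter_by_relative_abundance(sample)
# taxon_C holds 0.2% of the reads and falls below the 0.25% cutoff.
print(sorted(kept))       # ['taxon_A', 'taxon_B']
print(len(kept))          # number of taxa remaining after filtering
```

Unlike singleton removal, which uses an absolute count of one read, this cutoff scales with sequencing depth, so the same taxon is treated consistently across samples with different read totals.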


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yusuke Okazaki ◽  
Shohei Fujinaga ◽  
Michaela M. Salcher ◽  
Cristiana Callieri ◽  
Atsushi Tanaka ◽  
...  

Abstract Background Freshwater ecosystems are inhabited by members of cosmopolitan bacterioplankton lineages despite the disconnected nature of these habitats. The lineages are delineated based on > 97% 16S rRNA gene sequence similarity, but their intra-lineage microdiversity and phylogeography, which are key to understanding the eco-evolutional processes behind their ubiquity, remain unresolved. Here, we applied long-read amplicon sequencing targeting nearly full-length 16S rRNA genes and the adjacent ribosomal internal transcribed spacer sequences to reveal the intra-lineage diversities of pelagic bacterioplankton assemblages in 11 deep freshwater lakes in Japan and Europe. Results Our single nucleotide-resolved analysis, which was validated using shotgun metagenomic sequencing, uncovered 7–101 amplicon sequence variants for each of the 11 predominant bacterial lineages and demonstrated sympatric, allopatric, and temporal microdiversities that could not be resolved through conventional approaches. Clusters of samples with similar intra-lineage population compositions were identified, which consistently supported genetic isolation between Japan and Europe. At a regional scale (up to hundreds of kilometers), dispersal between lakes was unlikely to be a limiting factor, and environmental factors or genetic drift were potential determinants of population composition. The extent of microdiversification varied among lineages, suggesting that highly diversified lineages (e.g., Iluma-A2 and acI-A1) achieve their ubiquity by containing a consortium of genotypes specific to each habitat, while less diversified lineages (e.g., CL500-11) may be ubiquitous due to a small number of widespread genotypes. The lowest extent of intra-lineage diversification was observed among the dominant hypolimnion-specific lineage (CL500-11), suggesting that their dispersal among lakes is not limited despite the hypolimnion being a more isolated habitat than the epilimnion. 
Conclusions Our novel approach complemented the limited resolution of short-read amplicon sequencing and limited sensitivity of the metagenome assembly-based approach, and highlighted the complex ecological processes underlying the ubiquity of freshwater bacterioplankton lineages. To fully exploit the performance of the method, its relatively low read throughput is the major bottleneck to be overcome in the future.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jun-ichi Kanatani ◽  
Masanori Watahiki ◽  
Keiko Kimata ◽  
Tomoko Kato ◽  
Kaoru Uchida ◽  
...  

Abstract Background Legionellosis is caused by the inhalation of aerosolized water contaminated with Legionella bacteria. In this study, we investigated the prevalence of Legionella species in aerosols collected from outdoor sites near asphalt roads, bathrooms in public bath facilities, and other indoor sites, such as buildings and private homes, using amoebic co-culture, quantitative PCR, and 16S rRNA gene amplicon sequencing. Results Legionella species were not detected by amoebic co-culture. However, Legionella DNA was detected in 114/151 (75.5%) air samples collected near roads (geometric mean ± standard deviation: 1.80 ± 0.52 log10 copies/m³), which was comparable to the numbers collected from bathrooms [15/21 (71.4%), 1.82 ± 0.50] but higher than those collected from other indoor sites [11/30 (36.7%), 0.88 ± 0.56] (P < 0.05). The amount of Legionella DNA was correlated with the monthly total precipitation (r = 0.56, P < 0.01). It was also directly and inversely correlated with the daily total precipitation for seven days (r = 0.21, P = 0.01) and one day (r = −0.29, P < 0.01) before the sampling day, respectively. 16S rRNA gene amplicon sequencing revealed that Legionella species were detected in 9/30 samples collected near roads (mean proportion of reads, 0.11%). At the species level, L. pneumophila was detected in 2/30 samples collected near roads (the proportion of reads, 0.09 and 0.11% of the total number of reads in each positive sample). The three most abundant bacterial genera in the samples collected near roads were Sphingomonas, Streptococcus, and Methylobacterium (mean proportion of reads: 21.1%, 14.6%, and 1.6%, respectively). In addition, the bacterial diversity in the outdoor environment was comparable to that in indoor environments containing aerosol-generating features and higher than that in indoor environments without such features. 
Conclusions DNA from Legionella species was widely present in aerosols collected from outdoor sites near asphalt roads, especially during the rainy season. Our findings suggest that there may be a risk of exposure to Legionella species not only in bathrooms but also in the areas surrounding asphalt roads. Therefore, the possibility of contracting legionellosis in daily life should be considered.


2021 ◽  
Vol 12 ◽  
Author(s):  
Faten Ghodhbane-Gtari ◽  
Timothy D’Angelo ◽  
Abdellatif Gueddou ◽  
Sabrine Ghazouani ◽  
Maher Gtari ◽  
...  

Actinorhizal plants host mutualistic symbionts of the nitrogen-fixing actinobacterial genus Frankia within nodule structures formed on their roots. Several plant-growth-promoting bacteria have also been isolated from actinorhizal root nodules, but little is known about them. We were interested in investigating the in planta microbial community composition of actinorhizal root nodules using culture-independent techniques. To address this knowledge gap, 16S rRNA gene amplicon and shotgun metagenomic sequencing were performed on DNA from the nodules of Casuarina glauca. DNA was extracted from C. glauca nodules collected at three different sampling sites in Tunisia, along a gradient of aridity ranging from humid to arid. Sequencing libraries were prepared using Illumina NextEra technology and sequenced on the Illumina HiSeq 2500 platform. Genome bins extracted from the metagenome were taxonomically and functionally profiled. Community structure based on preliminary 16S rRNA gene amplicon data was analyzed via the QIIME pipeline. Reconstructed genomes comprised members of Frankia, Micromonospora, Bacillus, Paenibacillus, Phyllobacterium, and Afipia. Frankia dominated the nodule community at the humid sampling site, while the absolute and relative prevalence of Frankia decreased at the semi-arid and arid sampling locations. Actinorhizal plants harbor non-Frankia plant-growth-promoting bacteria similar to those of legumes and other plants. The data suggest that the prevalence of Frankia in the nodule community is influenced by environmental factors, with Frankia being less abundant in more arid environments.


Genes ◽  
2018 ◽  
Vol 9 (5) ◽  
pp. 231 ◽  
Author(s):  
Ekaterina Avershina ◽  
Inga Angell ◽  
Melanie Simpson ◽  
Ola Storrø ◽  
Torbjørn Øien ◽  
...  

2019 ◽  
Vol 47 (18) ◽  
pp. e103-e103 ◽  
Author(s):  
Benjamin J Callahan ◽  
Joan Wong ◽  
Cheryl Heiner ◽  
Steve Oh ◽  
Casey M Theriot ◽  
...  

Abstract Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use.

