Status of the Archaeal and Bacterial Census: an Update

mBio ◽  
2016 ◽  
Vol 7 (3) ◽  
Author(s):  
Patrick D. Schloss ◽  
Rene A. Girard ◽  
Thomas Martin ◽  
Joshua Edwards ◽  
J. Cameron Thrash

ABSTRACT A census is typically carried out for people across a range of geographical levels; however, microbial ecologists have implemented a molecular census of bacteria and archaea by sequencing their 16S rRNA genes. We assessed how well the census of full-length 16S rRNA gene sequences is proceeding in the context of recent advances in high-throughput sequencing technologies because full-length sequences are typically used as references for classification of the short sequences generated by newer technologies. Among the 1,411,234 and 53,546 full-length bacterial and archaeal sequences, 94.5% and 95.1% of the bacterial and archaeal sequences, respectively, belonged to operational taxonomic units (OTUs) that have been observed more than once. Although these metrics suggest that the census is approaching completion, only 29.2% of the bacterial and 38.5% of the archaeal OTUs have been observed more than once. Thus, there is still considerable diversity to be explored. Unfortunately, the rate of new full-length sequences has been declining, and new sequences are primarily being deposited by a small number of studies. Furthermore, sequences from soil and aquatic environments, which are known to be rich in bacterial diversity, represent only 7.8 and 16.5% of the census, while sequences associated with host-associated environments represent 55.0% of the census. Continued use of traditional approaches and new technologies such as single-cell genomics and short-read assembly is likely to improve our ability to sample rare OTUs if it is possible to overcome this sampling bias. The success of ongoing efforts to use short-read sequencing to characterize archaeal and bacterial communities requires that researchers strive to expand the depth and breadth of this census.
IMPORTANCE The biodiversity contained within the bacterial and archaeal domains dwarfs that of the eukaryotes, and the services these organisms provide to the biosphere are critical. Surprisingly, we have done a relatively poor job of formally tracking the quality of the biodiversity as represented in full-length 16S rRNA genes. By understanding how this census is proceeding, it is possible to suggest the best allocation of resources for advancing the census. We found that the ongoing effort has done an excellent job of sampling the most abundant organisms but struggles to sample the rarer organisms. Through the use of new sequencing technologies, we should be able to obtain full-length sequences from these rare organisms. Furthermore, we suggest that by allocating more resources to sampling environments known to have the greatest biodiversity, we will be able to make significant advances in our characterization of archaeal and bacterial diversity.
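The abstract reports two different "observed more than once" figures: the share of sequences that fall in such OTUs and the share of OTUs that are themselves observed more than once. The following is a minimal sketch of how the two can diverge, assuming only a list of OTU sizes (sequences per OTU). The function name `census_coverage` and the toy counts are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple


def census_coverage(otu_sizes: Iterable[int]) -> Tuple[float, float]:
    """Return (sequence-level coverage, OTU-level coverage).

    sequence-level: fraction of sequences in OTUs seen more than once
    OTU-level:      fraction of OTUs seen more than once
    """
    sizes = [n for n in otu_sizes if n > 0]  # ignore empty OTUs
    if not sizes:
        raise ValueError("need at least one non-empty OTU")

    repeated = [n for n in sizes if n > 1]   # OTUs observed more than once
    sequence_coverage = sum(repeated) / sum(sizes)
    otu_coverage = len(repeated) / len(sizes)
    return sequence_coverage, otu_coverage


# Toy census (hypothetical): two abundant OTUs and five singletons.
toy_otu_sizes = [90, 5, 1, 1, 1, 1, 1]
seq_cov, otu_cov = census_coverage(toy_otu_sizes)
print(f"sequences in repeated OTUs: {seq_cov:.1%}")
print(f"OTUs observed more than once: {otu_cov:.1%}")
```

In the toy data most sequences belong to well-sampled OTUs, yet most OTUs are singletons. That is the same qualitative pattern the abstract describes: high sequence-level coverage alongside low OTU-level coverage.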

2016 ◽  
Author(s):  
Patrick D Schloss ◽  
Rene Girard ◽  
Thomas Martin ◽  
Joshua Edwards ◽  
J. Cameron Thrash

A census is typically carried out for people at a national level; however, microbial ecologists have implemented a molecular census of bacteria and archaea by sequencing their 16S rRNA genes. We assessed how well the microbial census of full-length 16S rRNA gene sequences is proceeding in the context of recent advances in high-throughput sequencing technologies. Among the 1,411,234 and 53,546 full-length bacterial and archaeal sequences, 94.5% and 95.1% of the bacterial and archaeal sequences, respectively, belonged to operational taxonomic units (OTUs) that have been observed more than once. Although these metrics suggest that the census is approaching completion, only 29.2% of the bacterial and 38.5% of the archaeal OTUs have been observed more than once. Thus, there is still considerable microbial diversity to be explored. Unfortunately, the rate of new full-length sequences has been declining, and new sequences are primarily being deposited by a small number of studies. Furthermore, sequences from soil and aquatic environments, which are known to be rich in bacterial diversity, represent only 7.8 and 16.5% of the census, while sequences associated with zoonotic environments represent 55.0% of the census. Continued use of traditional approaches and new technologies such as single-cell genomics and short-read assembly is likely to improve our ability to sample rare OTUs if it is possible to overcome this sampling bias. The success of ongoing efforts to use short-read sequencing to characterize microbial communities requires that researchers strive to expand the depth and breadth of the microbial census.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2492 ◽  
Author(s):  
Catherine M. Burke ◽  
Aaron E. Darling

Background The bacterial 16S rRNA gene has historically been used in defining bacterial taxonomy and phylogeny. However, there are currently no high-throughput methods to sequence full-length 16S rRNA genes present in a sample with precision. Results We describe a method for sequencing near full-length 16S rRNA gene amplicons using the high-throughput Illumina MiSeq platform and test it using DNA from human skin swab samples. Proof of principle of the approach is demonstrated, with the generation of 1,604 sequences greater than 1,300 nt from a single Nano MiSeq run, with accuracy estimated to be 100-fold higher than standard Illumina reads. The reads were chimera filtered using information from a single-molecule dual tagging scheme that boosts the signal available for chimera detection. Conclusions This method could be scaled up to generate many thousands of sequences per MiSeq run and could be applied to other sequencing platforms. This has great potential for populating databases with high-quality, near full-length 16S rRNA gene sequences from under-represented taxa and environments and facilitates analyses of microbial communities at higher resolution.


2021 ◽  
Vol 12 ◽  
Author(s):  
Li Ma ◽  
Geng Wu ◽  
Jian Yang ◽  
Liuqin Huang ◽  
Dorji Phurbu ◽  
...  

Investigating the distribution of hydrogen-producing bacteria (HPB) is of great significance to understanding the source of biological hydrogen production in geothermal environments. Here, we explored the compositions of HPB populations in the sediments of hot springs from the Daggyai, Quzhuomu, Quseyongba, and Moluojiang geothermal zones on the Tibetan Plateau, with the use of Illumina MiSeq high-throughput sequencing of 16S rRNA genes and hydA genes. In the present study, the hydA genes were successfully amplified from the hot springs with a temperature of 46–87°C. The hydA gene phylogenetic analysis showed that the top three phyla of the HPB populations were Bacteroidetes (14.48%), Spirochaetes (14.12%), and Thermotogae (10.45%), while Proteobacteria were absent in the top 10 of the HPB populations, although Proteobacteria were dominant in the 16S rRNA gene sequences. Canonical correspondence analysis results indicate that the HPB community structure in the studied Tibetan hot springs was correlated with various environmental factors, such as temperature, pH, and elevation. The HPB community structure also showed a spatial distribution pattern; samples from the same area showed similar community structures. Furthermore, one HPB isolate affiliated with Firmicutes was obtained and demonstrated the capacity of hydrogen production. These results are important for us to understand the distribution and function of HPB in hot springs.


2018 ◽  
Author(s):  
Joshua P. Earl ◽  
Nithin D. Adappa ◽  
Jaroslaw Krol ◽  
Archana S. Bhat ◽  
Sergey Balashov ◽  
...  

Abstract Background Pan-bacterial 16S rRNA microbiome surveys performed with massively parallel DNA sequencing technologies have transformed community microbiological studies. Current 16S profiling methods, however, fail to provide sufficient taxonomic resolution and accuracy to adequately perform species-level associative studies for specific conditions. This is due to the amplification and sequencing of only short 16S rRNA gene regions, typically providing for only family- or genus-level taxonomy. Moreover, sequencing errors often inflate the number of taxa present. Pacific Biosciences’ (PacBio’s) long-read technology in particular suffers from high error rates per base. Herein we present a microbiome analysis pipeline that takes advantage of PacBio circular consensus sequencing (CCS) technology to sequence and error correct full-length bacterial 16S rRNA genes, which provides high-fidelity species-level microbiome data. Results Analysis of a mock community with 20 bacterial species demonstrated 100% specificity and sensitivity. Examination of a 250-plus species mock community demonstrated correct species-level classification of >90% of taxa, and relative abundances were accurately captured. The majority of the remaining taxa were demonstrated to be multiply, incorrectly, or incompletely classified. Using this methodology, we examined the microgeographic variation present among the microbiomes of six sinonasal sites, by both swab and biopsy, from the anterior nasal cavity to the sphenoid sinus from 12 subjects undergoing trans-sphenoidal hypophysectomy. We found greater variation among subjects than among sites within a subject, although significant within-individual differences were also observed. Propionibacterium acnes (recently renamed Cutibacterium acnes [1]) was the predominant species throughout, but was found at distinct relative abundances by site. Conclusions Our microbial composition analysis pipeline for single-molecule real-time 16S rRNA gene sequencing (MCSMRT, https://github.com/jpearl01/mcsmrt) overcomes deficits of standard marker gene based microbiome analyses by using CCS of entire 16S rRNA genes to provide increased taxonomic and phylogenetic resolution. Extensions of this approach to other marker genes could help refine taxonomic assignments of microbial species and improve reference databases, as well as strengthen the specificity of associations between microbial communities and dysbiotic states.


Atmosphere ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 802
Author(s):  
Hokyung Song ◽  
Ian Crawford ◽  
Jonathan Lloyd ◽  
Clare Robinson ◽  
Christopher Boothman ◽  
...  

Primary biological aerosols often include allergenic and pathogenic microorganisms posing potential risks to human health. Moreover, there are airborne plant and animal pathogens that may have ecological and economic impact. In this study, we used high-throughput sequencing techniques (Illumina MiSeq) targeting the 16S rRNA genes of bacteria and the 18S rRNA genes of eukaryotes to characterize airborne primary biological aerosols. We used a filtration system on the UK Facility for Airborne Atmospheric Measurements (FAAM) research aircraft to sample a range of primary biological aerosols across southern England, overflying surface measurement sites from Chilbolton to Weybourne. We identified 30 to 60 bacterial operational taxonomic units (OTUs) and 108 to 224 eukaryotic OTUs per sample. Moreover, 16S rRNA gene sequencing identified significant numbers of genera that have not been found in atmospheric samples previously or have only been described in a limited number of atmospheric field studies, which are rather old or published in local journals. These include the genera Gordonia, Lautropia, and Psychroglaciecola. Some of the bacterial genera found in this study include potential human pathogens, for example, Gordonia, Sphingomonas, Chryseobacterium, Morganella, Fusobacterium, and Streptococcus. 18S rRNA gene sequencing showed Cladosporium, a well-known allergen often found in the atmosphere, to be the major genus in all of the samples. There were also genetic signatures of potentially allergenic taxa, for example, Pleosporales, Phoma, and Brassicales. Although there was no significant clustering of bacterial and eukaryotic communities depending on the sampling location, we found meteorological factors explaining significant variations in the community composition. The findings in this study support the application of DNA-based sequencing technologies for atmospheric science studies in combination with complementary spectroscopic and microscopic techniques for improved identification of primary biological aerosols.


Microbiome ◽  
2020 ◽  
Vol 8 (1) ◽  
Author(s):  
Luyang Song ◽  
Kabin Xie

Abstract Background High-throughput sequencing of the bacterial 16S rRNA gene (16S-seq) is a useful and common method for studying bacterial community structures. However, contamination by 16S rRNA genes from the mitochondrion and plastid hinders sensitive bacterial 16S-seq in plant microbiota profiling, especially for some plant species such as rice. To date, efficiently mitigating such host contamination without a bias is challenging in 16S rRNA gene-based amplicon sequencing. Results We developed the Cas-16S-seq method to reduce abundant host contamination for plant microbiota profiling. This method utilizes the Cas9 nuclease and specific guide RNA (gRNA) to cut 16S rRNA targets during library construction, thereby removing host contamination in 16S-seq. We used rice as an example to validate the feasibility and effectiveness of Cas-16S-seq. We established a bioinformatics pipeline to design gRNAs that specifically target rice 16S rRNA genes without bacterial 16S rRNA off-targets. We compared the effectiveness of Cas-16S-seq with that of the commonly used 16S-seq method for artificially mixed 16S rRNA gene communities, paddy soil, rice root, and phyllosphere samples. The results showed that Cas-16S-seq substantially reduces the fraction of rice 16S rRNA gene sequences from 63.2 to 2.9% in root samples and from 99.4 to 11.6% in phyllosphere samples on average. Consequently, Cas-16S-seq detected more bacterial species than 16S-seq in plant samples. Importantly, when analyzing soil samples, Cas-16S-seq and 16S-seq showed almost identical bacterial communities, suggesting that Cas-16S-seq with the host-specific gRNAs that we designed has no off-target effects in rice microbiota profiling. Conclusion Our Cas-16S-seq can efficiently remove abundant host contamination without a bias for 16S rRNA gene-based amplicon sequencing, thereby enabling deeper bacterial community profiling with a low cost and high flexibility. Thus, we anticipate that this method would be a useful tool for plant microbiomics.
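To put the reported host-read percentages in terms of usable sequencing depth, the short calculation below converts them into the fold gain in non-host reads. It is only a back-of-the-envelope sketch: it assumes every read that is not rice 16S rRNA is usable for bacterial profiling, which the abstract does not state, and the function name `non_host_fold_gain` is hypothetical.

```python
def non_host_fold_gain(host_pct_before: float, host_pct_after: float) -> float:
    """Fold increase in the non-host read fraction after host depletion.

    Assumption (not from the paper): all non-host reads are usable.
    """
    non_host_before = 100.0 - host_pct_before
    non_host_after = 100.0 - host_pct_after
    if non_host_before <= 0:
        raise ValueError("no non-host reads before depletion")
    return non_host_after / non_host_before


# Host fractions reported in the abstract (percent of reads, averages).
root_gain = non_host_fold_gain(63.2, 2.9)            # root samples
phyllosphere_gain = non_host_fold_gain(99.4, 11.6)   # phyllosphere samples

print(f"root: {root_gain:.1f}x more non-host reads")
print(f"phyllosphere: {phyllosphere_gain:.0f}x more non-host reads")
```

Under that assumption the gain is modest for roots, where over a third of reads were already non-host, and very large for the phyllosphere, where fewer than 1% were.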


2021 ◽  
Author(s):  
Patrick D. Schloss

Abstract Amplicon sequencing variants (ASVs) have been proposed as an alternative to operational taxonomic units (OTUs) for analyzing microbial communities. ASVs have grown in popularity, in part, because of a desire to reflect a more refined level of taxonomy since they do not cluster sequences based on a distance-based threshold. However, ASVs and the use of overly narrow thresholds to identify OTUs increase the risk of splitting a single genome into separate clusters. To assess this risk, I analyzed the intragenomic variation of 16S rRNA genes from the bacterial genomes represented in an rrn copy number database, which contained 20,427 genomes from 5,972 species. As the number of copies of the 16S rRNA gene increased in a genome, the number of ASVs also increased. There was an average of 0.58 ASVs per copy of the 16S rRNA gene for full-length 16S rRNA genes. It was necessary to use a distance threshold of 5.25% to cluster full-length ASVs from the same genome into a single OTU with 95% confidence for genomes with 7 copies of the 16S rRNA gene, such as E. coli. This research highlights the risk of splitting a single bacterial genome into separate clusters when ASVs are used to analyze 16S rRNA gene sequence data. Although there is also a risk of clustering ASVs from different species into the same OTU when using broad distance thresholds, those risks are of less concern than artificially splitting a genome into separate ASVs and OTUs.
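The two quantities discussed in this abstract, ASVs per gene copy and the distance needed to merge all copies from one genome, can be illustrated on toy data. The sketch below assumes the copies are already aligned to equal length and uses a plain mismatch proportion as the distance. The study's actual alignment and clustering procedure is not described here, and the function names and sequences are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def asvs_per_copy(copies: Sequence[str]) -> float:
    """Number of distinct sequences divided by number of gene copies."""
    if not copies:
        raise ValueError("genome has no 16S rRNA gene copies")
    return len(set(copies)) / len(copies)


def max_pairwise_distance(aligned_copies: Sequence[str]) -> float:
    """Largest mismatch proportion between any two aligned copies.

    Copies separated by this distance can only share an OTU if the
    clustering threshold is at least this large.
    """
    worst = 0.0
    for a, b in combinations(aligned_copies, 2):
        if len(a) != len(b):
            raise ValueError("copies must be aligned to the same length")
        mismatches = sum(x != y for x, y in zip(a, b))
        worst = max(worst, mismatches / len(a))
    return worst


# Toy genome (hypothetical): four copies, three distinct variants.
toy_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(asvs_per_copy(toy_copies))          # 0.75
print(max_pairwise_distance(toy_copies))  # 0.2
```

In this toy genome an ASV-level analysis would report three variants from one organism, and only a threshold of at least 20% on these very short sequences would guarantee that they fall in one cluster.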


Biologia ◽  
2014 ◽  
Vol 69 (6) ◽  
Author(s):  
Jin Huang ◽  
Zhe Liu ◽  
Yong Li ◽  
Jian Wang

Abstract The bacterial diversity in saline-alkali ponds rearing common carp was investigated using the 16S rRNA gene clone library technique. Phylogenetic analysis of the most common and dominant sequences recovered indicated that these sequences fell into the following major lineages: Proteobacteria (α-, β-, γ-), Actinobacteria, Cyanobacteria, Planctomycetes, Fibrobacteres, Bacteroidetes, Chloroflexi, and unclassified bacteria. Sequence analysis showed that the bacterial diversity was abundant, and the sequences belonging to β-Proteobacteria, α-Proteobacteria and Actinobacteria were predominant. Most sequences in the saline-alkali rearing ponds exhibited low similarity with known bacterial 16S rRNA genes, suggesting that these sequences may represent novel bacteria. In addition, the majority of our sequences were most closely affiliated with sequences retrieved from inland waters of China. These results suggest that the saline-alkali ponds rearing common carp are specific ecologic niches and the distribution of the bacteria may be influenced by geographical factors. This study reports the bacterial diversity in saline-alkali ponds rearing common carp by the culture-independent technique for the first time; therefore, it provides important information for understanding the microbial ecology in saline-alkali rearing ponds and managing the microbial community composition to promote and maintain the health of aquaculture environments.


2013 ◽  
Vol 79 (19) ◽  
pp. 5962-5969 ◽  
Author(s):  
Dong-Lei Sun ◽  
Xuan Jiang ◽  
Qinglong L. Wu ◽  
Ning-Yi Zhou

ABSTRACT Ever since Carl Woese introduced the use of 16S rRNA genes for determining the phylogenetic relationships of prokaryotes, this method has been regarded as the “gold standard” in both microbial phylogeny and ecology studies. However, intragenomic heterogeneity within 16S rRNA genes has been reported in many investigations and is believed to bias the estimation of prokaryotic diversity. In the current study, 2,013 completely sequenced genomes of bacteria and archaea were analyzed and intragenomic heterogeneity was found in 952 genomes (585 species), with 87.5% of the divergence detected being below the 1% level. In particular, some extremophiles (thermophiles and halophiles) were found to harbor highly divergent 16S rRNA genes. Overestimation caused by 16S rRNA gene intragenomic heterogeneity was evaluated at different levels using the full-length and partial 16S rRNA genes usually chosen as targets for pyrosequencing. The result indicates that, at the unique level, full-length 16S rRNA genes can produce an overestimation of as much as 123.7%, while at the 3% level, an overestimation of 12.9% for the V6 region may be introduced. Further analysis showed that intragenomic heterogeneity tends to concentrate in specific positions, with the V1 and V6 regions suffering the most intragenomic heterogeneity and the V4 and V5 regions suffering the least intragenomic heterogeneity in bacteria. This is the most up-to-date overview of the diversity of 16S rRNA genes within prokaryotic genomes. It not only provides general guidance on how much overestimation can be introduced when applying 16S rRNA gene-based methods, due to its intragenomic heterogeneity, but also recommends that, for bacteria, this overestimation be minimized using primers targeting the V4 and V5 regions.
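The abstract does not give the formula behind its overestimation percentages. One plausible reading, shown below purely as an assumption, is the excess of distinct 16S rRNA sequences over the number of genomes they came from, expressed as a percentage of the genome count. The function name and toy genomes are hypothetical.

```python
from typing import Dict, List


def overestimation_pct(genomes: Dict[str, List[str]]) -> float:
    """Percent by which unique 16S sequences exceed the genome count.

    Assumed definition (not confirmed by the abstract):
        100 * (unique sequences - genomes) / genomes
    evaluated at the "unique" level, i.e. without any distance clustering.
    """
    if not genomes:
        raise ValueError("need at least one genome")
    unique_sequences = {seq for copies in genomes.values() for seq in copies}
    n_genomes = len(genomes)
    return 100.0 * (len(unique_sequences) - n_genomes) / n_genomes


# Toy data (hypothetical): genome g2 carries two distinct 16S variants.
toy_genomes = {
    "g1": ["AAA", "AAA"],
    "g2": ["AAC", "AAG", "AAC"],
    "g3": ["TTT"],
}
print(f"{overestimation_pct(toy_genomes):.1f}% overestimation")
```

Here three organisms yield four distinct sequences, so a survey counting unique sequences would overstate richness by about a third. Clustering at a coarser level, as in the abstract's 3% comparison, would shrink this gap.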

