Consistent Metagenome-Derived Metrics Verify and Delineate Bacterial Species Boundaries

mSystems ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Matthew R. Olm ◽  
Alexander Crits-Christoph ◽  
Spencer Diamond ◽  
Adi Lavy ◽  
Paula B. Matheus Carnevali ◽  
...  

ABSTRACT Longstanding questions relate to the existence of naturally distinct bacterial species and genetic approaches to distinguish them. Bacterial genomes in public databases form distinct groups, but these databases are subject to isolation and deposition biases. To avoid these biases, we compared 5,203 bacterial genomes from 1,457 environmental metagenomic samples to test for distinct clouds of diversity and evaluated metrics that could be used to define the species boundary. Bacterial genomes from the human gut, soil, and the ocean all exhibited gaps in whole-genome average nucleotide identities (ANI) near the previously suggested species threshold of 95% ANI. While genome-wide ratios of nonsynonymous and synonymous nucleotide differences (dN/dS) decrease until ANI values approach ∼98%, two methods for estimating homologous recombination approached zero at ∼95% ANI, supporting breakdown of recombination due to sequence divergence as a species-forming force. We evaluated 107 genome-based metrics for their ability to distinguish species when full genomes are not recovered. Full-length 16S rRNA genes were least useful, in part because they were underrecovered from metagenomes. However, many ribosomal proteins displayed both high metagenomic recoverability and species discrimination power. Taken together, our results verify the existence of sequence-discrete microbial species in metagenome-derived genomes and highlight the usefulness of ribosomal genes for gene-level species discrimination.
IMPORTANCE There is controversy about whether bacterial diversity is clustered into distinct species groups or exists as a continuum. To address this issue, we analyzed bacterial genome databases and reports from several previous large-scale environment studies and identified clear discrete groups of species-level bacterial diversity in all cases. Genetic analysis further revealed that quasi-sexual reproduction via horizontal gene transfer is likely a key evolutionary force that maintains bacterial species integrity. We next benchmarked over 100 metrics to distinguish these bacterial species from each other and identified several genes encoding ribosomal proteins with high species discrimination power. Overall, the results from this study provide best practices for bacterial species delineation based on genome content and insight into the nature of bacterial species population genetics.
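
As an illustration of how a 95% ANI species threshold can be applied in practice, the following is a minimal sketch that groups genomes by single-linkage clustering on pairwise ANI values. It is not the authors' pipeline: the function name `cluster_by_ani`, the genome labels, and the ANI values are all hypothetical, and the pairwise ANI values are assumed to have been computed beforehand by some external tool.

```python
from typing import Dict, List, Tuple


def cluster_by_ani(
    genomes: List[str],
    ani: Dict[Tuple[str, str], float],
    threshold: float = 95.0,
) -> List[List[str]]:
    """Single-linkage clustering: genomes joined by any pair with ANI >= threshold."""
    # Union-find forest: every genome starts as its own cluster root.
    parent = {g: g for g in genomes}

    def find(g: str) -> str:
        # Walk up to the root, shortening the path along the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), value in ani.items():
        if value >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters: Dict[str, List[str]] = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    # Sort for a deterministic, readable result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise ANI values (percent), invented for illustration only.
example_ani = {
    ("g1", "g2"): 98.7,
    ("g1", "g3"): 83.1,
    ("g2", "g3"): 82.9,
    ("g3", "g4"): 96.2,
}
species_clusters = cluster_by_ani(["g1", "g2", "g3", "g4"], example_ani)
print(species_clusters)
```

Single linkage is only one possible choice. When a real gap in ANI exists near 95%, as the abstract reports, most linkage rules would be expected to give similar groupings, but that is an assumption of this sketch and not a result stated in the source.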

2019 ◽  
Author(s):  
Matthew R. Olm ◽  
Alexander Crits-Christoph ◽  
Spencer Diamond ◽  
Adi Lavy ◽  
Paula B. Matheus Carnevali ◽  
...  

Abstract Longstanding questions relate to the existence of naturally distinct bacterial species and genetic approaches to distinguish them. Bacterial genomes in public databases form distinct groups, but these databases are subject to isolation and deposition biases. We compared 5,203 bacterial genomes from 1,457 environmental metagenomic samples to test for distinct clouds of diversity, and evaluated metrics that could be used to define the species boundary. Bacterial genomes from the human gut, soil, and the ocean all exhibited gaps in whole-genome average nucleotide identities (ANI) near the previously suggested species threshold of 95% ANI. While genome-wide ratios of non-synonymous and synonymous nucleotide differences (dN/dS) decrease until ANI values approach ∼98%, estimates for homologous recombination approached zero at ∼95% ANI, supporting breakdown of recombination due to sequence divergence as a species-forming force. We evaluated 107 genome-based metrics for their ability to distinguish species when full genomes are not recovered. Full-length 16S rRNA genes were least useful because they were under-recovered from metagenomes, but many ribosomal proteins displayed both high metagenomic recoverability and species-discrimination power. Taken together, our results verify the existence of sequence-discrete microbial species in metagenome-derived genomes and highlight the usefulness of ribosomal genes for gene-level species discrimination.


mSphere ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Michelle Spoto ◽  
Changhui Guan ◽  
Elizabeth Fleming ◽  
Julia Oh

ABSTRACT The CRISPR/Cas system has significant potential to facilitate gene editing in a variety of bacterial species. CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) represent modifications of the CRISPR/Cas9 system utilizing a catalytically inactive Cas9 protein for transcription repression and activation, respectively. While CRISPRi and CRISPRa have tremendous potential to systematically investigate gene function in bacteria, few programs are specifically tailored to identify guides in draft bacterial genomes genomewide. Furthermore, few programs offer open-source code with flexible design parameters for bacterial targeting. To address these limitations, we created GuideFinder, a customizable, user-friendly program that can design guides for any annotated bacterial genome. GuideFinder designs guides from NGG protospacer-adjacent motif (PAM) sites for any number of genes by the use of an annotated genome and FASTA file input by the user. Guides are filtered according to user-defined design parameters and removed if they contain any off-target matches. Iteration with lowered parameter thresholds allows the program to design guides for genes that did not produce guides with the more stringent parameters, one of several features unique to GuideFinder. GuideFinder can also identify paired guides for targeting multiplicity, whose validity we tested experimentally. GuideFinder has been tested on a variety of diverse bacterial genomes, finding guides for 95% of genes on average. Moreover, guides designed by the program are functionally useful—focusing on CRISPRi as a potential application—as demonstrated by essential gene knockdown in two staphylococcal species. Through the large-scale generation of guides, this open-access software will improve accessibility to CRISPR/Cas studies of a variety of bacterial species. 
IMPORTANCE With the explosion in our understanding of human and environmental microbial diversity, corresponding efforts to understand gene function in these organisms are strongly needed. CRISPR/Cas9 technology has revolutionized interrogation of gene function in a wide variety of model organisms. Efficient CRISPR guide design is required for systematic gene targeting. However, existing tools are not adapted for the broad needs of microbial targeting, which include extraordinary species and subspecies genetic diversity, the overwhelming majority of which is characterized by draft genomes. In addition, flexibility in guide design parameters is important to consider the wide range of factors that can affect guide efficacy, many of which can be species and strain specific. We designed GuideFinder, a customizable, user-friendly program that addresses the limitations of existing software and that can design guides for any annotated bacterial genome with numerous features that facilitate guide design in a wide variety of microorganisms.
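
The guide-design step described above (candidate guides taken from NGG PAM sites, then filtered by user-defined parameters) can be sketched roughly as follows. This is not GuideFinder's code: the function names, the 20-nt guide length, and the GC-content bounds are assumptions chosen for illustration, and the off-target check mentioned in the abstract is omitted.

```python
from typing import List, Tuple

# Translation table for complementing an uppercase DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_guides(
    seq: str,
    guide_len: int = 20,   # assumed protospacer length
    min_gc: float = 0.3,   # assumed lower GC bound
    max_gc: float = 0.8,   # assumed upper GC bound
) -> List[Tuple[str, int, str, str]]:
    """List (strand, start, guide, pam) for guides directly 5' of an NGG PAM.

    For the "-" strand, `start` is the offset within the reverse-complemented
    sequence, not within the input sequence.
    """
    seq = seq.upper()
    guides = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # i is the first base of a candidate 3-nt PAM.
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] != "GG":
                continue
            guide = s[i - guide_len:i]
            gc = (guide.count("G") + guide.count("C")) / guide_len
            if min_gc <= gc <= max_gc:
                guides.append((strand, i - guide_len, guide, pam))
    return guides


# Toy sequence, invented for illustration: one 20-nt protospacer followed by TGG.
toy_gene = "ATGCATGCATGCATGCATGC" + "TGG" + "AAAA"
toy_guides = find_ngg_guides(toy_gene)
print(toy_guides)
```

The iterative relaxation the abstract describes could be layered on top by calling such a function again with wider bounds for genes that returned no guides. How GuideFinder actually lowers its thresholds is not specified here.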


2017 ◽  
Author(s):  
Michelle Spoto ◽  
Elizabeth Fleming ◽  
Julia Oh

Abstract Background: The CRISPR/Cas system has significant potential to facilitate gene editing in a variety of bacterial species. CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) represent modifications of the CRISPR/Cas9 system utilizing a catalytically inactive Cas9 protein for transcription repression or activation, respectively. While CRISPRi and CRISPRa have tremendous potential to systematically investigate gene function in bacteria, no pan-bacterial, genome-wide tools exist for guide discovery. We have created Guide Finder: a customizable, user-friendly program that can design guides for any annotated bacterial genome. Results: Guide Finder designs guides from NGG PAM sites for any number of genes using an annotated genome and FASTA file input by the user. Guides are filtered according to user-defined design parameters and removed if they contain any off-target matches. Iteration with lowered parameter thresholds allows the program to design guides for genes that did not produce guides with the more stringent parameters, a feature unique to Guide Finder. Guide Finder has been tested on a variety of diverse bacterial genomes, on average finding guides for 95% of genes. Moreover, guides designed by the program are functionally useful (focusing on CRISPRi as a potential application), as demonstrated by essential gene knockdown in two staphylococcal species. Conclusions: Through the large-scale generation of guides, this open-access software will improve accessibility to CRISPR/Cas studies for a variety of bacterial species.


2021 ◽  
Author(s):  
Patrick D. Schloss

Abstract Amplicon sequencing variants (ASVs) have been proposed as an alternative to operational taxonomic units (OTUs) for analyzing microbial communities. ASVs have grown in popularity, in part, because of a desire to reflect a more refined level of taxonomy since they do not cluster sequences based on a distance-based threshold. However, ASVs and the use of overly narrow thresholds to identify OTUs increase the risk of splitting a single genome into separate clusters. To assess this risk, I analyzed the intragenomic variation of 16S rRNA genes from the bacterial genomes represented in a rrn copy number database, which contained 20,427 genomes from 5,972 species. As the number of copies of the 16S rRNA gene increased in a genome, the number of ASVs also increased. There was an average of 0.58 ASVs per copy of the 16S rRNA gene for full-length 16S rRNA genes. It was necessary to use a distance threshold of 5.25% to cluster full-length ASVs from the same genome into a single OTU with 95% confidence for genomes with 7 copies of the 16S rRNA gene, such as E. coli. This research highlights the risk of splitting a single bacterial genome into separate clusters when ASVs are used to analyze 16S rRNA gene sequence data. Although there is also a risk of clustering ASVs from different species into the same OTU when using broad distance thresholds, those risks are of less concern than artificially splitting a genome into separate ASVs and OTUs.
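
The two quantities discussed above, the number of ASVs per 16S rRNA gene copy and the distance threshold needed to keep one genome's copies together, can be illustrated with a small sketch. It is not the author's analysis code: the helper names and the 10-nt toy sequences are hypothetical, and the copies are assumed to be pre-aligned and of equal length so that distance reduces to the fraction of mismatched positions.

```python
from itertools import combinations
from typing import List


def asv_count(copies: List[str]) -> int:
    """Number of distinct sequences among one genome's 16S copies."""
    return len(set(copies))


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def max_intragenomic_distance(copies: List[str]) -> float:
    """Largest pairwise distance among copies; 0.0 when fewer than two copies."""
    return max(
        (mismatch_fraction(a, b) for a, b in combinations(copies, 2)),
        default=0.0,
    )


# Toy 16S copies from one hypothetical genome (real genes are ~1,500 nt).
toy_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]

n_asvs = asv_count(toy_copies)
asvs_per_copy = n_asvs / len(toy_copies)
widest = max_intragenomic_distance(toy_copies)
print(n_asvs, asvs_per_copy, widest)
```

In this toy genome, any clustering threshold narrower than `widest` could split the genome across more than one OTU, depending on the linkage rule used. That is the risk the abstract describes for ASVs and narrow OTU thresholds.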


2004 ◽  
Vol 186 (9) ◽  
pp. 2629-2635 ◽  
Author(s):  
Silvia G. Acinas ◽  
Luisa A. Marcelino ◽  
Vanja Klepac-Ceraj ◽  
Martin F. Polz

ABSTRACT The level of sequence heterogeneity among rrn operons within genomes determines the accuracy of diversity estimation by 16S rRNA-based methods. Furthermore, the occurrence of widespread horizontal gene transfer (HGT) between distantly related rrn operons casts doubt on reconstructions of phylogenetic relationships. For this study, patterns of distribution of rrn copy numbers, interoperonic divergence, and redundancy of 16S rRNA sequences were evaluated. Bacterial genomes display up to 15 operons and operon numbers up to 7 are commonly found, but ∼40% of the organisms analyzed have either one or two operons. Among the Archaea, a single operon appears to dominate and the highest number of operons is five. About 40% of sequences among 380 operons in 76 bacterial genomes with multiple operons were identical to at least one other 16S rRNA sequence in the same genome, and in 38% of the genomes all 16S rRNAs were invariant. For Archaea, the number of identical operons was only 25%, but only five genomes with 21 operons are currently available. These considerations suggest an upper bound of roughly threefold overestimation of bacterial diversity resulting from cloning and sequencing of 16S rRNA genes from the environment; however, the inclusion of genomes with a single rrn operon may lower this correction factor to ∼2.5. Divergence among operons appears to be small overall for both Bacteria and Archaea, with the vast majority of 16S rRNA sequences showing <1% nucleotide differences. Only five genomes with operons with a higher level of nucleotide divergence were detected, and Thermoanaerobacter tengcongensis exhibited the highest level of divergence (11.6%) noted to date. Overall, four of the five extreme cases of operon differences occurred among thermophilic bacteria, suggesting a much higher incidence of HGT in these bacteria than in other groups.


2016 ◽  
Vol 103 ◽  
pp. 337-348 ◽  
Author(s):  
Kayla N. Burns ◽  
Nicholas A. Bokulich ◽  
Dario Cantu ◽  
Rachel F. Greenhut ◽  
Daniel A. Kluepfel ◽  
...  

2020 ◽  
Author(s):  
Ryan Richard Ruff ◽  
Bidisha Paul ◽  
Maria A Sierra ◽  
Fangxi Xu ◽  
Yasmi Crystal ◽  
...  

Abstract Objectives: Silver diamine fluoride (SDF) is a nonsurgical therapy for the arrest and prevention of dental caries with demonstrated clinical efficacy. Approximately 20% of children receiving SDF fail to respond to treatment. The objective of this study was to develop a predictive model of treatment nonresponse using machine learning. Methods: In an observational pilot study (N=20), children with and without active decay who did and did not respond to SDF provided salivary samples and plaque from infected and contralateral sites. 16S rRNA genes from samples were amplified and sequenced on an Illumina MiSeq and analyzed using QIIME. The association between operational taxonomic units and treatment nonresponse was assessed using lasso regression and artificial neural networks. Results: Bivariate group comparisons of bacterial abundance indicated that a number of genera differed significantly between nonresponders and those who responded to SDF therapy. No differences were found between nonresponders and caries-active subjects. Prevotella pallens and Veillonella denticariosi were retained in full lasso models and combined with clinical variables in a six-input multilayer perceptron. Discussion: The acidogenic and acid-tolerant nature of retained bacterial species may overcome the antimicrobial effects of SDF. Further research to validate the model in larger external samples is needed.


2020 ◽  
Author(s):  
Robert A. Petit ◽  
Timothy D. Read

Abstract Sequencing of bacterial genomes using Illumina technology has become such a standard procedure that often data are generated faster than can be conveniently analyzed. We created a new series of pipelines called Bactopia, built using Nextflow workflow software, to provide efficient comparative genomic analyses for bacterial species or genera. Bactopia consists of a dataset setup step (Bactopia Datasets; BaDs) where a series of customizable datasets are created for the species of interest; the Bactopia Analysis Pipeline (BaAP), which performs quality control, genome assembly and several other functions based on the available datasets and outputs the processed data to a structured directory format; and a series of Bactopia Tools (BaTs) that perform specific post-processing on some or all of the processed data. BaTs include pan-genome analysis, computing average nucleotide identity between samples, extracting and profiling the 16S genes and taxonomic classification using highly conserved genes. It is expected that the number of BaTs will increase to fill specific applications in the future. As a demonstration, we performed an analysis of 1,664 public Lactobacillus genomes, focusing on L. crispatus, a species that is a common part of the human vaginal microbiome. Bactopia is an open source system that can scale from projects as small as one bacterial genome to thousands, allowing great flexibility in choosing comparison datasets and options for downstream analysis. Bactopia code can be accessed at https://www.github.com/bactopia/bactopia.


2004 ◽  
Vol 70 (11) ◽  
pp. 6767-6775 ◽  
Author(s):  
He-Long Jiang ◽  
Joo-Hwa Tay ◽  
Abdul Majid Maszenan ◽  
Stephen Tiong-Lee Tay

ABSTRACT Aerobic granules are self-immobilized aggregates of microorganisms and represent a relatively new form of cell immobilization developed for biological wastewater treatment. In this study, both culture-based and culture-independent techniques were used to investigate the bacterial diversity and function in aerobic phenol-degrading granules cultivated in a sequencing batch reactor. Denaturing gradient gel electrophoresis (DGGE) analysis of PCR-amplified 16S rRNA genes demonstrated a major shift in the microbial community as the seed sludge developed into granules. Culture isolation and DGGE assays confirmed the dominance of β-Proteobacteria and high-G+C gram-positive bacteria in the phenol-degrading aerobic granules. Of the 10 phenol-degrading bacterial strains isolated from the granules, strains PG-01, PG-02, and PG-08 possessed 16S rRNA gene sequences that matched the partial sequences of dominant bands in the DGGE fingerprint belonging to the aerobic granules. The numerical dominance of strain PG-01 was confirmed by isolation, DGGE, and in situ hybridization with a strain-specific probe, and key physiological traits possessed by PG-01 that allowed it to outcompete and dominate other microorganisms within the granules were then identified. This strain could be regarded as a functionally dominant strain and may have contributed significantly to phenol degradation in the granules. On the other hand, strain PG-08 had low specific growth rate and low phenol degradation ability but showed a high propensity to autoaggregate. By analyzing the roles played by these two isolates within the aerobic granules, a functional model of the microbial community within the aerobic granules was proposed. This model has important implications for rationalizing the engineering of ecological systems.

