Global genomic similarity and core genome sequence diversity of the Streptococcus genus as a toolkit to identify closely related bacterial strains in complex environments

Global genomic similarity and core genome sequence diversity of the Streptococcus genus as a toolkit to identify closely related bacterial species in complex environments.

10.7287/peerj.preprints.26665v2 ◽

2018 ◽

Author(s):

Hugo R Barajas de la Torre ◽

Miguel Romero ◽

Shamayim Martínez-Sánchez ◽

Luis D Alcaraz

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Core Genome ◽

Comparative Genomic ◽

rRNA Gene ◽

Gene Phylogeny ◽

The Core ◽

Sequence Identity ◽

Core Proteins ◽

Genomic Similarity

Background. Comparative genomics between closely related bacterial strains can distinguish important features determining pathogenesis, antibiotic resistance, and phylogenetic structure. The Streptococcus genus is relevant to public health and food safety, and it is well represented (>100 genomes) in publicly available databases. Streptococci are cosmopolitan, with multiple sources of isolation, from humans to dairy products. The Streptococcus genus has been classified by morphology, serotypes, 16S rRNA gene, and Multi Locus Sequence Types (MLST). The Genomic Similarity Score (GSS) is proposed as a tool to quantify genome-level relatedness between species of Streptococcus. The Streptococcus core genome can be used to assess strain-specific abundances in metagenomic sequences. Methods. A 16S rRNA gene phylogeny was calculated for 108 strains belonging to 16 Streptococcus species and compared to a dendrogram built from GSS pairwise distances for the same genomes. The core and pan-genome were calculated for these 108 genomes. The core genome sequences were analyzed and used as a resource to discriminate homologous fragment reads from closely related strains in metagenomic samples. Results. A total of 404 proteins are shared by all 108 Streptococcus genomes; these constitute the core genome. The pairwise amino acid identity values of the core proteins for all the compared strains and outgroups are reported. Lower sequence identity variation (90-100%) is predominantly found in core clusters containing ribosomal and translation-related proteins. For 48 core proteins (11.8%) no functional assignment could be made, and those proteins have larger sequence identity variations than other core proteins. The sequence identity of the core genome diminishes as the GSS between species decreases. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while distinguishing between 16S polytomies (unresolved nodes). Finally, the core genome was used to distinguish between closely related species within human oral metagenomes. Discussion. The Streptococcus genus provides a benchmark dataset for comparative genomic studies due to the breadth and depth of its genomic coverage. Comparing metagenomic shotgun fragment reads to the core genome using rapid alignment tools allows species-specific abundance estimates in metagenomic samples. Understanding genomic variability and strain relatedness is the goal of tools like GSS, which makes use of both pairwise shared core and pan-genomic homologous sequences for its calculation.
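
The core and pan-genome definitions used in this abstract (proteins shared by all genomes versus proteins found in any genome) can be illustrated with a minimal sketch. The function name `core_and_pan`, the strain names, and the protein-family identifiers are hypothetical and serve only as an illustration; the abstract does not state how the authors clustered proteins into families, so the sketch assumes that step has already been done.

```python
def core_and_pan(genome_families):
    """Return (core, pan) for a dict mapping genome name -> set of family IDs.

    core: families present in every genome (set intersection).
    pan:  families present in at least one genome (set union).
    Assumes at least one genome is supplied.
    """
    family_sets = list(genome_families.values())
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    return core, pan


# Toy input: three hypothetical strains with made-up family identifiers.
toy_genomes = {
    "strain_A": {"rpsA", "tuf", "famX", "famY"},
    "strain_B": {"rpsA", "tuf", "famX"},
    "strain_C": {"rpsA", "tuf", "famZ"},
}

core, pan = core_and_pan(toy_genomes)
print(sorted(core))  # families shared by all three toy strains
print(len(pan))      # number of distinct families across the toy strains
```

In this toy case only the two families carried by every strain end up in the core, which mirrors how the 404 core proteins are those present in all 108 genomes.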


Global genomic similarity and core genome sequence diversity of the Streptococcus genus as a toolkit to identify closely related bacterial species in complex environments.

10.7287/peerj.preprints.26665 ◽

2018 ◽

Author(s):

Hugo R Barajas de la Torre ◽

Miguel Romero ◽

Shamayim Martínez-Sánchez ◽

Luis D Alcaraz

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Core Genome ◽

Comparative Genomic ◽

rRNA Gene ◽

Gene Phylogeny ◽

The Core ◽

Sequence Identity ◽

Core Proteins ◽

Genomic Similarity



Global genomic similarity and core genome sequence diversity of the Streptococcus genus as a toolkit to identify closely related bacterial species in complex environments

PeerJ ◽

10.7717/peerj.6233 ◽

2019 ◽

Vol 6 ◽

pp. e6233 ◽

Cited By ~ 4

Author(s):

Hugo R. Barajas ◽

Miguel F. Romero ◽

Shamayim Martínez-Sánchez ◽

Luis D. Alcaraz

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Core Genome ◽

Bacterial Species ◽

Genomic Diversity ◽

Comparative Genomic ◽

rRNA Gene ◽

Gene Phylogeny ◽

The Core ◽

Core Proteins

Background. The Streptococcus genus is relevant to both public health and food safety because of its ability to cause pathogenic infections. It is well represented (>100 genomes) in publicly available databases. Streptococci are ubiquitous, with multiple sources of isolation, from human pathogens to dairy products. The Streptococcus genus has traditionally been classified by morphology, serum types, the 16S ribosomal RNA (rRNA) gene, and multi-locus sequence types subject to in-depth comparative genomic analysis. Methods. Core and pan-genomes were used to describe the genomic diversity of 108 strains belonging to 16 Streptococcus species. The core genome nucleotide diversity was calculated and compared to phylogenomic distances within the genus Streptococcus. The core genome was also used as a resource to recruit metagenomic fragment reads from streptococci-dominated environments. A conventional 16S rRNA gene phylogeny reconstruction was used as a reference against which to compare the average nucleotide identity (ANI) and genome similarity score (GSS) dendrograms. Results. The core genome, in this work, consists of 404 proteins that are shared by all 108 Streptococcus genomes. The average identity of the pairwise compared core proteins decreases in proportion to lower GSS scores across species. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while distinguishing between 16S polytomies (unresolved nodes). The GSS is a distance metric that can reflect evolutionary history by comparing orthologous proteins. Additionally, GSS proved to be the most useful metric for genus and species comparisons, where ANI metrics failed due to false positives when comparing different species. Discussion. Understanding genomic variability and species relatedness is the goal of tools like GSS, which makes use of the maximum pairwise shared orthologous sequences for its calculation. It allows long evolutionary distances (above species) to be included because it uses amino acid alignment scores, rather than nucleotides, and normalizes by positive matches. Newly sequenced species and strains could be easily placed into GSS dendrograms to infer overall genomic relatedness. The GSS is not restricted to ubiquitous conservancy of gene features; thus, it reflects the mosaic structure and dynamism of gene acquisition and loss in bacterial genomes.
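
The general idea of a similarity score built from pairwise ortholog alignment scores and a normalization step can be sketched as follows. This is not the authors' GSS formula, which the abstract does not give: the normalization by the mean of the two genomes' total self-alignment scores, the function name `normalized_similarity`, and all numbers are assumptions chosen only to show how such a score stays defined when two genomes share only part of their proteins.

```python
def normalized_similarity(pair_scores, self_scores_a, self_scores_b):
    """Toy genome similarity in [0, 1] (hypothetical, not the published GSS).

    pair_scores:   ortholog ID -> alignment score between the A and B copies.
    self_scores_a: protein ID -> self-alignment score, for every protein in A.
    self_scores_b: same, for every protein in B.
    """
    shared_total = sum(pair_scores.values())
    # Assumed normalization: mean of the two total self-alignment scores,
    # so two identical genomes score exactly 1.0.
    denominator = (sum(self_scores_a.values()) + sum(self_scores_b.values())) / 2
    return shared_total / denominator


# Made-up scores: genome A has three proteins, genome B has two,
# and two orthologous pairs are shared between them.
pair = {"p1": 90.0, "p2": 80.0}
self_a = {"p1": 100.0, "p2": 100.0, "p3": 100.0}
self_b = {"p1": 100.0, "p2": 100.0}

score = normalized_similarity(pair, self_a, self_b)
distance = 1.0 - score  # a distance of this kind could feed a dendrogram
print(round(score, 2), round(distance, 2))
```

Because proteins missing from one genome still contribute to the denominator, gene gain and loss lower the score, which is in the spirit of the abstract's remark that GSS is not restricted to universally conserved genes.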


Jinshanibacter allomyrinae sp. nov., isolated from larvae of Allomyrina dichotoma, proposal of Insectihabitans xujianqingii gen. nov., comb. nov. and emended descriptions of the genera Jinshanibacter, Limnobaculum and Pragia

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004938 ◽

2021 ◽

Vol 71 (8) ◽

Author(s):

Soon Dong Lee ◽

Yeong-Sik Byeon ◽

Sung-Min Kim ◽

Hong Lim Yang ◽

In Seop Kim

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Type Species ◽

Sequence Similarity ◽

rRNA Gene ◽

Bacterial Strains ◽

Content Type ◽

Gene Phylogeny ◽

Link Type ◽

Protaetia Brevitarsis Seulensis

Taxonomic positions of four Gram-negative bacterial strains, which were isolated from larvae of two insects in Jeju, Republic of Korea, were determined by a polyphasic approach. Strains CWB-B4, CWB-B41 and CWB-B43 were recovered from larvae of Protaetia brevitarsis seulensis, whereas strain BWR-B9T was from larvae of Allomyrina dichotoma. All the isolates grew at 10–37 °C, at pH 5.0–9.0 and in the presence of 4 % (w/v) NaCl. The 16S rRNA gene phylogeny showed that the four isolates formed two distinct sublines within the order Enterobacteriales and were closely associated with members of the genus Jinshanibacter. The first group, represented by strain CWB-B4, formed a tight cluster with Jinshanibacter xujianqingii CF-1111T (99.3 % sequence similarity), whereas strain BWR-B9T was most closely related to Jinshanibacter zhutongyuii CF-458T (99.5 % sequence similarity). The 92 core gene analysis showed that the isolates belonged to the family Budviciaceae and supported the clustering shown in the 16S rRNA gene phylogeny. The genomic DNA G+C content of the isolates was 45.2 mol%. A combination of overall genomic relatedness and phenotypic distinctness supported that three isolates from Protaetia brevitarsis seulensis are different strains of Jinshanibacter xujianqingii, whereas one isolate from Allomyrina dichotoma represents a new species of the genus Jinshanibacter. On the basis of the results obtained here, Jinshanibacter allomyrinae sp. nov. (type strain BWR-B9T=KACC 22153T=NBRC 114879T) and Insectihabitans xujianqingii gen. nov., comb. nov. are proposed, with the emended descriptions of the genera Jinshanibacter, Limnobaculum and Pragia.


The mutL Gene as a Genome-Wide Taxonomic Marker for High Resolution Discrimination of Lactiplantibacillus plantarum and Its Closely Related Taxa

Microorganisms ◽

10.3390/microorganisms9081570 ◽

2021 ◽

Vol 9 (8) ◽

pp. 1570

Author(s):

Chien-Hsun Huang ◽

Chih-Chieh Chen ◽

Yu-Chun Lin ◽

Chia-Hsuan Chen ◽

Ai-Yun Lee ◽

...

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Target Genes ◽

Marker Genes ◽

rRNA Gene ◽

Accurate Identification ◽

Discrimination Power ◽

Sequence Identity ◽

Genome Wide ◽

A Genome

The current taxonomy of the Lactiplantibacillus plantarum group comprises 17 closely related species that are indistinguishable from each other using commonly used 16S rRNA gene sequencing. In this study, a whole-genome-based analysis was carried out to explore highly discriminating target genes whose interspecific sequence identity is significantly lower than that of 16S rRNA or conventional housekeeping genes. In silico analyses of 774 core genes by the cano-wgMLST_BacCompare analytics platform indicated that the csbB, morA, murI, mutL, ntpJ, rutB, trmK, ydaF, and yhhX genes were the most promising candidates. Subsequently, the mutL gene was selected, and its discrimination power was further evaluated using Sanger sequencing. Among the type strains, mutL exhibited clearly lower interspecific sequence identity (61.6–85.6%; average: 66.6%) than the 16S rRNA gene (96.7–100%; average: 98.4%) and the conventional phylogenetic marker genes (e.g., dnaJ, dnaK, pheS, recA, and rpoA), and could therefore be used to separate the tested strains into species clusters. Consequently, species-specific primers were developed for fast and accurate identification of L. pentosus, L. argentoratensis, L. plantarum, and L. paraplantarum. During this study, one strain (BCRC 06B0048, L. pentosus) exhibited not only a relatively low mutL sequence identity (97.0%) but also a low digital DNA–DNA hybridization value (78.1%) with the type strain DSM 20314T, signifying that it exhibits potential for reclassification as a novel subspecies. Our data demonstrate that mutL can be a genome-wide target for identifying and classifying the L. plantarum group species and for differentiating novel taxa from known species.
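
The sequence identity percentages quoted in this abstract are the kind of value obtained by comparing two aligned sequences column by column. The sketch below is a generic illustration, not the study's pipeline: the function name `percent_identity`, the choice to skip columns where both sequences have a gap, and the toy sequences are all assumptions, and published identity values depend on the alignment method and gap treatment actually used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped; every other
    column counts toward the denominator (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (made up): five of six columns match.
identity = percent_identity("ATGCCA", "ATGACA")
print(round(identity, 1))
```

A marker gene whose identities between species fall well below 100% separates those species more clearly than one, such as the 16S rRNA gene here, whose identities cluster near 100%.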


Microbiome of Odontogenic Abscesses

Microorganisms ◽

10.3390/microorganisms9061307 ◽

2021 ◽

Vol 9 (6) ◽

pp. 1307

Author(s):

Sebastian Böttger ◽

Silke Zechel-Gran ◽

Daniel Schmermund ◽

Philipp Streckbein ◽

Jan-Falco Wilbrand ◽

...

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Oral Microbiome ◽

Minor Role ◽

rRNA Gene ◽

Sequencing Analysis ◽

Bacterial Strains ◽

16S rRNA Gene Analysis ◽

Odontogenic Infections ◽

Odontogenic Abscess

Severe odontogenic abscesses are regularly caused by bacteria of the physiological oral microbiome. However, the culture of these bacteria is often prone to errors and sometimes does not result in any bacterial growth. Furthermore, various authors found completely different bacterial spectra in odontogenic abscesses. Experimental 16S rRNA gene next-generation sequencing analysis was used to identify the microbiome of the saliva and the pus in patients with a severe odontogenic infection. The microbiome of the saliva and the pus was determined for 50 patients with a severe odontogenic abscess. Perimandibular and submandibular abscesses were the most commonly observed diseases, in 15 patients (30%) each. Polymicrobial infections were observed in 48 (96%) cases, while the picture of a mono-infection only occurred twice (4%). On average, 31.44 (±12.09) bacterial genera were detected in the pus and 41.32 (±9.00) in the saliva. In most cases, a predominantly anaerobic bacterial spectrum was found in the pus, while saliva showed an oral microbiome similar to that of healthy individuals. In the majority of cases, odontogenic infections are polymicrobial. Our results indicate that these are mainly caused by anaerobic bacterial strains and that aerobic and facultative anaerobic bacteria seem to play a more minor role than previously described by other authors. The 16S rRNA gene analysis detects significantly more bacteria than conventional methods, and molecular methods should therefore become a part of routine diagnostics in medical microbiology.


How well does 16S rRNA gene phylogeny represent evolutionary relationships among the rhizobia?

Nitrogen fixation: global perspectives. Proceedings of the 13th International Congress on Nitrogen Fixation, Hamilton, Ontario, Canada, 2-7 July 2001 ◽

10.1079/9780851995915.0071 ◽

2009 ◽

pp. 71-74 ◽

Cited By ~ 1

Author(s):

P. van Berkum ◽

S. Reiner ◽

B. D. Eardly

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Evolutionary Relationships ◽

rRNA Gene ◽

Gene Phylogeny


Inorganic Phosphate Solubilization by a Novel Isolated Bacterial Strain Enterobacter sp. ITCB-09 and Its Application Potential as Biofertilizer

Agriculture ◽

10.3390/agriculture10090383 ◽

2020 ◽

Vol 10 (9) ◽

pp. 383 ◽

Cited By ~ 3

Author(s):

Gustavo Enrique Mendoza-Arroyo ◽

Manuel Jesús Chan-Bacab ◽

Ruth Noemi Aguila-Ramírez ◽

Benjamín Otto Ortega-Morales ◽

René Efraín Canché Solís ◽

...

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Inorganic Phosphate ◽

Extracellular Polymeric Substances ◽

Phosphate Solubilization ◽

rRNA Gene ◽

Bacterial Strains ◽

Habanero Pepper ◽

Phosphate Solubilizing

The excessive use of fertilizers in agriculture is mainly due to the recognized plant requirements for soluble phosphorus. This problem has limited the implementation of sustainable agriculture. A viable alternative is to use phosphate-solubilizing soil microorganisms. This work aimed to isolate inorganic phosphorus-solubilizing bacteria from the soils of agroecosystems and to select and identify, based on sequencing and phylogenetic analysis of the 16S rRNA gene, the bacterium with the highest capacity for in vitro solubilization of inorganic phosphate. Additionally, we aimed to determine its primary phosphate-solubilizing mechanisms and to evaluate its effect on Habanero pepper seedling growth. A total of 21 bacterial strains were isolated by their activity on Pikovskaya agar. Of these, strain ITCB-09 exhibited the highest ability to solubilize inorganic phosphate (865.98 µg/mL) through the production of organic acids. This strain produced extracellular polymeric substances and siderophores that have ecological implications for phosphate solubilization. 16S rRNA gene sequence analysis revealed that strain ITCB-09 belongs to the genus Enterobacter. Enterobacter sp. ITCB-09, especially when immobilized in beads, had a positive effect on Capsicum chinense Jacq. seedling growth, indicating its potential as a biofertilizer.


Molecular Characterization of Potential Nitrogen Fixation by Anaerobic Methane-Oxidizing Archaea in the Methane Seep Sediments at the Number 8 Kumano Knoll in the Kumano Basin, Offshore of Japan

Applied and Environmental Microbiology ◽

10.1128/aem.01184-09 ◽

2009 ◽

Vol 75 (22) ◽

pp. 7153-7162 ◽

Cited By ~ 33

Author(s):

Junichi Miyazaki ◽

Ryosaku Higa ◽

Tomohiro Toki ◽

Juichiro Ashi ◽

Urumu Tsunogai ◽

...

Keyword(s):

Nitrogen Fixation ◽

16S rRNA ◽

16S rRNA Gene ◽

Gene Clusters ◽

rRNA Gene ◽

Methane Seep ◽

Core Sediments ◽

The Core ◽

Group 2

ABSTRACT The potential for microbial nitrogen fixation in the anoxic methane seep sediments in a mud volcano, the number 8 Kumano Knoll, was characterized by molecular phylogenetic analyses. A total of 111 of the nifH (a gene coding a nitrogen fixation enzyme, Fe protein) clones were obtained from different depths of the core sediments, and the phylogenetic analysis of the clones indicated the genetic diversity of nifH genes. The predominant group detected (methane seep group 2), representing 74% of clonal abundance, was phylogenetically related to the nifH sequences obtained from the Methanosarcina species but was most closely related to the nifH sequences potentially derived from the anoxic methanotrophic archaea (ANME-2 archaea). The recovery of the nif gene clusters including the nifH sequences of the methane seep group 2 and the subsequent reverse transcription-PCR detection of the nifD and nifH genes strongly suggested that the genetic components of the gene clusters would be operative for the in situ assimilation of molecular nitrogen (N2) by the host microorganisms. DNA-based quantitative PCR of the archaeal 16S rRNA gene, the group-specific mcrA (a gene encoding the methyl-coenzyme M reductase α subunit) gene, and the nifD and nifH genes demonstrated the similar distribution patterns of the archaeal 16S rRNA gene, the mcrA groups c-d and e, and the nifD and nifH genes through the core sediments. These results supported the idea that the anoxic methanotrophic archaea ANME-2c could be the microorganisms hosting the nif gene clusters and could play an important role in not only the in situ carbon (methane) cycle but also the nitrogen cycle in subseafloor sediments.


Pedobacter lentus sp. nov. and Pedobacter terricola sp. nov., isolated from soil

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijs.0.65146-0 ◽

2007 ◽

Vol 57 (9) ◽

pp. 2089-2095 ◽

Cited By ~ 34

Author(s):

Jung-Hoon Yoon ◽

So-Jung Kang ◽

Sooyeon Park ◽

Tae-Kwang Oh

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Gene Sequence ◽

Sequence Similarity ◽

Phylogenetic Analyses ◽

16S rRNA Gene Sequence ◽

rRNA Gene ◽

rRNA Gene Sequence ◽

Bacterial Strains ◽

Type Strains

Two Gram-negative, non-motile, pleomorphic bacterial strains, DS-40T and DS-45T, were isolated from a soil sample collected from Dokdo, Korea, and their exact taxonomic positions were investigated by using a polyphasic approach. Strains DS-40T and DS-45T grew optimally at 25 °C and pH 6.5–7.5 in the presence of 0–1.0 % (w/v) NaCl. They contained MK-7 as the predominant menaquinone and possessed iso-C15 : 0, iso-C17 : 0 3-OH and summed feature 3 (C16 : 1 ω7c and/or iso-C15 : 0 2-OH) as the major fatty acids. The DNA G+C contents of strains DS-40T and DS-45T were 36.0 and 36.8 mol%, respectively. Strains DS-40T and DS-45T shared a 16S rRNA gene sequence similarity of 96.7 % and demonstrated a mean DNA–DNA relatedness level of 12 %. Phylogenetic analyses based on 16S rRNA gene sequences revealed that strains DS-40T and DS-45T were most closely phylogenetically affiliated with the genus Pedobacter of the family Sphingobacteriaceae. Strains DS-40T and DS-45T exhibited 16S rRNA gene sequence similarity values of 91.4–93.7 and 89.9–91.6 % with respect to the type strains of Pedobacter and Sphingobacterium species, respectively. Phenotypic and chemotaxonomic properties, together with the phylogenetic data, support the assignment of strains DS-40T and DS-45T as two distinct species within the genus Pedobacter. On the basis of phenotypic, phylogenetic and genetic data, strains DS-40T and DS-45T represent two novel species of the genus Pedobacter, for which the names Pedobacter lentus sp. nov. and Pedobacter terricola sp. nov. are proposed, respectively. The respective type strains are DS-40T (=KCTC 12875T=JCM 14593T) and DS-45T (=KCTC 12876T=JCM 14594T).
