Global genomic similarity and core genome sequence diversity of the Streptococcus genus as a toolkit to identify closely related bacterial species in complex environments

PeerJ ◽  
2019 ◽  
Vol 6 ◽  
pp. e6233 ◽  
Author(s):  
Hugo R. Barajas ◽  
Miguel F. Romero ◽  
Shamayim Martínez-Sánchez ◽  
Luis D. Alcaraz

Background The Streptococcus genus is relevant to both public health and food safety because of its ability to cause pathogenic infections. It is well represented (>100 genomes) in publicly available databases. Streptococci are ubiquitous, with multiple sources of isolation, from human pathogens to dairy products. The Streptococcus genus has traditionally been classified by morphology, serum types, the 16S ribosomal RNA (rRNA) gene, and multi-locus sequence types subject to in-depth comparative genomic analysis. Methods Core and pan-genomes were used to describe the genomic diversity of 108 strains belonging to 16 Streptococcus species. The core genome nucleotide diversity was calculated and compared to phylogenomic distances within the genus Streptococcus. The core genome was also used as a resource to recruit metagenomic fragment reads from streptococci-dominated environments. A conventional 16S rRNA gene phylogeny reconstruction was used as a reference against which to compare the average nucleotide identity (ANI) and genome similarity score (GSS) dendrograms. Results The core genome, in this work, consists of 404 proteins that are shared by all 108 Streptococcus genomes. The average identity of the pairwise compared core proteins decreases in proportion to lower GSS scores across species. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while distinguishing between 16S polytomies (unresolved nodes). The GSS is a distance metric that can reflect evolutionary history by comparing orthologous proteins. Additionally, GSS proved to be the most useful metric for genus and species comparisons, where ANI metrics failed due to false positives when comparing different species. Discussion Understanding genomic variability and species relatedness is the goal of tools like GSS, which makes use of the maximum pairwise shared orthologous sequences for its calculation. 
It allows long evolutionary distances (above species) to be included because it uses amino acid alignment scores, rather than nucleotide scores, and normalizes by positive matches. Newly sequenced species and strains could be easily placed into GSS dendrograms to infer overall genomic relatedness. The GSS is not restricted to ubiquitously conserved gene features; thus, it reflects the mosaic structure and dynamism of gene acquisition and loss in bacterial genomes.
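
The core and pan-genome notions used in the abstract above can be illustrated with a minimal Python sketch. It assumes that each genome has already been reduced to a set of ortholog cluster identifiers; the function name, strain labels, and cluster names below are hypothetical and do not come from the paper, and the clustering step itself is not shown. The core genome is then the set of clusters present in every genome, and the pan-genome is the set of clusters present in at least one.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(
    clusters_by_genome: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) for a mapping of genome name -> ortholog cluster IDs.

    core: clusters found in every genome (set intersection).
    pan:  clusters found in at least one genome (set union).
    """
    if not clusters_by_genome:
        return set(), set()
    cluster_sets = list(clusters_by_genome.values())
    core = set.intersection(*cluster_sets)
    pan = set.union(*cluster_sets)
    return core, pan


# Hypothetical toy input: three strains with made-up cluster labels.
genomes = {
    "strain_A": {"rpsL", "tuf", "gyrB", "phage_1"},
    "strain_B": {"rpsL", "tuf", "gyrB", "bacteriocin_1"},
    "strain_C": {"rpsL", "tuf", "transposase_1"},
}

core, pan = core_and_pan_genome(genomes)
print("core clusters:", sorted(core))
print("pan-genome size:", len(pan))
```

In this toy input only the clusters shared by all three strains end up in the core, which mirrors how the 404 core proteins are described as shared by all 108 genomes; the real analysis depends on how orthologs are defined, which this sketch does not address.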

2018 ◽  
Author(s):  
Hugo R Barajas de la Torre ◽  
Miguel Romero ◽  
Shamayim Martínez-Sánchez ◽  
Luis D Alcaraz

Background. Comparative genomics between closely related bacterial strains can distinguish important features determining pathogenesis, antibiotic resistance, and phylogenetic structure. The Streptococcus genus is relevant to public health and food safety, and it is well represented (>100 genomes) in publicly available databases. Streptococci are cosmopolitan, with multiple sources of isolation, from humans to dairy products. The Streptococcus genus has been classified by morphology, serotypes, the 16S rRNA gene, and Multi Locus Sequence Types (MLST). The Genomic Similarity Score (GSS) is proposed as a tool to quantify genome-level relatedness between species of Streptococcus. The Streptococcus core genome can be used to assess strain-specific abundances in metagenomic sequences. Methods. A 16S rRNA gene phylogeny was calculated for 108 strains belonging to 16 Streptococcus species and compared to a dendrogram built from GSS pairwise distances for the same genomes. The core and pan-genome were calculated for these 108 genomes. The core genome sequences were analyzed and used as a resource to discriminate homologous fragment reads from closely related strains in metagenomic samples. Results. A total of 404 proteins are shared by all 108 Streptococcus genomes; these constitute the core genome. The pairwise amino acid identity values of the core proteins for all the compared strains and outgroups are reported. Lower sequence identity variation (90-100%) is predominantly found in core clusters containing ribosomal and translation-related proteins. For 48 core proteins (11.8%) no functional assignment could be made, and those proteins have larger sequence identity variations than other core proteins. The sequence identity of the core genome diminishes as the GSS between species decreases. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while distinguishing between 16S polytomies (unresolved nodes). 
Finally, the core genome was used to distinguish between closely related species within human oral metagenomes. Discussion. The Streptococcus genus provides a benchmark dataset for comparative genomic studies due to the breadth and depth of its genomic coverage. Comparing metagenomic shotgun fragment reads to the core genome using rapid alignment tools allows species-specific abundance estimates in metagenomic samples. Understanding genomic variability and strain relatedness is the goal of tools like GSS, which makes use of both pairwise shared core and pan-genomic homologous sequences for its calculation.


2018 ◽  
Author(s):  
Hugo R Barajas de la Torre ◽  
Miguel Romero ◽  
Shamayim Martínez-Sánchez ◽  
Luis D Alcaraz

Background. Comparative genomics between closely related bacterial strains helps to distinguish important features such as pathogenesis, antibiotic resistance, and phylogenetic structure. Streptococcus is relevant to public health and food safety, and it is well represented (>100 genomes) in publicly available databases. Streptococci are cosmopolitan, with multiple sources of isolation, from humans to dairy products. Streptococcus has been classified by morphology, serum types, the 16S rRNA gene, and Multi Locus Sequence Types (MLST). The Genomic Similarity Score (GSS) is proposed as a tool to quantify genome-level relatedness between Streptococcus genomes, and their core genome is proposed as a simplified tool to assess strain-specific abundances in metagenomic sequences. Methods. A 16S rRNA gene phylogeny was calculated for 108 strains belonging to 16 Streptococcus species, and the results were compared to a dendrogram built using the GSS with all the homologous shared information available in the genomes. Additionally, the genus core and pan-genome were calculated. The sequence identity of the core genome was analyzed, and the core genome was used as a seed to discriminate abundances between closely related strains in metagenomic samples. Results. A total of 404 proteins are shared by all 108 Streptococcus genomes; these constitute the core genome. The ranges of core identity values across all the compared strains and outgroups are reported. Lower sequence identity variation (90-100%) within the core belongs to ribosomal and translation-related proteins. Of the core genome, 48 proteins (11.8%) are annotated as hypothetical proteins, and these proteins show the largest sequence identity variations within the core. The sequence identity of the core genome diminishes as the GSS between species increases. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny, with the advantage of distinguishing between 16S polytomies (unresolved nodes). 
Finally, the proposed core genome was used to distinguish the abundances of closely related strains within human oral metagenomes, yielding strain relative abundances for healthy and caries-infected (with S. mutans) individuals. Discussion. The clinical and food safety importance of the Streptococcus genus, together with its excellent genomic coverage, provides a playground for testing multiple comparative genomic scenarios. Understanding genomic variability and strain relatedness is the goal of tools like GSS, which makes use of both pairwise shared core and pan-genomic homologous sequences for its calculation. Combining the core genome with rapid alignment tools makes it possible to estimate abundance and to discriminate in a strain-specific manner in metagenomic samples. Both the GSS genomic dendrogram and the core genome are shared with the community to explore possibilities within streptococci.


2008 ◽  
Vol 74 (13) ◽  
pp. 3969-3976 ◽  
Author(s):  
Jingrang Lu ◽  
Jorge W. Santo Domingo ◽  
Regina Lamendella ◽  
Thomas Edge ◽  
Stephen Hill

ABSTRACT In spite of increasing public health concerns about the potential risks associated with swimming in waters contaminated with waterfowl feces, little is known about the composition of the gut microbial community of aquatic birds. To address this, a gull 16S rRNA gene clone library was developed and analyzed to determine the identities of fecal bacteria. Analysis of 282 16S rRNA gene clones demonstrated that the gull gut bacterial community is mostly composed of populations closely related to Bacilli (37%), Clostridia (17%), Gammaproteobacteria (11%), and Bacteroidetes (1%). Interestingly, a considerable number of sequences (i.e., 26%) were closely related to Catellicoccus marimammalium, a gram-positive, catalase-negative bacterium. To determine the occurrence of C. marimammalium in waterfowl, species-specific 16S rRNA gene PCR and real-time assays were developed and used to test fecal DNA extracts from different bird (n = 13) and mammal (n = 26) species. The results showed that both assays were specific to gull fecal DNA and that C. marimammalium was present in gull fecal samples collected from the five locations in North America (California, Georgia, Ohio, Wisconsin, and Toronto, Canada) tested. Additionally, 48 DNA extracts from waters collected from six sites in southern California, Great Lakes in Michigan, Lake Erie in Ohio, and Lake Ontario in Canada presumed to be impacted with gull feces were positive by the C. marimammalium assay. Due to the widespread presence of this species in gulls and environmental waters contaminated with gull feces, targeting this bacterial species might be useful for detecting gull fecal contamination in waterfowl-impacted waters.


Author(s):  
Jun-Jie Ying ◽  
Zhi-Cheng Wu ◽  
Yuan-Chun Fang ◽  
Lin Xu ◽  
Cong Sun

Parvularcula flava was proposed as a novel member of the genus Parvularcula in 2016. Some time earlier, Aquisalinus flavus had been proposed as a novel species of a novel genus named Aquisalinus. When the 16S rRNA gene sequences of type strains P. flava NH6-79T and A. flavus D11M-2T were compared, they showed 97.9 % sequence identity, much higher than the sequence identities of 92.7–94.3 % between P. flava NH6-79T and type strains in the genus Parvularcula, indicating that the later proposed novel taxon Parvularcula flava needs reclassification. The phylogenetic trees based on 16S rRNA gene sequences and genome sequences both showed that P. flava NH6-79T and A. flavus D11M-2T formed a separate branch away from strains in the genera Parvularcula, Marinicaulis and Amphiplicatus. The average amino acid identity and average nucleotide identity values of P. flava NH6-79T and A. flavus D11M-2T were 87.9 and 85.0 %, respectively, much higher than the values between P. flava NH6-79T and other closely related type strains (54.3–58.1 % and 68.6–70.4 %, respectively). P. flava NH6-79T and A. flavus D11M-2T also contained summed feature 8 (C18 : 1 ω6c and/or C18 : 1 ω7c) and C16 : 0 as major fatty acids, distinguishing them from other closely related taxa. Based on the results of the phylogenetic, comparative genomic and phenotypic analyses, Parvularcula flava should be reclassified as Aquisalinus luteolus nom. nov. and the description of the genus Aquisalinus is emended.
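
As a rough illustration of what a pairwise sequence identity such as the 97.9 % figure above measures, the following Python sketch computes percent identity over two sequences that are assumed to be already aligned to equal length. The function name and the toy sequences are hypothetical, and the gap handling (columns containing a gap are skipped) is only one of several conventions; the abstract does not state which tool or convention the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are ignored (an assumed
    convention); identity = matching columns / compared columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
identity = percent_identity("ACGTACGT-A", "ACGTTCGTGA")
print(f"{identity:.1f} % identity")
```

Different gap conventions can shift the reported value slightly, so identities from different tools are not always directly comparable.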


2010 ◽  
Vol 56 (12) ◽  
pp. 1040-1049 ◽  
Author(s):  
Michal Slany ◽  
Martina Vanerkova ◽  
Eva Nemcova ◽  
Barbora Zaloudikova ◽  
Filip Ruzicka ◽  
...  

High-resolution melting analysis (HRMA) is a fast (post-PCR) high-throughput method to scan for sequence variations in a target gene. The aim of this study was to test the potential of HRMA to distinguish particular bacterial species of the Staphylococcus genus even when using a broad-range PCR within the 16S rRNA gene, where sequence differences are minimal. Genomic DNA samples isolated from 12 reference staphylococcal strains (Staphylococcus aureus, Staphylococcus capitis, Staphylococcus caprae, Staphylococcus epidermidis, Staphylococcus haemolyticus, Staphylococcus hominis, Staphylococcus intermedius, Staphylococcus saprophyticus, Staphylococcus sciuri, Staphylococcus simulans, Staphylococcus warneri, and Staphylococcus xylosus) were subjected to a real-time PCR amplification of the 16S rRNA gene in the presence of the fluorescent dye EvaGreen™, followed by HRMA. Melting profiles were used as molecular fingerprints for bacterial species differentiation. HRMA of S. saprophyticus and S. xylosus resulted in indistinguishable profiles because of their identical sequences in the analyzed 16S rRNA region. The remaining reference strains were fully differentiated either directly or via high-resolution plots obtained by heteroduplex formation between coamplified PCR products of the tested staphylococcal strain and a phylogenetically unrelated strain.


Author(s):  
Chen Zheng-li ◽  
Peng Yu ◽  
Wu Guo-sheng ◽  
Hong Xu-Dong ◽  
Fan Hao ◽  
...  

Abstract Burns destroy the skin barrier and alter the resident bacterial community, thereby facilitating bacterial infection. To treat a wound infection, it is necessary to understand the changes in the wound bacterial community structure. However, traditional bacterial cultures allow the identification of only readily growing or purposely cultured bacterial species and lack the capacity to detect changes in the bacterial community. In this study, 16S rRNA gene sequencing was used to detect alterations in the bacterial community structure in deep partial-thickness burn wounds on the back of Sprague-Dawley rats. These results were then compared with those obtained from the bacterial culture. Bacterial samples were collected prior to wounding and 1, 7, 14, and 21 days after wounding. The 16S rRNA gene sequence analysis showed that the number of resident bacterial species decreased after the burn. Both resident bacterial richness and diversity, which were significantly reduced after the burn, recovered following wound healing. The dominant resident strains also changed, but the inhibition of bacterial community structure was in a non-volatile equilibrium state, even in the early stage after healing. Furthermore, the correlation between wound and environmental bacteria increased with the occurrence of burns. Hence, the 16S rRNA gene sequence analysis reflected the bacterial condition of the wounds better than the bacterial culture. 16S rRNA sequencing in the Sprague-Dawley rat burn model can provide more information for the prevention and treatment of burn infections in clinical settings and promote further development in this field.


Plant Disease ◽  
2021 ◽  
Author(s):  
Qi Wei ◽  
Jie Li ◽  
Shuai Yang ◽  
Wenzhong Wang ◽  
Fanxiang Min ◽  
...  

Common scab (CS) caused by Streptomyces spp. is a significant soilborne potato disease that results in tremendous economic losses globally. Identification of CS-associated species of the genus Streptomyces can enhance understanding of the genetic variation of these bacterial species and is necessary for the control of this epidemic disease. The present study isolated Streptomyces strain 6-2-1(1) from scabby potatoes in Keshan County, Heilongjiang Province, China. PCR analysis confirmed that the strain harbored the characteristic Streptomyces pathogenicity island (PAI) genes (txtA, txtAB, nec1, and tomA). Pathogenicity assays proved that the strain caused typical scab lesions on potato tuber surfaces and necrosis on radish seedlings and potato slices. Subsequently, the strain was systematically characterized at morphological, physiological, biochemical and phylogenetic levels. Phylogenetic analysis based on 16S rRNA gene sequences revealed that strain 6-2-1(1) shared 99.86% sequence similarity with Streptomyces rhizophilus JR-41T, isolated initially from bamboo in rhizospheric soil in Korea. PCR amplification followed by Sanger sequencing of the 16S rRNA gene of 164 scabby potato samples collected in Heilongjiang Province from 2019 to 2020 demonstrated that approximately 2% of the tested samples were infected with S. rhizophilus. Taken together, these results demonstrate that S. rhizophilus is capable of causing potato CS disease and may pose a potential challenge to potato production in Heilongjiang Province of China.


Author(s):  
Soon Dong Lee ◽  
Yeong-Sik Byeon ◽  
Sung-Min Kim ◽  
Hong Lim Yang ◽  
In Seop Kim

The taxonomic positions of four Gram-negative bacterial strains, which were isolated from larvae of two insects in Jeju, Republic of Korea, were determined by a polyphasic approach. Strains CWB-B4, CWB-B41 and CWB-B43 were recovered from larvae of Protaetia brevitarsis seulensis, whereas strain BWR-B9T was from larvae of Allomyrina dichotoma. All the isolates grew at 10–37 °C, at pH 5.0–9.0 and in the presence of 4 % (w/v) NaCl. The 16S rRNA gene phylogeny showed that the four isolates formed two distinct sublines within the order Enterobacteriales and were closely associated with members of the genus Jinshanibacter. The first group, represented by strain CWB-B4, formed a tight cluster with Jinshanibacter xujianqingii CF-1111T (99.3 % sequence similarity), whereas strain BWR-B9T was most closely related to Jinshanibacter zhutongyuii CF-458T (99.5 % sequence similarity). The 92 core gene analysis showed that the isolates belonged to the family Budviciaceae and supported the clustering shown in the 16S rRNA gene phylogeny. The genomic DNA G+C content of the isolates was 45.2 mol%. A combination of overall genomic relatedness and phenotypic distinctness supported that the three isolates from Protaetia brevitarsis seulensis are different strains of Jinshanibacter xujianqingii, whereas the one isolate from Allomyrina dichotoma represents a new species of the genus Jinshanibacter. On the basis of the results obtained here, Jinshanibacter allomyrinae sp. nov. (type strain BWR-B9T=KACC 22153T=NBRC 114879T) and Insectihabitans xujianqingii gen. nov., comb. nov. are proposed, with emended descriptions of the genera Jinshanibacter, Limnobaculum and Pragia.

