DNA–DNA hybridization values and their relationship to whole-genome sequence similarities

DNA–DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA–DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of ‘species’. Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
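The ANI cut-off reported above can be illustrated with a minimal sketch. It assumes ANI is taken as the unweighted mean of per-gene percent identities over the genes two genomes share; real ANI pipelines derive those identities from sequence alignments, which are not reproduced here. The function names and the example identity values are hypothetical.

```python
from statistics import mean

# ANI value (%) that the abstract reports as corresponding to 70 % DDH.
ANI_CUTOFF = 95.0


def average_nucleotide_identity(identities):
    """Unweighted mean percent identity over the genes shared by two genomes."""
    if not identities:
        raise ValueError("no shared genes to average over")
    return mean(identities)


def same_species(identities, cutoff=ANI_CUTOFF):
    """Return True when the pair's ANI reaches the species cut-off."""
    return average_nucleotide_identity(identities) >= cutoff


# Hypothetical per-gene identities (%), for illustration only.
pair_a = [98.2, 97.5, 99.1, 96.8]
pair_b = [88.0, 91.5, 86.2, 90.3]

print(same_species(pair_a), same_species(pair_b))
```

With these made-up values the first pair falls above the 95 % cut-off and the second below it.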

Comparative genome analysis and phylogenomic of Xanthomonas citri pv. viticola lead to new taxonomic insights about species of Xanthomonas

10.21203/rs.2.22059/v1 ◽

2020 ◽

Author(s):

Antonio Roberto Gomes de Farias ◽

Wilson José da Silva Junior ◽

José Bandeira do Nascimento Junior ◽

Valdir de Queiroz Balbino ◽

Ana Maria Benko-Iseppon ◽

...

Keyword(s):

Genome Sequence ◽

DNA Hybridization ◽

In Silico ◽

Whole Genome Sequence ◽

Nucleotide Identity ◽

Whole Genome ◽

Average Nucleotide Identity ◽

Xanthomonas Citri ◽

Genome Sequences ◽

Xanthomonas Species

Abstract Background Xanthomonas citri pv. viticola causes one of the most critical grapevine diseases in the Northeast of Brazil, presenting a high risk to Brazilian and worldwide areas of grape production. The epithet X. citri pv. viticola was recently proposed to replace X. campestris pv. viticola based on multilocus sequence analysis and whole-genome sequences. In addition, genomics has revolutionized the field of bacteriology by combining genome sequencing with comparative in silico analyses such as DNA-DNA hybridization, average nucleotide identity, distance between genomes, pan-genomic approaches, and phylogenomics, providing valuable insights into virulence factors and helping to clarify the taxonomic relationships of Xanthomonas and other prokaryotic species. Results We used the whole-genome sequences of three Brazilian strains and the pathotype to characterize X. citri pv. viticola accessions, plus 124 whole-genome sequences of Xanthomonas species available in NCBI, comprising 13 species and 15 pathovars. The whole-genome sequence structure of X. citri pv. viticola was shown to be highly conserved relative to other X. citri species. Pan-genomic approaches, average nucleotide identity analysis, and in silico DNA-DNA hybridization were carried out, allowing characterization of X. citri pv. viticola and inferences on the phylogenetic relationships within Xanthomonas. In all approaches used, analysis of the 128 genome sequences clustered the Xanthomonas strains into eight main groups, consistent with the recently proposed classification. The analysis also revealed that X. hortorum and X. gardneri should be classified as a single species, and that strain 17 of X. campestris and strain XC01 of X. citri pv. mangiferaeindicae, both widely described in the literature, are misclassified. Conclusions We performed the genomic characterization of three representative Brazilian strains of X. citri pv. viticola (Xcv). The genomic approaches based on the pan-genome, average nucleotide identity, and in silico DNA-DNA hybridization support the proposed taxonomic position of X. citri pv. viticola and of the recently proposed Xanthomonas species and pathovars. In addition, we resolved the species assignment of misclassified Xanthomonas strains that have been extensively studied in the literature.

Amycolatopsis camponoti sp. nov., New Tetracenomycin-Producing Actinomycete Isolated from Carpenter Ant Camponotus vagus

10.21203/rs.3.rs-1081009/v1 ◽

2021 ◽

Author(s):

Yuliya V Zakalyukina ◽

Ilya A Osterman ◽

Jacqueline Wolf ◽

Meina Neumann-Schaal ◽

Imen Nouioui ◽

...

Keyword(s):

DNA Hybridization ◽

Novel Species ◽

Phylogenetic Analyses ◽

Bacterial Species ◽

Morphological Characteristics ◽

rRNA Gene ◽

Species Demarcation ◽

Genome Sequences ◽

Cellular Fatty Acids ◽

Carpenter Ant

Abstract An actinobacterial strain, A23T, isolated from an adult ant Camponotus vagus collected in the Ryazan region (Russia) and established as a tetracenomycin X producer, was subjected to a polyphasic taxonomic study. Morphological characteristics of this strain included well-branched substrate mycelium and aerial hyphae fragmented into rod-shaped elements. Phylogenetic analyses based on 16S rRNA gene and genome sequences showed that strain A23T was most closely related to Amycolatopsis pretoriensis DSM 44654T (99.9%). Digital DNA–DNA hybridization and average nucleotide identity values between the genome sequences of isolate A23T and its closest relative, Amycolatopsis pretoriensis DSM 44654T, were 39.5% and 88.6%, respectively, below the 70% and 95-96% cut-off points recommended for bacterial species demarcation. The genome size of isolate A23T is 10,560,374 bp with a DNA G+C content of 71.2 mol%. The whole-organism hydrolysates contain arabinose and galactose as the main diagnostic sugars, as well as ribose and rhamnose. The strain contained MK-9(H4) as the predominant menaquinone and iso-C16:0, iso-C15:0, anteiso-C17:0 and C16:0 as the major cellular fatty acids. Based on the phenotypic, genomic and phylogenetic data, isolate A23T represents a novel species of the genus Amycolatopsis, for which the name Amycolatopsis camponoti sp. nov. is proposed, and the type strain is A23T (=DSM 111725T =VKM 2882T).

Genomic Characterization Provides an Insight into the Pathogenicity of the Poplar Canker Bacterium Lonsdalea populi

Genes ◽

10.3390/genes12020246 ◽

2021 ◽

Vol 12 (2) ◽

pp. 246

Author(s):

Xiaomeng Chen ◽

Rui Li ◽

Yonglin Wang ◽

Aining Li

Keyword(s):

Genome Sequence ◽

Extracellular Enzymes ◽

De Novo ◽

Whole Genome Sequence ◽

Hybrid Poplars ◽

A Genome ◽

Conserved Genes ◽

Genomic Characterization ◽

Molecular Bases ◽

Insight Into

An emerging poplar canker caused by the gram-negative bacterium Lonsdalea populi has led to high mortality of hybrid poplars (Populus × euramericana) in China and Europe. The molecular bases of pathogenicity and bark adaptation of L. populi have become a focus of recent research. This study determined the whole-genome sequence of L. populi and identified its putative virulence factors. A high-quality L. populi genome sequence was assembled de novo, with a genome size of 3,859,707 bp, containing approximately 3434 genes and 107 RNAs (75 tRNA, 22 rRNA, and 10 ncRNA). The L. populi genome contained 380 virulence-associated genes, mainly encoding adhesion, extracellular enzymes, secretory systems, and two-component transduction systems. The genome had 110 carbohydrate-active enzyme (CAZy)-coding genes and putative secreted proteins. Annotation against an antibiotic-resistance database indicated that L. populi was resistant to penicillin, fluoroquinolone, and kasugamycin. Comparative genomic analysis found that L. populi exhibited the highest homology with the L. britannica genome and that L. populi encompassed 1905 specific genes, 1769 dispensable genes, and 1381 conserved genes, suggesting high evolutionary diversity and genomic plasticity. Moreover, the pan-genome analysis revealed that the N-5-1 genome is an open genome. These findings provide important resources for understanding the molecular basis of the pathogenicity and biology of L. populi and the poplar-bacterium interaction.
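The split into conserved, dispensable, and specific genes mentioned above can be sketched as follows. This is an illustrative simplification, not the study's pipeline: it assumes each genome has already been reduced to a set of gene-family identifiers, and the strain names and families below are hypothetical.

```python
from collections import Counter


def partition_pan_genome(genomes):
    """Split gene families into conserved, dispensable, and specific sets.

    genomes maps a strain name to the set of gene-family ids it carries.
    At least two genomes are needed for the three classes to be distinct.
    """
    if len(genomes) < 2:
        raise ValueError("need at least two genomes to partition a pan-genome")
    # Number of genomes in which each gene family occurs.
    counts = Counter(fam for families in genomes.values() for fam in families)
    n = len(genomes)
    conserved = {fam for fam, c in counts.items() if c == n}
    specific = {fam for fam, c in counts.items() if c == 1}
    dispensable = set(counts) - conserved - specific
    return conserved, dispensable, specific


# Hypothetical gene-family content of three strains, for illustration only.
example = {
    "strain1": {"a", "b", "c"},
    "strain2": {"a", "b", "d"},
    "strain3": {"a", "e"},
}
conserved, dispensable, specific = partition_pan_genome(example)
print(sorted(conserved), sorted(dispensable), sorted(specific))
```

Here a family counts as conserved when it occurs in every genome, specific when it occurs in exactly one, and dispensable otherwise.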

Aggregatimonas sangjinii gen. nov., sp. nov., a Novel Silver Nanoparticle-Synthesizing Bacterium Belonging to the Family Flavobacteriaceae

10.21203/rs.3.rs-429103/v1 ◽

2021 ◽

Author(s):

Dawoon Chung ◽

Jaoon Young Hwan Kim ◽

Kyung Woo Kim ◽

Yong Min Kwon

Keyword(s):

16S rRNA ◽

Genome Sequence ◽

Sequence Similarity ◽

Whole Genome Sequence ◽

rRNA Gene ◽

Aerobic Bacterium ◽

16S rRNA Sequence ◽

Respiratory Quinone ◽

The Family ◽

The Media

Abstract A gram-negative, orange-pigmented, non-flagellated, gliding, rod-shaped, aerobic bacterium, designated strain F202Z8T, was isolated from a rusty iron plate found in the intertidal region of Taean, South Korea. Notably, this strain synthesized silver nanoparticles (AgNPs), and 17 putative genes responsible for the synthesis of AgNPs were found in its genome. The complete genome sequence of strain F202Z8T is 4,723,614 bp long, with a G + C content of 43.26%. Phylogenetic analysis based on the 16S rRNA gene sequence revealed that strain F202Z8T forms a distinct lineage with the closely related genera Maribacter, Pelagihabitans, Pseudozobellia, Zobellia, Pricia, and Costertonia belonging to the family Flavobacteriaceae. The 16S rRNA sequence similarity was < 94.5%. The digital DNA–DNA hybridization and average nucleotide identity values calculated from whole-genome sequence comparisons between strain F202Z8T and other members of the family Flavobacteriaceae were in the ranges of 12.7–16.9% and 70.3–74.4%, respectively. Growth was observed at 15–33°C (optimally at 30°C), at pH 6.5–7.5 (optimally at pH 7.0), and with the addition of 2.5–4.5% (w/v) NaCl to the media (optimally at 4.0%). The predominant cellular fatty acids were iso-C15:0, iso-C15:1 G, and iso-C17:0 3-OH; the major respiratory quinone was MK-6. Polar lipids included phosphatidylethanolamine, five unidentified lipids, and two unidentified aminolipids. Our polyphasic taxonomic results suggested that this strain represents a novel species of a novel genus in the family Flavobacteriaceae, for which the name Aggregatimonas sangjinii gen. nov., sp. nov. is proposed. The type strain of Aggregatimonas sangjinii is F202Z8T (= KCCM 43411T = LMG 31494T).
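A G + C content such as the one quoted above is the percentage of G and C bases among the unambiguous bases of a sequence. The sketch below is a minimal illustration under that definition; the function name and the short test sequences are hypothetical, and ambiguous symbols such as N are simply ignored.

```python
def gc_content(sequence):
    """Percentage of G and C among the A, C, G, and T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Short made-up sequences, for illustration only.
print(gc_content("ATGC"))
print(round(gc_content("ggccat"), 2))
```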

Analysis of the complete genome sequence of a marine-derived strain Streptomyces sp. S063 CGMCC 14582 reveals its biosynthetic potential to produce novel anti-complement agents and peptides

PeerJ ◽

10.7717/peerj.6122 ◽

2019 ◽

Vol 7 ◽

pp. e6122 ◽

Cited By ~ 2

Author(s):

Liang-Yu Chen ◽

Hao-Tian Cui ◽

Chun Su ◽

Feng-Wu Bai ◽

Xin-Qing Zhao

Keyword(s):

Bioactive Compounds ◽

Genome Sequence ◽

Complete Genome Sequence ◽

DNA Hybridization ◽

Complete Genome ◽

System Analysis ◽

Genome Mining ◽

Biosynthetic Gene Cluster ◽

Genome Comparison ◽

Genome Sequences

Genome sequences of marine streptomycetes are valuable for the discovery of useful enzymes and bioactive compounds by genome mining. However, publicly available complete genome sequences of marine streptomycetes are still limited. Here, we present the complete genome sequence of a marine streptomycete, Streptomyces sp. S063 CGMCC 14582. Species delineation based on pairwise digital DNA-DNA hybridization and average nucleotide identity (ANI) values from genome comparison showed that Streptomyces sp. S063 CGMCC 14582 possesses a unique genome that is clearly different from all other available genomes. Bioactivity tests showed that Streptomyces sp. S063 CGMCC 14582 produces metabolites with anti-complement activities, which are useful for the treatment of numerous diseases that arise from inappropriate activation of the human complement system. Genome analysis detected no biosynthetic gene cluster (BGC) showing even low similarity to those of the known anti-complement agents, indicating that Streptomyces sp. S063 CGMCC 14582 may produce novel anti-complement agents of microbial origin. Four BGCs potentially involved in the biosynthesis of non-ribosomal peptides were disrupted, but no decrease in anti-complement activity was observed, suggesting that these four BGCs are not involved in the biosynthesis of the anti-complement agents. In addition, LC-MS/MS analysis and subsequent alignment through the Global Natural Products Social Molecular Networking (GNPS) platform led to the detection of novel peptides produced by the strain. Streptomyces sp. S063 CGMCC 14582 grows rapidly and is salt tolerant, which benefits efficient secondary metabolite production via seawater-based fermentation. Our results indicate that Streptomyces sp. S063 has great potential to produce novel bioactive compounds and is also a good host for heterologous production of useful secondary metabolites for drug discovery.

Whole-proteome tree of life suggests a deep burst of organism diversity

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1915766117 ◽

2020 ◽

Vol 117 (7) ◽

pp. 3678-3686 ◽

Cited By ~ 5

Author(s):

JaeJin Choi ◽

Sung-Hou Kim

Keyword(s):

Information Theory ◽

Genome Sequence ◽

Tree Of Life ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequences ◽

Alignment Free ◽

Whole Transcriptome ◽

Evolutionary Progression ◽

Feature Frequency

An organism tree of life (organism ToL) is a conceptual and metaphorical tree to capture a simplified narrative of the evolutionary course and kinship among the extant organisms. Such a tree cannot be experimentally validated but may be reconstructed based on characteristics associated with the organisms. Since the whole-genome sequence of an organism is, at present, the most comprehensive descriptor of the organism, a whole-genome sequence-based ToL can be an empirically derivable surrogate for the organism ToL. However, experimentally determining the whole-genome sequences of many diverse organisms was practically impossible until recently. We have constructed three types of ToLs for diversely sampled organisms using the sequences of whole genome, of whole transcriptome, and of whole proteome. Of the three, whole-proteome sequence-based ToL (whole-proteome ToL), constructed by applying information theory-based feature frequency profile method, an “alignment-free” method, gave the most topologically stable ToL. Here, we describe the main features of a whole-proteome ToL for 4,023 species with known complete or almost complete genome sequences on grouping and kinship among the groups at deep evolutionary levels. The ToL reveals 1) all extant organisms of this study can be grouped into 2 “Supergroups,” 6 “Major Groups,” or 35+ “Groups”; 2) the order of emergence of the “founders” of all of the groups may be assigned on an evolutionary progression scale; 3) all of the founders of the groups have emerged in a “deep burst” at the very beginning period near the root of the ToL—an explosive birth of life’s diversity.
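The idea behind an alignment-free feature frequency profile can be sketched briefly. The example below assumes that features are overlapping k-mers and that two profiles are compared with the Jensen-Shannon divergence; the choice of divergence, the value of k, and the toy sequences are assumptions made for illustration and are not taken from the abstract.

```python
from collections import Counter
from math import log2


def feature_frequency_profile(sequence, k):
    """Relative frequencies of all overlapping k-mers in a sequence."""
    if len(sequence) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {feature: c / total for feature, c in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles."""
    divergence = 0.0
    for feature in set(p) | set(q):
        pf, qf = p.get(feature, 0.0), q.get(feature, 0.0)
        m = (pf + qf) / 2
        # Terms with zero frequency contribute nothing to the sum.
        if pf > 0:
            divergence += 0.5 * pf * log2(pf / m)
        if qf > 0:
            divergence += 0.5 * qf * log2(qf / m)
    return divergence


# Toy sequences, for illustration only.
profile_a = feature_frequency_profile("AAAA", 2)
profile_b = feature_frequency_profile("CCCC", 2)
print(jensen_shannon_divergence(profile_a, profile_a))
print(jensen_shannon_divergence(profile_a, profile_b))
```

Identical profiles give a divergence of 0 and profiles with no shared features give 1, so a matrix of pairwise divergences can serve as input to a distance-based tree-building method.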

Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection

BMC Bioinformatics ◽

10.1186/s12859-019-3054-4 ◽

2019 ◽

Vol 20 (S15) ◽

Cited By ~ 2

Author(s):

Jinhong Shi ◽

Yan Yan ◽

Matthew G. Links ◽

Longhai Li ◽

Jo-Anne R. Dillon ◽

...

Keyword(s):

Feature Selection ◽

Antimicrobial Resistance ◽

Genome Sequence ◽

Sequence Data ◽

Bacterial Species ◽

Clinical Diagnostics ◽

New Drugs ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Abstract Background Antimicrobial resistance (AMR) is a major threat to global public health because it makes standard treatments ineffective and contributes to the spread of infections. It is important to understand AMR’s biological mechanisms for the development of new drugs and more rapid and accurate clinical diagnostics. The increasing availability of whole-genome SNP (single nucleotide polymorphism) information, obtained from whole-genome sequence data, along with AMR profiles provides an opportunity to use feature selection in machine learning to find AMR-associated mutations. This work describes the use of a supervised feature selection approach using deep neural networks to detect AMR-associated genetic factors from whole-genome SNP data. Results The proposed method, DNP-AAP (deep neural pursuit – average activation potential), was tested on a Neisseria gonorrhoeae dataset with paired whole-genome sequence data and resistance profiles to five commonly used antibiotics including penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime. The results show that DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae, and also provide a list of candidate genomic features (SNPs) that might lead to the discovery of novel AMR determinants. Logistic regression classifiers were built with the identified SNPs and the prediction AUCs (area under the curve) for penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime were 0.974, 0.969, 0.949, 0.994, and 0.976, respectively. Conclusions DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae. It also provides a list of candidate genes and intergenic regions that might lead to novel AMR factor discovery. More generally, DNP-AAP can be applied to AMR analysis of any bacterial species with genomic variants and phenotype data. It can serve as a useful screening tool for microbiologists to generate genetic candidates for further lab experiments.
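The AUC figures quoted above summarize how well a classifier's scores rank resistant isolates above susceptible ones. A minimal sketch of that quantity follows, computed as the fraction of positive-negative pairs ranked correctly, with ties counted as half. It does not reproduce DNP-AAP or the logistic regression models; the labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1/0) and predicted scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count positive-negative pairs where the positive is scored higher;
    # a tie counts as half a correctly ranked pair.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical resistance labels and classifier scores, for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of isolates, which is acceptable for a small illustration; library implementations use sorting instead.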

Genome Sequence of the Psychrophilic Bacterium Tenacibaculum ovolyticum Strain da5A-8 Isolated from Deep Seawater

Genome Announcements ◽

10.1128/genomea.00644-16 ◽

2016 ◽

Vol 4 (3) ◽

Cited By ~ 6

Author(s):

Maki Teramoto ◽

Zhenyu Zhai ◽

Ayumi Komatsu ◽

Keigo Shibayama ◽

Masato Suzuki

Keyword(s):

Genome Sequence ◽

Gene Sequence ◽

Virulence Genes ◽

Bacterial Species ◽

rRNA Gene ◽

Psychrophilic Bacterium ◽

Fish Pathogen ◽

Deep Seawater ◽

Fish Pathogens ◽

The 16S rRNA Gene

Some bacterial species of the genus Tenacibaculum, including Tenacibaculum ovolyticum, are known as marine fish pathogens. So far, the only published genome sequence for this genus is that of Tenacibaculum dicentrarchi, which could also be a fish pathogen. Strain da5A-8, showing 100% identity to the 16S rRNA gene sequence of T. ovolyticum DSM 18103T, was isolated from seawater at a depth of 344 m in Kochi, Japan, and grew optimally at 10 to 20°C. The genome sequence of strain da5A-8 revealed possible virulence genes commonly observed in the genus Tenacibaculum.

Whole-Genome Sequence and Classification of 11 Endophytic Bacteria from Poison Ivy (Toxicodendron radicans)

Genome Announcements ◽

10.1128/genomea.01319-15 ◽

2015 ◽

Vol 3 (6) ◽

Cited By ~ 5

Author(s):

Phuong N. Tran ◽

Nicholas E. H. Tan ◽

Yin Peng Lee ◽

Han Ming Gan ◽

Steven J. Polter ◽

...

Keyword(s):

Genome Sequence ◽

Endophytic Bacteria ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequences ◽

Poison Ivy

Here, we report the whole-genome sequences and annotation of 11 endophytic bacteria from poison ivy (Toxicodendron radicans) vine tissue. Five bacteria belong to the genus Pseudomonas, and six single members of other genera were also found in the interior vine tissue of poison ivy.

Mucilaginibacter polysacchareus sp. nov., an exopolysaccharide-producing bacterial species isolated from the rhizoplane of the herb Angelica sinensis

International Journal of Systematic and Evolutionary Microbiology ◽

10.1099/ijs.0.029793-0 ◽

2012 ◽

Vol 62 (Pt_3) ◽

pp. 632-637 ◽

Cited By ~ 29

Author(s):

Song-Ih Han ◽

Hyo-Jin Lee ◽

Hae-Ran Lee ◽

Ki-Kwang Kim ◽

Kyung-Sook Whang

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

DNA Hybridization ◽

Gene Sequence ◽

Novel Species ◽

Bacterial Species ◽

Angelica Sinensis ◽

rRNA Gene ◽

The 16S rRNA Gene ◽

Sequence Similarities

Three exopolysaccharide-producing bacteria, designated strains DRP28T, DRP29 and DRP31, were isolated from the rhizoplane of Angelica sinensis from Geumsan, Republic of Korea. Cells were straight rods, Gram reaction-negative, aerobic, non-motile, and catalase- and oxidase-positive. Flexirubin-type pigments were absent. Phylogenetic analysis of the 16S rRNA gene indicated that these bacteria belong to the genus Mucilaginibacter in the phylum Bacteroidetes. 16S rRNA gene sequence similarities to strains of recognized species of the genus Mucilaginibacter were 93.8–97.4 %. The major fatty acids were iso-C15 : 0 and summed feature 3 (C16 : 1ω7c and/or iso-C15 : 0 2-OH). The strains contained MK-7 as the major isoprenoid quinone. Strains DRP28T, DRP29 and DRP31 formed a single, distinct genomospecies with DNA G+C contents of 41.9–42.7 mol% and DNA–DNA hybridization values of 82.6–86.8 % among themselves; the strains exhibited DNA–DNA hybridization values of only 20.4–41.3 % with related species of the genus Mucilaginibacter. On the basis of the evidence presented in this study, strains DRP28T, DRP29 and DRP31 were considered to represent a novel species of the genus Mucilaginibacter, for which the name Mucilaginibacter polysacchareus sp. nov. is proposed. The type strain is DRP28T ( = KACC 15075T = NBRC 107757T).
