Network analysis of ten thousand genomes sheds light on Pseudomonas diversity and classification
The growing number of sequenced bacterial genomes has revolutionized the assessment of microbial diversity. Pseudomonas is a highly diverse genus, comprising isolates associated with processes ranging from pathogenesis to biotechnological applications. However, this high diversity has led to historical taxonomic inconsistencies. Although type strains have been employed to estimate Pseudomonas diversity, they represent only a small fraction of the genomic diversity at the genus level. We used 10,035 available Pseudomonas genomes, including 210 type strains, to build a genomic distance network and estimate the number of species through community identification. We identified inconsistencies involving several type strains and found that 25.65% of the Pseudomonas genomes deposited in GenBank are misclassified. We recovered the 13 main Pseudomonas groups and propose P. alcaligenes as a new group. Finally, this work provides new insights into the phylogenetic boundaries of Pseudomonas and highlights that the diversity of the genus has hitherto been overlooked.
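The general idea of a genomic distance network can be illustrated with a minimal sketch. It is not the pipeline used in this work: the abstract does not state the distance metric, the cutoff, or the community identification algorithm, so everything below is an assumption made for illustration. Genome identifiers, distances, labels, and the 0.05 cutoff are invented toy values; connected components stand in for a real community detection method; and the helper names `build_network`, `connected_components`, and `flag_mislabeled` are hypothetical.

```python
from collections import defaultdict


def build_network(distances, cutoff):
    """Link two genomes when their pairwise distance is at or below the cutoff."""
    graph = defaultdict(set)
    for (a, b), d in distances.items():
        graph[a]  # touch both keys so unlinked genomes still appear as nodes
        graph[b]
        if d <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group genomes by connectivity (a simple stand-in for community detection)."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        members = []
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(members))
    return groups


def flag_mislabeled(groups, labels, type_strains):
    """Flag genomes whose label differs from the type strain in their group."""
    flagged = []
    for members in groups:
        type_labels = {labels[m] for m in members if m in type_strains}
        if len(type_labels) != 1:
            continue  # no type strain, or conflicting ones: cannot decide
        expected = next(iter(type_labels))
        flagged.extend(m for m in members if labels[m] != expected)
    return flagged


# Toy pairwise distances (assumed values, not taken from the study).
distances = {
    ("g1", "g2"): 0.02, ("g2", "g3"): 0.03, ("g1", "g3"): 0.04,
    ("g4", "g5"): 0.01, ("g3", "g4"): 0.20, ("g1", "g6"): 0.30,
}
# Toy species labels and type strain set (also assumed).
labels = {
    "g1": "P. aeruginosa", "g2": "P. aeruginosa", "g3": "P. putida",
    "g4": "P. putida", "g5": "P. putida", "g6": "P. fluorescens",
}
type_strains = {"g1", "g4"}

graph = build_network(distances, cutoff=0.05)  # 0.05 is an illustrative cutoff
groups = connected_components(graph)
flagged = flag_mislabeled(groups, labels, type_strains)

print(groups)   # [['g1', 'g2', 'g3'], ['g4', 'g5'], ['g6']]
print(flagged)  # ['g3']
```

In this toy setting, each group approximates one candidate species, and a genome is flagged when its deposited label disagrees with the single type strain found in its group. Groups without a type strain are left undecided, which is one way such an analysis could point to species that have no named representative.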