Analysis of the core genome and pangenome of Clostridium butyricum

Genome ◽  
2021 ◽  
Vol 64 (1) ◽  
pp. 51-61
Author(s):  
Wei Zou ◽  
Guangbin Ye ◽  
Kaizheng Zhang ◽  
Haiquan Yang ◽  
Jiangang Yang

Clostridium butyricum is an anaerobic bacterium that inhabits broad niches. Clostridium butyricum is known for its production of butyrate, 1,3-propanediol, and hydrogen. This study aimed to present a comparative pangenome analysis of 24 strains isolated from different niches. We sequenced and annotated the genome of C. butyricum 3-3 isolated from the Chinese baijiu ecosystem. The pangenome of C. butyricum was open. The core genome, accessory genome, and strain-specific genes comprised 1011, 4543, and 1473 genes, respectively. In the core genome, Carbohydrate metabolism was the largest category, and genes in the biosynthetic pathway of butyrate and glycerol metabolism were conserved (in the core or soft-core genome). Furthermore, the 1,3-propanediol operon existed in 20 strains. In the accessory genome, numerous mobile genetic elements belonging to the Replication, recombination, and repair (L) category were identified. In addition, genome islands were identified in all 24 strains, ranging from 2 (strain KNU-L09) to 53 (strain SU1), and phage sequences were found in 17 of the 24 strains. This study provides an important genomic framework that could pave the way for the exploration of C. butyricum and future studies on the genetic diversification of C. butyricum.

2021 ◽  
Vol 7 (9) ◽  
Author(s):  
Rebecca J. Hall ◽  
Fiona J. Whelan ◽  
Elizabeth A. Cummins ◽  
Christopher Connor ◽  
Alan McNally ◽  
...  

The pangenome contains all genes encoded by a species, with the core genome present in all strains and the accessory genome in only a subset. Coincident gene relationships are expected within the accessory genome, where the presence or absence of one gene is influenced by the presence or absence of another. Here, we analysed the accessory genome of an Escherichia coli pangenome consisting of 400 genomes from 20 sequence types to identify genes that display significant co-occurrence or avoidance patterns with one another. We present a complex network of genes that are either found together or that avoid one another more often than would be expected by chance, and show that these relationships vary by lineage. We demonstrate that genes co-occur by function, and that several highly connected gene relationships are linked to mobile genetic elements. We find that genes are more likely to co-occur with, rather than avoid, another gene in the accessory genome. This work furthers our understanding of the dynamic nature of prokaryote pangenomes and implicates both function and mobility as drivers of gene relationships.
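The core/accessory split described in this abstract can be sketched from a gene presence/absence table. This is a minimal illustration only, not the authors' pipeline: the function name `split_core_accessory` and the input layout (gene family mapped to the set of strains carrying it) are assumptions made for the example.

```python
from typing import Dict, Iterable, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]], strains: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Partition gene families into core and accessory sets.

    presence: hypothetical mapping of gene family -> strains that carry it.
    strains:  every strain in the pangenome.
    """
    all_strains = set(strains)
    core: Set[str] = set()
    accessory: Set[str] = set()
    for gene, carriers in presence.items():
        carriers = carriers & all_strains  # ignore strains outside the set
        if not carriers:
            continue  # gene absent from every strain considered
        if carriers == all_strains:
            core.add(gene)  # present in all strains
        else:
            accessory.add(gene)  # present in only a subset
    return core, accessory


if __name__ == "__main__":
    table = {
        "geneA": {"s1", "s2", "s3"},
        "geneB": {"s1"},
        "geneC": {"s2", "s3"},
    }
    core, accessory = split_core_accessory(table, ["s1", "s2", "s3"])
    print(sorted(core), sorted(accessory))
```

Real pangenome tools usually relax the "all strains" rule (for example with a soft-core threshold) to tolerate assembly gaps; the strict rule is used here only because it matches the definition given in the abstract.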


2021 ◽  
Author(s):  
Rebecca J Hall ◽  
Fiona J Whelan ◽  
Elizabeth A Cummins ◽  
Christopher Connor ◽  
Alan McNally ◽  
...  

The pangenome contains all genes encoded by a species, with the core genome present in all strains and the accessory genome in only a subset. Coincident gene relationships are expected within the accessory genome, where the presence or absence of one gene is influenced by the presence or absence of another. Here, we analysed the accessory genome of an Escherichia coli pangenome consisting of 400 genomes from 20 sequence types to identify genes that display significant co-occurrence or avoidance patterns with one another. We present a complex network of genes that are either found together or that avoid one another more often than would be expected by chance, and show that these relationships vary by lineage. We demonstrate that genes co-occur by function, and that several highly connected gene relationships are linked to mobile genetic elements. We find that genes are more likely to co-occur with, rather than avoid, another gene, suggesting that cooperation is more common than conflict in the accessory genome. This work furthers our understanding of the dynamic nature of prokaryote pangenomes and implicates both function and mobility as drivers of gene relationships.


Author(s):  
Jorge A. Moura de Sousa ◽  
Eduardo P. C. Rocha

Bacteriophages (phages) are bacterial parasites that can themselves be parasitized by phage satellites. The molecular mechanisms used by satellites to hijack phages are sometimes understood in great detail, but the origins, abundance, distribution and composition of these elements are poorly known. Here, we show that P4-like elements are present in more than 30% of the genomes of Enterobacterales, and in almost half of those of Escherichia coli, sometimes in multiple distinct copies. We identified over 1000 P4-like elements with very conserved genetic organization of the core genome and a few hotspots with highly variable genes. These elements are never found in plasmids and have very little homology to known phages, suggesting an independent evolutionary origin. Instead, they are scattered across chromosomes, possibly because their integrases are often exchanged with other elements. The rooted phylogenies of hijacking functions are correlated and suggest longstanding coevolution. They also reveal broad host ranges in P4-like elements, as almost identical elements can be found in distinct bacterial genera. Our results show that P4-like phage satellites constitute a very distinct, widespread and ancient family of mobile genetic elements. They pave the way for studying the molecular evolution of antagonistic interactions between phages and their satellites. This article is part of the theme issue ‘The secret lives of microbial mobile genetic elements’.


2020 ◽  
Vol 44 (6) ◽  
pp. 740-762
Author(s):  
Changhan Lee ◽  
Jens Klockgether ◽  
Sebastian Fischer ◽  
Janja Trcek ◽  
Burkhard Tümmler ◽  
...  

ABSTRACT The environmental species Pseudomonas aeruginosa thrives in a variety of habitats. Within the epidemic population structure of P. aeruginosa, highly successful clones that are equally capable of succeeding in the environment and in the human host occasionally arise. Framed by a highly conserved core genome, individual members of successful clones are characterized by high variability in their accessory genome. The abundance of successful clones might be founded in specific features of the core genome or, although the two are not mutually exclusive, in the variability of the accessory genome. In clone C, one of the most predominant clones, the plasmid pKLC102 and the PACGI-1 genomic island are two ubiquitous accessory genetic elements. The conserved transmissible locus of protein quality control (TLPQC) at the border of PACGI-1 is a unique horizontally transferred compository element, which codes predominantly for stress-related cargo gene products such as those involved in protein homeostasis. As a hallmark, most TLPQC xenologues possess a core genome equivalent. With elevated temperature tolerance as a characteristic of clone C strains, the unique P. aeruginosa and clone C specific disaggregase ClpG is a major contributor to tolerance. As other successful clones, such as PA14, do not encode the TLPQC locus, ubiquitous denominators of success, if existing, need to be identified.


2021 ◽  
Author(s):  
Jorge Moura de Sousa ◽  
Eduardo P. C. Rocha

Bacteriophages (phages) are bacterial parasites that can themselves be parasitized by phage satellites. The molecular mechanisms used by satellites to hijack phages are sometimes understood in great detail, but the origins, abundance, distribution, and composition of these elements are poorly known. Here, we show that P4-like elements are present in more than 30% of the genomes of Enterobacterales, and in almost half of those of Escherichia coli, sometimes in multiple distinct copies. We identified over 1000 P4-like elements with very conserved genetic organization of the core genome and a few hotspots with highly variable genes. These elements are never found in plasmids and have very little homology to known phages, suggesting an independent evolutionary origin. Instead, they are scattered across chromosomes, possibly because their integrases are often exchanged with other elements. The rooted phylogenies of hijacking functions are correlated and suggest longstanding co-evolution. They also reveal broad host ranges in P4-like elements, since almost identical elements can be found in distinct bacterial genera. Our results show that P4-like phage satellites constitute a very distinct, widespread and ancient family of mobile genetic elements. They pave the way for studying the molecular evolution of antagonistic interactions between phages and their satellites.


2017 ◽  
Author(s):  
Khalil Abudahab ◽  
Joaquín M. Prada ◽  
Zhirong Yang ◽  
Stephen D. Bentley ◽  
Nicholas J. Croucher ◽  
...  

ABSTRACT The standard workhorse for genomic analysis of the evolution of bacterial populations is phylogenetic modelling of mutations in the core genome. However, in the current era of population genomics, a notable amount of information about evolutionary and transmission processes in diverse populations can be lost unless the accessory genome is also taken into consideration. Here we introduce PANINI, a computationally scalable method for identifying the neighbours for each isolate in a data set using unsupervised machine learning with stochastic neighbour embedding. PANINI is browser-based and integrates with the Microreact platform for rapid online visualisation and exploration of both core and accessory genome evolutionary signals together with relevant epidemiological, geographic, temporal and other metadata. Several case studies with single- and multi-clone pneumococcal populations are presented to demonstrate its ability to identify biologically important signals from gene content data. PANINI is available at http://panini.wgsa.net/ and its code at http://gitlab.com/cgps/panini


2021 ◽  
Author(s):  
Guillermo Uceda-Campos ◽  
Oseias R. Feitosa-Junior ◽  
Caio R.N. Santiago ◽  
Paulo M. Pierry ◽  
Paulo A. Zaini ◽  
...  

The Gram-negative bacterium Xylella fastidiosa colonizes plant xylem vessels and is obligately vectored by xylem sap-feeding hemipteran insects. X. fastidiosa causes diseases in many plant species but in a variety of its plant hosts this bacterium behaves as a commensal endophyte. Originally confined to the Americas, infecting mainly grapevine, citrus and coffee plants, X. fastidiosa has spread to several plant species in Europe, causing devastating crop diseases. Although many pathogenicity and virulence factors have been identified in X. fastidiosa which enable the bacterium to successfully establish in the xylem tissue, the mechanisms by which distinct X. fastidiosa strains colonize and cause disease in specific plant hosts have not been fully elucidated. Here we present comparative analyses of 94 publicly available whole-genome sequences of X. fastidiosa strains with the goal of providing insights into plant host specificity determinants for this phytopathogen as well as of expanding the knowledge of its mobile genetic elements (MGE) content, mainly prophages. Our results revealed a pangenome of 4,549 protein coding sequences (CDSs) which is still open. The core and accessory genomes comprise 954 and 2,219 CDSs, respectively. Phylogenetic tree construction using all core genome CDSs grouped the strains in three major clades of subspecies fastidiosa, multiplex and pauca, with subclades related to the strains' sequence type (ST) obtained from multi-locus sequence typing (MLST). The geographic region where the strains were collected showed a stronger association with the clades of X. fastidiosa strains than did the plant species from which they were isolated. Among the CDSs related to virulence and pathogenicity found in the core genome, those related to lipopolysaccharide (LPS) synthesis and trimeric autotransporter adhesins (TAA) are somewhat related to the plant host of a given strain according to phylogenetic inference. The X. fastidiosa accessory genome is represented by an abundant and heterogeneous mobilome, which includes a diversity of prophage regions. In summary, the genome comparisons reported here will enable a better understanding of the diversity of phylogenetically close genomes and warrant further investigation of LPS and TAAs as potential X. fastidiosa host-specificity determinants.


2021 ◽  
Vol 7 (4) ◽  
pp. 277
Author(s):  
Danny Haelewaters ◽  
Hector Urbina ◽  
Samuel Brown ◽  
Shannon Newerth-Henson ◽  
M. Catherine Aime

Romaine lettuce (Lactuca sativa) is an important staple of American agriculture. Unlike many vegetables, romaine lettuce is typically consumed raw. Phylloplane microbes occur naturally on plant leaves; consumption of uncooked leaves includes consumption of phylloplane microbes. Despite this fact, the microbes that naturally occur on produce such as romaine lettuce are for the most part uncharacterized. In this study, we conducted culture-based studies of the fungal romaine lettuce phylloplane community from organic and conventionally grown samples. In addition to an enumeration of all such microbes, we define and provide a discussion of the genera that form the “core” romaine lettuce mycobiome, which represent 85.5% of all obtained isolates: Alternaria, Aureobasidium, Cladosporium, Filobasidium, Naganishia, Papiliotrema, Rhodotorula, Sampaiozyma, Sporobolomyces, Symmetrospora and Vishniacozyma. We highlight the need for additional mycological expertise in that 23% of species in these core genera appear to be new to science and resolve some taxonomic issues we encountered during our work with new combinations for Aureobasidium bupleuri and Curvibasidium nothofagi. Finally, our work lays the ground for future studies that seek to understand the effect these communities may have on preventing or facilitating establishment of exogenous microbes, such as food spoilage microbes and plant or human pathogens.


2008 ◽  
Vol 191 (1) ◽  
pp. 91-99
Author(s):  
Marc Deloger ◽  
Meriem El Karoui ◽  
Marie-Agnès Petit

ABSTRACT The fundamental unit of biological diversity is the species. However, a remarkable extent of intraspecies diversity in bacteria was discovered by genome sequencing, and it reveals the need to develop clear criteria to group strains within a species. Two main types of analyses used to quantify intraspecies variation at the genome level are the average nucleotide identity (ANI), which detects the DNA conservation of the core genome, and the DNA content, which calculates the proportion of DNA shared by two genomes. Both estimates are based on BLAST alignments for the definition of DNA sequences common to the genome pair. Interestingly, however, results using these methods on intraspecies pairs are not well correlated. This prompted us to develop a genomic-distance index taking into account both criteria of diversity, which are based on DNA maximal unique matches (MUM) shared by two genomes. The values, called MUMi, for MUM index, correlate better with the ANI than with the DNA content. Moreover, the MUMi groups strains in a way that is congruent with routinely used multilocus sequence-typing trees, as well as with ANI-based trees. We used the MUMi to determine the relatedness of all available genome pairs at the species and genus levels. Our analysis reveals a certain consistency in the current notion of bacterial species, in that the bulk of intraspecies and intragenus values are clearly separable. It also confirms that some species are much more diverse than most. As the MUMi is fast to calculate, it offers the possibility of measuring genome distances on the whole database of available genomes.
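The idea of turning shared maximal unique matches into a genome distance can be sketched as follows. The abstract does not state the formula, so everything here is an assumption for illustration: the function name `mum_distance` is hypothetical, the MUM lengths are taken to be already filtered so that matches do not overlap, and the shared length is normalised by the average length of the two genomes.

```python
from typing import Iterable


def mum_distance(mum_lengths: Iterable[int], len_a: int, len_b: int) -> float:
    """Distance in [0, 1] from the total length of shared unique matches.

    mum_lengths: lengths of non-overlapping maximal unique matches (assumed
                 to be precomputed by an external aligner).
    len_a, len_b: lengths of the two genomes being compared.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("genome lengths must be positive")
    average_length = (len_a + len_b) / 2
    shared = sum(mum_lengths)
    # Cap the shared fraction at 1 so the distance never goes negative
    # (possible here because the average exceeds the shorter genome).
    shared_fraction = min(shared / average_length, 1.0)
    return 1.0 - shared_fraction


if __name__ == "__main__":
    # Two 1000 bp genomes sharing 500 bp in unique matches.
    print(mum_distance([400, 100], 1000, 1000))
```

Under these assumptions, identical genomes give a distance of 0 and genomes with no shared matches give 1, so smaller values indicate closer relatives.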

