Modules of co-occurrence in the cyanobacterial pan-genome

2017 ◽  
Author(s):  
Christian Beck ◽  
Henning Knoop ◽  
Ralf Steuer

The increasing availability of fully sequenced cyanobacterial genomes opens unprecedented opportunities to investigate the manifold adaptations and functional relationships that determine the genetic content of individual bacterial species. Here, we use comparative genome analysis to investigate the cyanobacterial pan-genome based on 77 strains whose complete genome sequence is available. Our focus is the co-occurrence of likely ortholog genes, denoted as CLOGs. We conjecture that co-occurrence of CLOGs is indicative of functional relationships between the respective genes. Going beyond the analysis of pairwise co-occurrences, we introduce a novel network approach to identify modules of co-occurring ortholog genes. Our results demonstrate that these modules exhibit a high degree of functional coherence and reveal known as well as previously unknown functional relationships. We argue that the high functional coherence observed for the extracted modules is a consequence of the similar-yet-diverse nature of the cyanobacterial phylum. We provide a simple toolbox that facilitates further analysis of our results with respect to specific cyanobacterial genes of interest.
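As a rough illustration of what pairwise co-occurrence can mean in practice, the sketch below scores two gene clusters by the Jaccard index of the strain sets in which they occur. This is an assumption made for illustration: the abstract does not state which co-occurrence measure or threshold the authors use, and the identifiers (`clog_1`, `s1`, and so on), the function names, and the 0.8 cutoff are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets of strain names; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence_pairs(presence, threshold=0.8):
    """Return (clog_a, clog_b, score) for pairs whose strain sets overlap strongly.

    `presence` maps a cluster identifier to the set of strains containing it.
    The threshold is an arbitrary illustrative value, not taken from the paper.
    """
    pairs = []
    items = sorted(presence.items(), key=lambda kv: kv[0])
    for (name_a, strains_a), (name_b, strains_b) in combinations(items, 2):
        score = jaccard(strains_a, strains_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs


# Toy presence/absence data with hypothetical cluster and strain names.
presence = {
    "clog_1": {"s1", "s2", "s3"},
    "clog_2": {"s1", "s2", "s3"},
    "clog_3": {"s3", "s4"},
}
pairs = cooccurrence_pairs(presence)
print(pairs)
```

Pairs that pass the threshold could serve as edges of a co-occurrence network, from which modules might then be extracted by a clustering method of choice; that step is not shown here.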

2011 ◽  
Vol 86 (3) ◽  
pp. 1844-1852 ◽  
Author(s):  
A. Cornelissen ◽  
S. C. Hardies ◽  
O. V. Shaburova ◽  
V. N. Krylov ◽  
W. Mattheus ◽  
...  

2020 ◽  
Vol 14 ◽  
pp. 117793222093806
Author(s):  
Sávio Souza Costa ◽  
Luís Carlos Guimarães ◽  
Artur Silva ◽  
Siomar Castro Soares ◽  
Rafael Azevedo Baraúna

The pan-genome is defined as the set of orthologous and unique genes of a specific group of organisms. It is composed of the core genome, the accessory genome, and species- or strain-specific genes. The pan-genome is considered open or closed based on the alpha value of Heaps' law. In an open pan-genome, the number of gene families continues to increase as new genomes are added to the analysis, while in a closed pan-genome, the number of gene families does not increase considerably. The first step of a pan-genome analysis is the homogenization of genome annotation: the same software, such as GeneMark or RAST, should be used to annotate all genomes. Subsequently, software packages such as BPGA, GET_HOMOLOGUES, and PGAP, among others, are used to calculate the pan-genome. This review presents all these initial steps for those who want to perform a pan-genome analysis, explaining key concepts of the area. Furthermore, we present the pan-genomic analysis of 9 bacterial species. These are the species with the highest number of genomes deposited in GenBank. We also show the influence of the identity and coverage parameters on the prediction of orthologous and paralogous genes. Finally, we outline the perspectives of several research areas where pan-genome analysis can be used to address important questions.
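The open/closed criterion can be made concrete with a small fitting sketch. It assumes one commonly used formulation, which the abstract itself does not spell out: the number of new genes found when the N-th genome is added follows n(N) = kappa * N^(-alpha), with alpha <= 1 read as an open pan-genome and alpha > 1 as a closed one. The function names and the synthetic data are hypothetical, and tools such as BPGA may parameterize the fit differently.

```python
import math


def fit_power_law(genome_counts, new_genes):
    """Least-squares fit of log(n) = log(kappa) - alpha * log(N).

    Returns (kappa, alpha). All inputs must be positive.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope


def classify(alpha):
    """Assumed convention: alpha <= 1 is open, alpha > 1 is closed."""
    return "open" if alpha <= 1.0 else "closed"


# Synthetic new-gene counts generated from kappa = 500, alpha = 0.5.
genomes = [2, 3, 4, 5, 6, 7, 8]
new_genes = [500 * n ** -0.5 for n in genomes]
kappa, alpha = fit_power_law(genomes, new_genes)
status = classify(alpha)
print(round(kappa, 3), round(alpha, 3), status)
```

A log-log linear regression is used here only because it needs no external libraries; a nonlinear least-squares fit on the untransformed counts is an equally plausible choice and can give somewhat different estimates on noisy data.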


2015 ◽  
Vol 37 (11) ◽  
pp. 959-968
Author(s):  
Jung Soo Seo ◽  
Mun Gyeong Kwon ◽  
Jee Youn Hwang ◽  
Sung Hee Jung ◽  
Hyun Ja Han ◽  
...  

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e10185
Author(s):  
Romen Singh Naorem ◽  
Jochen Blom ◽  
Csaba Fekete

Staphylococcus aureus is a drug-resistant pathogen capable of colonizing diverse ecological niches and causing a broad spectrum of community- and healthcare-associated infections. In this study, we chose four methicillin-resistant S. aureus (MRSA) clinical isolates from Germany and Hungary based on our previous polyphasic characterization findings. We assumed that the selected strains have different genetic backgrounds in terms of the presence of resistance and virulence genes, prophages, plasmids, and secondary metabolite biosynthesis genes that may play a crucial role in niche adaptation and pathogenesis. To test these assumptions, we performed a comparative genome analysis of these strains and observed many differences in their genomic compositions. The Hungarian isolates (SA H27 and SA H32) with ST22-SCCmec type IVa have fewer genes for multiple-drug resistance, virulence, and prophages than reported in the German isolates. The German isolate SA G6 acquired aminoglycoside (ant(6)-Ia and aph(3’)-III) and nucleoside (sat-4) resistance genes via phage transduction, which may determine its pathogenic potential. The comparative genome study allowed the segregation of isolates by geographical origin and the differentiation of the clinical isolates from the commensal isolates. This study suggests that the German and Hungarian isolates are genetically diverse and vary among themselves owing to the gain or loss of mobile genetic elements (MGEs). An interesting finding is that adding the SA G6 genome caused a drastic decline in the core/pan-genome ratio curve and opened the pan-genome further. Functional characterization revealed that the survival of the S. aureus isolates is maintained by amino acid catabolism, which favors adaptation to growth in a protein-rich medium. The dispersible and singleton gene content of the S. aureus genomes allows us to understand the genetic variation among the CC5 and CC22 groups.
The strains with the same genetic background clustered together, which suggests that these strains are highly alike; however, comparative genome analysis revealed that the acquisition of phage elements and plasmids through MGE transfer events contributes to differences in their phenotypic characters. This comparative genome analysis should improve knowledge of the characterization of pathogenic S. aureus strains and of the factors responsible for clinically important phenotypic differences among them.
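The effect of a single divergent genome on the core/pan-genome ratio can be illustrated with a minimal sketch. It assumes each genome is already reduced to a set of gene-family identifiers and that genomes are added in one fixed order; the identifiers and the function name are hypothetical, and this is not the procedure used in the study.

```python
def core_pan_ratio_curve(genomes):
    """Return the core/pan ratio after each genome is added, in the given order.

    Core = gene families present in every genome seen so far.
    Pan  = gene families present in at least one genome seen so far.
    """
    core = None
    pan = set()
    ratios = []
    for genes in genomes:
        core = set(genes) if core is None else core & genes
        pan |= genes
        ratios.append(len(core) / len(pan) if pan else 0.0)
    return ratios


# Toy gene-family sets; the third genome shares little with the first two.
genomes = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "x", "y", "z", "w"},
]
ratios = core_pan_ratio_curve(genomes)
print([round(r, 3) for r in ratios])
```

In this toy case the third genome both shrinks the core and enlarges the pan-genome, so the ratio drops sharply. Because the curve depends on the order of addition, analyses typically average over many orderings; that refinement is omitted here.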


2020 ◽  
Vol 7 (2) ◽  
pp. 74
Author(s):  
Longsheng Yang ◽  
Yongwei Zhu ◽  
Zhong Peng ◽  
Yi Ding ◽  
Kai Jie ◽  
...  

Erysipelothrix rhusiopathiae is a common pathogen responsible for pig erysipelas. However, the molecular basis for the pathogenesis of E. rhusiopathiae remains to be elucidated. In this study, the complete genome sequence of the E. rhusiopathiae strain WH13013, a pathogenic isolate from a diseased pig, was generated using a combined strategy of PacBio RSII and Illumina sequencing technologies. This strategy yielded a single circular chromosome of approximately 1.78 Mb for the complete genome of WH13013, with an average GC content of 36.49%. The genome of WH13013 encoded 1633 predicted proteins, 55 tRNAs, and 15 rRNAs. It contained four genomic islands, within which several resistance-associated genes were identified. Phylogenetic analysis revealed that WH13013 was closely related to many other sequenced virulent E. rhusiopathiae strains. A comprehensive comparative analysis of eight virulent E. rhusiopathiae strains, including WH13013, identified a total of 1184 core genes. A large proportion (approximately 75.31%) of these core genes participated in nutrition and energy uptake and metabolism, as well as other bioactivities that are necessary for bacterial survival and adaptation. The core genes also included those encoding proteins that participate in the biosynthesis of, or are components of, the proposed virulence factors of E. rhusiopathiae, including the capsule (cpsA, cpsB, cpsC), neuraminidase (nanH), hyaluronidase (hylA, hylB, hylC), and surface proteins (spaA, rspA, rspB). The complete genome sequence of the virulent strain WH13013, together with this comprehensive comparative genome analysis, will help in further studies of the genetic basis of the pathogenesis of E. rhusiopathiae.

