synteny blocks
Recently Published Documents



2021 ◽  
Author(s):  
Adelme Bazin ◽  
Claudine Medigue ◽  
David Vallenet ◽  
Alexandra Calteau

Recent years have seen the rise of pangenomes as comparative genomics tools for understanding the evolution of gene content among microbial genomes in close phylogenetic groups such as species. While the core or persistent genome is often well characterized because it includes essential or ubiquitous genes, the variable genome is usually less characterized and includes many genes of unknown function, even in the most studied organisms. It gathers genes important for strain adaptation that are acquired by horizontal gene transfer. Here, we introduce panModule, an original method to identify conserved modules in pangenome graphs built from thousands of microbial genomes. These modules correspond to synteny blocks composed of consecutive genes that are conserved in a subset of the compared strains. Identifying conserved modules can provide insights into genes involved in the same functional processes, and as such facilitates the understanding of genomic regions with complex evolutionary histories. The panModule method was benchmarked on a curated dataset of conserved modules in Escherichia coli genomes. Its use was illustrated through a study of a high-pathogenicity island in Klebsiella pneumoniae that allowed a better understanding of this region. panModule is freely available and accessible through the PPanGGOLiN software suite (https://github.com/labgem/PPanGGOLiN).
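
To make the notion of "consecutive genes conserved in a subset of strains" concrete, here is a minimal Python sketch. It is not the panModule algorithm, whose details the abstract does not give. It assumes each genome is already reduced to an ordered list of gene-family identifiers, treats two families as linked when they are direct neighbours in at least `min_support` genomes (ignoring orientation), and reports connected groups of linked families as candidate modules. The function name, the parameter, and the toy genomes are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def conserved_adjacency_modules(
    genomes: Dict[str, List[str]], min_support: int = 2
) -> List[Set[str]]:
    """Group gene families joined by adjacencies seen in >= min_support genomes.

    Toy illustration only; an assumption-laden stand-in, not panModule itself.
    """
    # Record which genomes contain each unordered adjacency of gene families.
    support = defaultdict(set)
    for name, families in genomes.items():
        for left, right in zip(families, families[1:]):
            if left != right:
                support[frozenset((left, right))].add(name)

    # Keep only adjacencies conserved in enough genomes.
    neighbours = defaultdict(set)
    for pair, names in support.items():
        if len(names) >= min_support:
            a, b = tuple(pair)
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Connected components of the filtered graph are the candidate modules.
    modules: List[Set[str]] = []
    seen: Set[str] = set()
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(component)
    return modules
```

With three toy genomes `["A", "B", "C", "X"]`, `["C", "B", "A", "Y"]`, and `["A", "Q", "C"]`, the families A, B, and C form one candidate module at `min_support=2`, because the A-B and B-C adjacencies occur in the first two genomes (in opposite orientations) but not in the third.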


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Alon Kafri ◽  
Benny Chor ◽  
David Horn

Background: Inversion Symmetry is a generalization of the second Chargaff rule, stating that the count of a string of k nucleotides on a single chromosomal strand equals the count of its inverse (reverse-complement) k-mer. It holds for many species, both eukaryotes and prokaryotes, for ranges of k which may vary from 7 to 10 as chromosomal lengths vary from 2 Mbp to 200 Mbp. Building on this formalism we introduce the concept of k-mer distances between chromosomes. We formulate two k-mer distance measures, D1 and D2, which depend on k. D1 takes into account all k-mers (for a single k) appearing on single strands of the two compared chromosomes, whereas D2 takes into account both strands of each chromosome. Both measures reflect dissimilarities in global chromosomal structures. Results: After defining the various distance measures and summarizing their properties, we also define proximities that rely on the existence of synteny blocks between chromosomes of different bacterial strains. Comparing pairs of strains of bacteria, we find negative correlations between synteny proximities and k-mer distances, thus establishing the meaning of the latter as measures of evolutionary distances among bacterial strains. The synteny measures we use are appropriate for closely related bacterial strains, where considerable sections of chromosomes demonstrate high direct or reversed equality. These measures are not appropriate for comparing different bacteria or eukaryotes. K-mer structural distances can be defined for all species. Because of the arbitrariness of strand choices, we employ only the D2 measure when comparing chromosomes of different species. The results for comparisons of various eukaryotes display interesting behavior which is partially consistent with conventional understanding of evolutionary genomics. In particular, we define ratios of minimal k-mer distances (KDR) between unmasked and masked chromosomes of two species, which correlate with both short and long evolutionary scales. Conclusions: k-mer distances reflect dissimilarities among global chromosomal structures. They carry information which aggregates all mutations. As such they can complement traditional evolution studies, which mainly concentrate on coding regions.
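
The inversion-symmetry statement can be checked directly by counting k-mers on one strand and comparing each count with that of its reverse complement. The Python sketch below does this and also shows a generic k-mer profile distance with a single-strand and a both-strand mode. The abstract does not give the formulas for D1 and D2, so `kmer_profile_distance` is a hypothetical stand-in (an L1 distance between normalized k-mer frequencies). It only illustrates why a both-strand measure does not depend on which strand was sequenced. All function names are assumptions.

```python
from collections import Counter
from typing import Dict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of an upper-case ACGT string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(sequence: str, k: int) -> Counter:
    """Count overlapping k-mers on one strand, skipping windows with non-ACGT symbols."""
    sequence = sequence.upper()
    counts: Counter = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def inversion_symmetry_deviation(sequence: str, k: int) -> float:
    """Hypothetical score in [0, 1]; 0 means every k-mer count equals its reverse-complement count."""
    counts = count_kmers(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    kmers = set(counts) | {reverse_complement(w) for w in counts}
    # Each (k-mer, reverse complement) pair is visited twice, hence the factor 2.
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in kmers)
    return diff / (2 * total)


def kmer_frequencies(sequence: str, k: int, both_strands: bool = False) -> Dict[str, float]:
    """Normalized k-mer frequencies, optionally pooling the reverse strand."""
    counts = count_kmers(sequence, k)
    if both_strands:
        counts = counts + Counter({reverse_complement(w): n for w, n in counts.items()})
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def kmer_profile_distance(seq_a: str, seq_b: str, k: int, both_strands: bool = False) -> float:
    """L1 distance between k-mer frequency profiles (a stand-in, not the paper's D1 or D2)."""
    fa = kmer_frequencies(seq_a, k, both_strands)
    fb = kmer_frequencies(seq_b, k, both_strands)
    return sum(abs(fa.get(w, 0.0) - fb.get(w, 0.0)) for w in set(fa) | set(fb))
```

For example, `"AAAA"` and `"TTTT"` are the same double-stranded molecule read from opposite strands. The single-strand profile distance between them is maximal, while the both-strand distance is zero.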


2021 ◽  
Author(s):  
Alexey Zabelkin ◽  
Yulia Yakovleva ◽  
Olga Bochkareva ◽  
Nikita Alexeev

Motivation: High plasticity of bacterial genomes is provided by numerous mechanisms, including horizontal gene transfer and recombination via numerous flanking repeats. Genome rearrangements such as inversions, deletions, insertions, and duplications may occur independently in different strains, providing parallel adaptation. Specifically, such rearrangements might be responsible for multi-virulence, antibiotic resistance, and antigenic variation. However, identification of such events requires laborious manual inspection and verification of phyletic pattern consistency. Results: Here we define the term "parallel rearrangements" as events that occur independently in phylogenetically distant bacterial strains and present a formalization of the problem of calling parallel rearrangements. We implement an algorithmic solution for the identification of parallel rearrangements in bacterial populations as the tool PaReBrick. The tool takes synteny blocks and a phylogenetic tree as input and outputs rearrangement events. It tests each rearrangement for consistency with the tree, sorts the events by their parallelism score, and provides diagrams of the neighbors for each block of interest, allowing the detection of horizontally transferred blocks or their extra copies and the inversions in which copied blocks are involved. We demonstrated PaReBrick's efficiency and accuracy and showed its potential to detect genome rearrangements responsible for pathogenicity and adaptation in bacterial genomes. Availability: PaReBrick is written in Python and is available on GitHub: https://github.com/ctlab/parallel-rearrangements.
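
One simple way to picture "consistency with a tree" is to ask how many state changes a rearrangement needs on the phylogeny. The Python sketch below uses Fitch small parsimony on a rooted binary tree: one change is compatible with a single event, while two or more suggest the same state arose (or was lost) independently in separate lineages. This is an illustrative assumption, not PaReBrick's actual consistency test or parallelism score, which the abstract does not specify. The nested-tuple tree encoding and the function name are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_changes(tree: Tree, states: Dict[str, Hashable]) -> int:
    """Minimum number of state changes on a rooted binary tree (Fitch parsimony).

    tree   -- a leaf name, or a 2-tuple of subtrees
    states -- observed state per leaf, e.g. 1 if the strain carries the rearrangement
    """

    def visit(node: Tree) -> Tuple[Set[Hashable], int]:
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_changes = visit(left)
        right_set, right_changes = visit(right)
        common = left_set & right_set
        if common:
            # Children agree on at least one state: no extra change needed here.
            return common, left_changes + right_changes
        # Children disagree: one change is required on a branch below this node.
        return left_set | right_set, left_changes + right_changes + 1

    return visit(tree)[1]
```

For the tree `((("s1", "s2"), "s3"), ("s4", "s5"))`, a rearrangement present only in `s1` and `s2` needs one change. One present in `s1` and `s4` needs two, which would flag it as a candidate parallel event under this simplified criterion.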


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel Restrepo-Montoya ◽  
Phillip E. McClean ◽  
Juan M. Osorno

Background: Legume species are an important plant model because of their protein-rich physiology. The adaptability and productivity of legumes are limited by major biotic and abiotic stresses. Responses to these stresses directly involve plasma membrane receptor proteins known as receptor-like kinases (RLKs) and receptor-like proteins (RLPs). Evaluating the homology relations among RLKs and RLPs for seven legume species, and exploring their presence among synteny blocks, allows an increased understanding of evolutionary relations, physical position, and chromosomal distribution in related species and their shared roles in stress responses. Results: Typically, a high proportion of RLK and RLP legume proteins belong to orthologous clusters, which is confirmed in this study, where between 66 and 90% of the RLKs and RLPs per legume species were classified in orthologous clusters. One-third of the evaluated syntenic blocks had shared RLK/RLP genes among both legumes and non-legumes. Among the legumes, between 75 and 98% of the RLK/RLP genes were present in syntenic blocks. The distribution of chromosomal segments between Phaseolus vulgaris and Vigna unguiculata, two species that diverged ~8 mya, was highly similar. Among the RLK/RLP synteny clusters, seven experimentally validated resistance RLK/RLP genes were identified in syntenic blocks. The RLK resistance genes FLS2, BIR2, ERECTA, IOS1, and AtSERK1 from Arabidopsis and SLSERK1 from Solanum lycopersicum were present in different pairwise syntenic blocks among the legume species. Meanwhile, only the LYM1 RLP resistance gene from Arabidopsis shared a syntenic block with Glycine max. Conclusions: The orthology analysis of the RLKs and RLPs suggests a dynamic evolution in the legume family, with between 66 and 85% of RLKs and 83 and 88% of RLPs belonging to orthologous clusters among the species evaluated. In fact, for the 10-species comparison, a lower number of singleton proteins were reported among RLPs compared to RLKs, suggesting that RLP positions are more physically conserved compared to RLKs. The identification of RLK and RLP genes among the synteny blocks in legumes revealed multiple highly conserved syntenic blocks on multiple chromosomes. Additionally, the analysis suggests that P. vulgaris is an appropriate anchor species for comparative genomics among legumes.


2021 ◽  
Author(s):  
Alon Kafri ◽  
Benny Chor ◽  
David Horn

Background: Inversion Symmetry is a generalization of the second Chargaff rule, stating that the count of a string of k nucleotides on a single chromosomal strand equals the count of its inverse (reverse-complement) k-mer. It holds for many species, both eukaryotes and prokaryotes, for ranges of k which may vary from 7 to 10 as chromosomal lengths vary from 2 Mbp to 200 Mbp. Building on this formalism we introduce the concept of k-mer distances between chromosomes. We formulate two distance measures, D1 and D2, where the first takes into account k-mers appearing on single strands of the two chromosomes, whereas the second takes into account both strands. Results: We first define the various distance measures and summarize their properties. We also define distances that rely on the existence of synteny blocks between chromosomes of different strains. Studying E. coli and Salmonella strains, we evaluate the different distance measures and find correlations between synteny distances and k-mer distances, thus establishing the usefulness of the latter as measures of evolutionary proximity of chromosomes. Applying our measures to human genomes, we find that chromosomes 5 and 6 are the closest ones on the k-mer distance evolutionary scale. Conclusions: The novel distances carry information about evolutionary proximity and provide useful tools for future studies. The finding of proximity between human chromosomes 5 and 6 is an example of a novel insight provided by these tools.


2020 ◽  
Vol 11 ◽  
Author(s):  
Zhe Yu ◽  
Chunfang Zheng ◽  
Victor A. Albert ◽  
David Sankoff

We take advantage of synteny blocks, the analytical construct enabled at the evolutionary moment of speciation or polyploidization, to follow the independent loss of duplicate genes in two sister species or the loss through fractionation of syntenic paralogs in a doubled genome. By examining how much sequence remains after a contiguous series of genes is deleted, we find that this residue remains at a constant low level independent of how many genes are lost—there are few if any relics of the missing sequence. Pseudogenes are rare or extremely transient in this context. The potential exceptions lie exclusively with a few examples of speciation, where the synteny blocks in some larger genomes tolerate degenerate sequence during genomic divergence of two species, but not after whole genome doubling in the same species where fractionation pressure eliminates virtually all non-coding sequence.


2019 ◽  
Vol 29 (4) ◽  
pp. 576-589 ◽  
Author(s):  
Marta Farré ◽  
Jaebum Kim ◽  
Anastasia A. Proskuryakova ◽  
Yang Zhang ◽  
Anastasia I. Kulemzina ◽  
...  

2019 ◽  
Author(s):  
Sundar Ram Sankaranarayanan ◽  
Giuseppe Ianiri ◽  
Md. Hashim Reza ◽  
Bhagya C. Thimmappa ◽  
Promit Ganguly ◽  
...  

Intra-chromosomal or inter-chromosomal genomic rearrangements often lead to speciation (1). Loss or gain of a centromere leads to alterations in chromosome number in closely related species. Thus, centromeres can enable tracing the path of evolution from the ancestral to a derived state (2). The Malassezia species complex of the phylum Basidiomycota shows remarkable diversity in chromosome number, ranging between six and nine chromosomes (3–5). To understand these transitions, we experimentally identified all eight centromeres as binding sites of an evolutionarily conserved outer kinetochore protein Mis12/Mtw1 in M. sympodialis. The 3 to 5 kb centromere regions share an AT-rich, poorly transcribed core region enriched with a 12 bp consensus motif. We also mapped nine such AT-rich centromeres in M. globosa and the related species Malassezia restricta and Malassezia slooffiae. While eight predicted centromeres were found within conserved synteny blocks between these species and M. sympodialis, the remaining centromere in M. globosa (MgCEN2) or its orthologous centromere in M. slooffiae (MslCEN4) and M. restricta (MreCEN8) mapped to a synteny breakpoint compared with M. sympodialis. Taken together, we provide evidence that breakage and loss of a centromere (CEN2) in an ancestral Malassezia species possessing nine chromosomes resulted in fewer chromosomes in M. sympodialis. Strikingly, the predicted centromeres of all closely related Malassezia species map to an AT-rich core on each chromosome that also shows enrichment of the 12 bp sequence motif. We propose that centromeres are fragile AT-rich sites driving karyotype diversity through breakage and inactivation in these and other species.
Significance statement: The number of chromosomes can vary between closely related species. Centromere loss destabilizes chromosomes and results in a reduced number of chromosomes to drive speciation. Several lines of evidence from studies on various cancers suggest that an imbalance in kinetochore-microtubule attachments results in breaks at the centromeres. To understand if such events can cause chromosome number changes in nature, we studied six species of Malassezia, of which three possess eight chromosomes and others have nine chromosomes each. We find signatures of chromosome breakage at the centromeres in organisms having nine chromosomes. We propose that the break at the centromere followed by fusions of acentric chromosomes to other chromosomes could be a plausible mechanism shaping the karyotype of Malassezia and related organisms.
Classification: Biological sciences, Genetics


PLoS ONE ◽  
2017 ◽  
Vol 12 (7) ◽  
pp. e0180198 ◽  
Author(s):  
Joseph MEX Lucas ◽  
Hugues Roest Crollius
