synteny block
Recently Published Documents


TOTAL DOCUMENTS: 14 (FIVE YEARS: 8)

H-INDEX: 3 (FIVE YEARS: 1)

2021 ◽  
Vol 12 ◽  
Author(s):  
Zaira Seferbekova ◽  
Alexey Zabelkin ◽  
Yulia Yakovleva ◽  
Robert Afasizhev ◽  
Natalia O. Dranenko ◽  
...  

Shigella are pathogens originating within the Escherichia lineage but frequently classified as a separate genus. Shigella genomes contain numerous insertion sequences (ISs) that lead to pseudogenisation of affected genes and an increase of non-homologous recombination. Here, we study 414 genomes of E. coli and Shigella strains to assess the contribution of genomic rearrangements to Shigella evolution. We found that Shigella experienced exceptionally high rates of intragenomic rearrangements and had a decreased rate of homologous recombination compared to pathogenic and non-pathogenic E. coli. The high rearrangement rate resulted in independent disruption of syntenic regions and parallel rearrangements in different Shigella lineages. Specifically, we identified two types of chromosomally encoded E3 ubiquitin-protein ligases acquired independently by all Shigella strains that also showed a high level of sequence conservation in the promoter and further in the 5′-intergenic region. In the only available enteroinvasive E. coli (EIEC) strain, which is a pathogenic E. coli with a phenotype intermediate between Shigella and non-pathogenic E. coli, we found a rate of genome rearrangements comparable to those in other E. coli and no functional copies of the two Shigella-specific E3 ubiquitin ligases. These data indicate that the accumulation of ISs influenced many aspects of genome evolution and played an important role in the evolution of intracellular pathogens. Our research demonstrates the power of comparative genomics based on synteny block composition and an important role of non-coding regions in the evolution of genomic islands.


2020 ◽  
Author(s):  
Nicola De Maio

Abstract Sequence alignment is essential for phylogenetic and molecular evolution inference, as well as in many other areas of bioinformatics and evolutionary biology. Inaccurate alignments can lead to severe biases in most downstream statistical analyses. Statistical alignment based on probabilistic models of sequence evolution addresses these issues by replacing heuristic score functions with evolutionary model-based probabilities. However, score-based aligners and fixed-alignment phylogenetic approaches are still more prevalent than methods based on evolutionary indel models, mostly due to computational convenience. Here, I present new techniques for improving the accuracy and speed of statistical evolutionary alignment. The “cumulative indel model” approximates realistic evolutionary indel dynamics using differential equations. “Adaptive banding” reduces the computational demand of most alignment algorithms without requiring prior knowledge of divergence levels or pseudo-optimal alignments. Using simulations, I show that these methods lead to fast and accurate pairwise alignment inference. Also, I show that it is possible, with these methods, to align and infer evolutionary parameters from a single long synteny block ($\approx$530 kbp) between the human and chimp genomes. The cumulative indel model and adaptive banding can therefore improve the performance of alignment and phylogenetic methods. [Evolutionary alignment; pairHMM; sequence evolution; statistical alignment; statistical genetics.]


GigaScience ◽  
2020 ◽  
Vol 9 (6) ◽  
Author(s):  
Ksenia Krasheninnikova ◽  
Mark Diekhans ◽  
Joel Armstrong ◽  
Aleksei Dievskii ◽  
Benedict Paten ◽  
...  

Abstract Background Large-scale sequencing projects provide high-quality full-genome data that can be used for reconstruction of chromosomal exchanges and rearrangements that disrupt conserved syntenic blocks. The highest resolution of cross-species homology can be obtained on the basis of whole-genome, reference-free alignments. Very large multiple alignments of full-genome sequence stored in a binary format demand an accurate and efficient computational approach for synteny block production. Findings halSynteny performs efficient processing of pairwise alignment blocks for any pair of genomes in the alignment. The tool is part of the HAL comparative genomics suite and is targeted to build synteny blocks for multi-hundred–way, reference-free vertebrate alignments built with the Cactus system. Conclusions halSynteny enables an accurate and rapid identification of synteny in multiple full-genome alignments. The method is implemented in C++11 as a component of the halTools software and released under MIT license. The package is available at https://github.com/ComparativeGenomicsToolkit/hal/.
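The idea of turning pairwise alignment blocks into larger synteny blocks can be illustrated with a minimal sketch. This is not halSynteny's algorithm or its API: the `Block` type, the `merge_collinear` function and the `max_gap` parameter are hypothetical names, and the sketch assumes all blocks lie on one chromosome pair in the same orientation.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One pairwise alignment block (hypothetical layout, half-open coordinates)."""
    q_start: int  # start on the query genome
    q_end: int    # end on the query genome
    t_start: int  # start on the target genome
    t_end: int    # end on the target genome


def merge_collinear(blocks: List[Block], max_gap: int) -> List[Block]:
    """Greedily merge neighbouring blocks that stay collinear in both genomes.

    Two consecutive blocks (sorted by query start) are merged when the gap
    between them is non-negative and at most max_gap in both genomes.
    """
    merged: List[Block] = []
    for b in sorted(blocks):
        if merged:
            last = merged[-1]
            q_gap = b.q_start - last.q_end
            t_gap = b.t_start - last.t_end
            if 0 <= q_gap <= max_gap and 0 <= t_gap <= max_gap:
                # Extend the current synteny block to cover the new block.
                merged[-1] = Block(last.q_start, b.q_end, last.t_start, b.t_end)
                continue
        merged.append(b)
    return merged


blocks = [
    Block(0, 100, 1000, 1100),
    Block(120, 200, 1110, 1200),
    Block(5000, 5100, 300, 400),
]
synteny_blocks = merge_collinear(blocks, max_gap=50)
```

In this toy input the first two blocks are close and collinear, so they merge into one synteny block, while the third block is far away in both genomes and stays separate. A real tool also has to handle inverted blocks, overlaps and multiple chromosomes.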


2020 ◽  
Vol 37 (9) ◽  
pp. 2747-2762 ◽  
Author(s):  
Guénola Drillon ◽  
Raphaël Champeimont ◽  
Francesco Oteri ◽  
Gilles Fischer ◽  
Alessandra Carbone

Abstract Gene order can be used as an informative character to reconstruct phylogenetic relationships between species independently from the local information present in gene/protein sequences. PhyChro is a reconstruction method based on chromosomal rearrangements, applicable to a wide range of eukaryotic genomes with different gene contents and levels of synteny conservation. For each synteny breakpoint issued from pairwise genome comparisons, the algorithm defines two disjoint sets of genomes, named partial splits, respectively, supporting the two block adjacencies defining the breakpoint. Considering all partial splits issued from all pairwise comparisons, a distance between two genomes is computed from the number of partial splits separating them. Tree reconstruction is achieved through a bottom-up approach by iteratively grouping sister genomes minimizing genome distances. PhyChro estimates branch lengths based on the number of synteny breakpoints and provides confidence scores for the branches. PhyChro performance is evaluated on two data sets of 13 vertebrates and 21 yeast genomes by using up to 130,000 and 179,000 breakpoints, respectively, a scale of genomic markers that has been out of reach until now. PhyChro reconstructs very accurate tree topologies even at known problematic branching positions. Its robustness has been benchmarked for different synteny block reconstruction methods. On simulated data PhyChro reconstructs phylogenies perfectly in almost all cases, and shows the highest accuracy compared with other existing tools. PhyChro is very fast, reconstructing the vertebrate and yeast phylogenies in <15 min.
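A synteny breakpoint can be made concrete with a small sketch that counts breakpoints between two signed block orders. This is only the basic breakpoint count, not PhyChro's partial-split distance; the function names are hypothetical, and the sketch assumes a single linear chromosome with the same set of blocks in both genomes, where a negative sign marks an inverted block.

```python
from typing import List, Set, Tuple


def adjacencies(order: List[int]) -> Set[Tuple[int, int]]:
    """Return all block adjacencies of a signed order, in both reading directions."""
    adj: Set[Tuple[int, int]] = set()
    for a, b in zip(order, order[1:]):
        adj.add((a, b))
        # The same adjacency read from the opposite strand.
        adj.add((-b, -a))
    return adj


def count_breakpoints(genome_a: List[int], genome_b: List[int]) -> int:
    """Count adjacencies of genome_a that are not conserved in genome_b."""
    adj_b = adjacencies(genome_b)
    return sum(1 for pair in zip(genome_a, genome_a[1:]) if pair not in adj_b)


reference = [1, 2, 3, 4, 5]
inverted = [1, -3, -2, 4, 5]  # blocks 2 and 3 inverted as one segment
n_breakpoints = count_breakpoints(reference, inverted)
```

A single inversion breaks the two adjacencies at its ends and preserves the adjacency inside the inverted segment, so this example yields two breakpoints.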


2020 ◽  
Vol 36 (13) ◽  
pp. 3966-3974
Author(s):  
Ryo Nakabayashi ◽  
Shinichi Morishita

Abstract Motivation De novo assembly of reference-quality genomes used to require enormously laborious tasks. In particular, it is extremely time-consuming to build genome markers for ordering assembled contigs along chromosomes; thus, they are only available for well-established model organisms. To resolve this issue, recent studies demonstrated that Hi-C could be a powerful and cost-effective means to output chromosome-length scaffolds for non-model species with no genome marker resources, because the Hi-C contact frequency between a pair of two loci can be a good estimator of their genomic distance, even if there is a large gap between them. Indeed, state-of-the-art methods such as 3D-DNA are now widely used for locating contigs in chromosomes. However, it remains challenging to reduce errors in contig orientation because shorter contigs have fewer contacts with their neighboring contigs. These orientation errors lower the accuracy of gene prediction, read alignment, and synteny block estimation in comparative genomics. Results To reduce these contig orientation errors, we propose a new algorithm, named HiC-Hiker, which has a firm grounding in probabilistic theory, rigorously models Hi-C contacts across contigs, and effectively infers the most probable orientations via the Viterbi algorithm. We compared HiC-Hiker and 3D-DNA using human and worm genome contigs generated from short reads, evaluated their performances, and observed a remarkable reduction in the contig orientation error rate from 4.3% (3D-DNA) to 1.7% (HiC-Hiker). Our algorithm can consider long-range information between distal contigs and precisely estimates Hi-C read contact probabilities among contigs, which may also be useful for determining the ordering of contigs. Availability and implementation HiC-Hiker is freely available at: https://github.com/ryought/hic_hiker.
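The use of the Viterbi algorithm to pick contig orientations can be sketched as a two-state dynamic program. This is a simplified illustration, not HiC-Hiker's model: it scores only adjacent contigs, whereas the abstract describes long-range information between distal contigs. The function name and the `pair_scores` input (hypothetical log-likelihoods of each orientation pair, assumed to be derived from Hi-C contacts) are assumptions.

```python
from typing import List

# pair_scores[i][prev][cur] is the assumed log-likelihood of observing the
# Hi-C contacts between contig i and contig i + 1 when contig i has
# orientation prev and contig i + 1 has orientation cur
# (0 = forward, 1 = reverse).
PairTable = List[List[float]]


def viterbi_orientations(pair_scores: List[PairTable]) -> List[int]:
    """Return the most probable orientation (0 or 1) for each contig."""
    best = [0.0, 0.0]           # best log-score of a path ending in each state
    back: List[List[int]] = []  # back-pointers, one pair per transition
    for table in pair_scores:
        new_best = [0.0, 0.0]
        pointers = [0, 0]
        for cur in (0, 1):
            cands = [best[prev] + table[prev][cur] for prev in (0, 1)]
            pointers[cur] = 0 if cands[0] >= cands[1] else 1
            new_best[cur] = cands[pointers[cur]]
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    state = 0 if best[0] >= best[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


scores = [
    [[-1.0, -5.0], [-4.0, -2.0]],  # contig 0 vs contig 1
    [[-6.0, -1.0], [-3.0, -2.0]],  # contig 1 vs contig 2
]
orientations = viterbi_orientations(scores)
```

For these toy scores the best path keeps the first two contigs forward and flips the third, with a total log-score of -2.0. The dynamic program runs in time linear in the number of contigs because each contig has only two possible states.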


2020 ◽  
Author(s):  
Qiang Yang ◽  
Zhiming Zheng ◽  
Hui Liu ◽  
Peng Wang ◽  
Li Wang ◽  
...  

Abstract Background Strains of Elizabethkingia meningoseptica are of interest for studying Vitamin K2 metabolism. However, their genomic sequences, metabolic pathways, potential capabilities, and evolutionary status are still unknown. Results This study therefore sequenced the genome of Elizabethkingia meningoseptica sp. F2 and compared it with other Vitamin K2 strains to identify its unique and shared metabolic genes. The 3,874,794-base-pair sequence of Elizabethkingia meningoseptica sp. F2 is presented, with 3,539 genes annotated. Synteny block analysis showed that Elizabethkingia meningoseptica sp. F2 shares high levels of synteny with Elizabethkingia meningoseptica ATCC 13253 and Elizabethkingia meningoseptica NBRC 12535. The Vitamin K2 metabolic pathway in Elizabethkingia meningoseptica sp. F2 was also identified. In addition, Elizabethkingia meningoseptica sp. F2 was resistant to gentamicin, streptomycin, ampicillin and caramycin, consistent with the presence of multiple genes encoding diverse multidrug efflux pump proteins in the genome. Furthermore, in co-overexpression experiments with MenA and MenG, Vitamin K2 content was enhanced by 37% compared with the control strain. Conclusions The genome analysis of Elizabethkingia meningoseptica sp. F2, together with the comparative analysis of metabolic pathways among E. coli, Bacillus subtilis and Streptomyces, provides useful information on the Vitamin K2 biosynthetic pathway and other related pathways at the systems level.


2019 ◽  
Author(s):  
Guénola Drillon ◽  
Raphaël Champeimont ◽  
Francesco Oteri ◽  
Gilles Fischer ◽  
Alessandra Carbone

Abstract Gene order can be used as an informative character to reconstruct phylogenetic relationships between species independently from the local information present in gene/protein sequences. PhyChro is a reconstruction method based on chromosomal rearrangements, applicable to a wide range of eukaryotic genomes with different gene contents and levels of synteny conservation. For each synteny breakpoint issued from pairwise genome comparisons, the algorithm defines two disjoint sets of genomes, named partial splits, respectively supporting the two block adjacencies defining the breakpoint. Considering all partial splits issued from all pairwise comparisons, a distance between two genomes is computed from the number of partial splits separating them. Tree reconstruction is achieved through a bottom-up approach by iteratively grouping sister genomes minimizing genome distances. PhyChro estimates branch lengths based on the number of synteny breakpoints and provides confidence scores for the branches. PhyChro performance is evaluated on two datasets of 13 vertebrates and 21 yeast genomes by using up to 130 000 and 179 000 breakpoints respectively, a scale of genomic markers that has been out of reach until now. PhyChro reconstructs very accurate tree topologies even at known problematic branching positions. Its robustness has been benchmarked for different synteny block reconstruction methods. On simulated data PhyChro reconstructs phylogenies perfectly in almost all cases, and shows the highest accuracy compared to other existing tools. PhyChro is very fast, reconstructing the vertebrate and yeast phylogenies in less than 15 min. Availability PhyChro will be freely available under the BSD license after [email protected]


2019 ◽  
Author(s):  
John Herrick ◽  
Bianca Sclavi

Abstract Evolutionary changes in karyotype have long been implicated in speciation events; however, the phylogenetic relationship between karyotype diversity and species richness in closely and distantly related mammalian lineages remains to be fully elucidated. Here we examine the association between genome diversity and species diversity across the class Mammalia. We tested five different metrics of genome diversity: clade-average genome size, standard deviation of genome size, diploid and fundamental numbers (karyotype diversity), sub-chromosomal rearrangements and percent synteny block conservation. We found a significant association between species richness (phylogenetic clade diversity) and genome diversity at both order and family level clades. Karyotype diversity provided the strongest support for a relationship between genome diversity and species diversity. Our results suggest that lineage-specific variations in genome and karyotype stability can account for different levels of species diversity in mammals.


2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Jongin Lee ◽  
Daehwan Lee ◽  
Mikang Sim ◽  
Daehong Kwon ◽  
Juyeon Kim ◽  
...  

2016 ◽  
Vol 15 (4) ◽  
pp. 343-353 ◽  
Author(s):  
Jose A. Arjona-Medina ◽  
Oswaldo Trelles
