Super short operations on both gene order and intergenic sizes

Abstract. Background: The evolutionary distance between two genomes can be estimated by computing a minimum length sequence of operations, called genome rearrangements, that transforms one genome into another. Usually, a genome is modeled as an ordered sequence of genes, and most of the studies in the genome rearrangement literature consist in shaping biological scenarios into mathematical models, for instance by allowing different genome rearrangement operations at the same time, by adding constraints to these rearrangements (e.g., each rearrangement can affect at most a given number of genes), or by considering that a rearrangement implies a cost depending on its length rather than a unit cost. Most of the works, however, have overlooked some important features inside genomes, such as the presence of sequences of nucleotides between genes, called intergenic regions. Results and conclusions: In this work, we investigate the problem of computing the distance between two genomes, taking into account both gene order and intergenic sizes. The genome rearrangement operations we consider here are constrained types of reversals and transpositions, called super short reversals (SSRs) and super short transpositions (SSTs), which affect up to two (consecutive) genes. We denote by super short operations (SSOs) any SSR or SST. We show 3-approximation algorithms for SSRs, SSTs, or SSOs when the orientation of the genes is not considered, and 5-approximation algorithms for either SSRs or SSOs when the orientation is considered. We also show that these algorithms improve their approximation factors when the input permutation has a higher number of inversions, where the approximation factor decreases from 3 to either 2 or 1.5, and from 5 to either 3 or 2.
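The role of inversions mentioned in the abstract can be illustrated with a small sketch. It deliberately ignores intergenic sizes and gene orientation, so it is not the paper's algorithm: it only shows that, on an unsigned permutation, swapping two adjacent out-of-order genes removes exactly one inversion. The function names `count_inversions` and `sort_by_adjacent_swaps` are hypothetical.

```python
def count_inversions(perm):
    """Count pairs (i, j) with i < j and perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def sort_by_adjacent_swaps(perm):
    """Sort a permutation using only swaps of two adjacent elements.

    Assumption: gene order only; intergenic sizes and orientation are ignored.
    Returns the sorted list and the positions where swaps were applied.
    """
    perm = list(perm)
    swaps = []
    changed = True
    while changed:
        changed = False
        for i in range(len(perm) - 1):
            if perm[i] > perm[i + 1]:
                # Each such swap removes exactly one inversion.
                perm[i], perm[i + 1] = perm[i + 1], perm[i]
                swaps.append(i)
                changed = True
    return perm, swaps


genes = [3, 1, 2, 5, 4]
sorted_genes, swaps = sort_by_adjacent_swaps(genes)
print(count_inversions(genes), len(swaps))  # both equal 3
```

In this simplified setting the number of swaps equals the number of inversions; once intergenic sizes must also match, as in the paper's model, extra operations may be needed.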

An improved approximation algorithm for the reversal and transposition distance considering gene order and intergenic sizes

Algorithms for Molecular Biology ◽

10.1186/s13015-021-00203-7 ◽

2021 ◽

Vol 16 (1) ◽

Author(s):

Klairton L. Brito ◽

Andre R. Oliveira ◽

Alexsandro O. Alexandrino ◽

Ulisses Dias ◽

Zanoni Dias

Keyword(s):

Approximation Algorithm ◽

Genetic Information ◽

Genome Rearrangement ◽

Simulated Data ◽

Genetic Changes ◽

Greedy Strategy ◽

Approximation Factor ◽

Transposition Event ◽

A Genome ◽

Intergenic Regions

Abstract. Background: In the comparative genomics field, one of the goals is to estimate a sequence of genetic changes capable of transforming one genome into another. Genome rearrangement events are mutations that can alter the genetic content or the arrangement of elements in the genome. Reversal and transposition are two of the most studied genome rearrangement events. A reversal inverts a segment of a genome, while a transposition swaps two consecutive segments. Initial studies in the area considered only the order of the genes. Recent works have incorporated other genetic information in the model, in particular the size of intergenic regions, which are structures between each pair of genes and in the extremities of a linear genome. Results and conclusions: In this work, we investigate the sorting by intergenic reversals and transpositions problem on genomes sharing the same set of genes, considering the cases where the orientation of genes is known and unknown. We also explore a variant of the problem that generalizes the transposition event. As a result, we present an approximation algorithm that guarantees an approximation factor of 4 for both cases considering the reversal and transposition (classic definition) events, an improvement from the 4.5-approximation previously known for the scenario where the orientation of the genes is unknown. We also present a 3-approximation algorithm that incorporates the generalized transposition event, and we propose a greedy strategy to improve the performance of the algorithms. Practical tests on simulated data indicated that the algorithms, in both cases, tend to perform better than the best-known algorithms for the problem. Lastly, we conducted experiments using real genomes to demonstrate the applicability of the algorithms.
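The two events described in the abstract can be sketched on a plain list of genes. This is an illustrative assumption-laden version: it models gene order only, leaves out intergenic sizes and gene orientation, and uses half-open index ranges. The function names `reversal` and `transposition` are hypothetical.

```python
def reversal(genome, i, j):
    """Invert the segment genome[i:j] (half-open range)."""
    return genome[:i] + genome[i:j][::-1] + genome[j:]


def transposition(genome, i, j, k):
    """Swap the two consecutive segments genome[i:j] and genome[j:k]."""
    if not (0 <= i < j < k <= len(genome)):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


genome = [1, 2, 3, 4, 5, 6]
print(reversal(genome, 1, 4))          # [1, 4, 3, 2, 5, 6]
print(transposition(genome, 1, 3, 5))  # [1, 4, 5, 2, 3, 6]
```

In the intergenic model, each operation would additionally have to specify where the affected intergenic regions are cut, which this sketch does not attempt.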

Large-scale mammalian genome rearrangements coincide with chromatin interactions

Bioinformatics ◽

10.1093/bioinformatics/btz343 ◽

2019 ◽

Vol 35 (14) ◽

pp. i117-i126 ◽

Cited By ~ 1

Author(s):

Krister M Swenson ◽

Mathieu Blanchette

Keyword(s):

Gene Order ◽

Large Scale ◽

Genome Rearrangement ◽

Statistical Tests ◽

Chromosomal Rearrangements ◽

Genome Rearrangements ◽

Supplementary Information ◽

Chromosome Conformation ◽

Chromatin Interactions ◽

Multiple Cell

Abstract. Motivation: Genome rearrangements drastically change gene order along great stretches of a chromosome. There has been initial evidence that these apparently non-local events in the 1D sense may have breakpoints that are close in the 3D sense. We harness the power of the Double Cut and Join model of genome rearrangement, along with Hi-C chromosome conformation capture data, to test this hypothesis between human and mouse. Results: We devise novel statistical tests that show that indeed, rearrangement scenarios that transform the human into the mouse gene order are enriched for pairs of breakpoints that have frequent chromosome interactions. This is observed for both intra-chromosomal breakpoint pairs, as well as for inter-chromosomal pairs. For intra-chromosomal rearrangements, the enrichment exists from close (<20 Mb) to very distant (100 Mb) pairs. Further, the pattern exists across multiple cell lines in Hi-C data produced by different laboratories and at different stages of the cell cycle. We show that similarities in the contact frequencies between these many experiments contribute to the enrichment. We conclude that either (i) rearrangements usually involve breakpoints that are spatially close or (ii) there is selection against rearrangements that act on spatially distant breakpoints. Availability and implementation: Our pipeline is freely available at https://bitbucket.org/thekswenson/locality. Supplementary information: Supplementary data are available at Bioinformatics online.

Sorting permutations by prefix and suffix rearrangements

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720017500020 ◽

2017 ◽

Vol 15 (01) ◽

pp. 1750002 ◽

Cited By ~ 3

Author(s):

Carla Negri Lintzmayer ◽

Guillaume Fertin ◽

Zanoni Dias

Keyword(s):

Approximation Algorithms ◽

Genome Rearrangements ◽

Combinatorial Problems ◽

Signed Permutations ◽

Identity Permutation ◽

A Genome ◽

Minimum Number

Some interesting combinatorial problems have been motivated by genome rearrangements, which are mutations that affect large portions of a genome. When we represent genomes as permutations, the goal is to transform a given permutation into the identity permutation with the minimum number of rearrangements. When they affect segments from the beginning (respectively end) of the permutation, they are called prefix (respectively suffix) rearrangements. This paper presents results for rearrangement problems that involve prefix and suffix versions of reversals and transpositions considering unsigned and signed permutations. We give 2-approximation and ([Formula: see text])-approximation algorithms for these problems, where [Formula: see text] is a constant divided by the number of breakpoints (pairs of consecutive elements that should not be consecutive in the identity permutation) in the input permutation. We also give bounds for the diameters concerning these problems and provide ways of improving the practical results of our algorithms.
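The breakpoint notion and the prefix restriction from the abstract can be sketched as follows. The conventions here are assumptions made for illustration, not necessarily the ones used in the paper: the permutation is unsigned, it is extended with 0 and n+1 at its ends, and two neighbours form an adjacency when their values differ by exactly 1. The names `count_breakpoints` and `prefix_reversal` are hypothetical.

```python
def count_breakpoints(perm):
    """Count neighbouring pairs that are not consecutive in the identity.

    Assumed convention: unsigned permutation of 1..n, extended with 0 and n+1.
    """
    extended = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if abs(a - b) != 1)


def prefix_reversal(perm, j):
    """Reverse the first j elements, so the operation always touches position 0."""
    return perm[:j][::-1] + perm[j:]


perm = [3, 1, 2, 5, 4]
print(count_breakpoints(perm))   # 4
print(prefix_reversal(perm, 3))  # [2, 1, 3, 5, 4]
```

The identity permutation has no breakpoints under this convention, which is why breakpoint counts are a natural progress measure for sorting.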

Genome Rearrangements on Both Gene Order and Intergenic Regions

Lecture Notes in Computer Science - Algorithms in Bioinformatics ◽

10.1007/978-3-319-43681-4_13 ◽

2016 ◽

pp. 162-173 ◽

Cited By ~ 2

Author(s):

Guillaume Fertin ◽

Géraldine Jean ◽

Eric Tannier

Keyword(s):

Gene Order ◽

Genome Rearrangements ◽

Intergenic Regions

Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3447386 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-22

Author(s):

Jing Tang ◽

Xueyan Tang ◽

Andrew Lim ◽

Kai Han ◽

Chongshou Li ◽

...

Keyword(s):

Approximation Algorithms ◽

Greedy Algorithm ◽

Branch And Bound ◽

Upper Bound ◽

Optimization Problem ◽

Approximation Factor ◽

Real World Application ◽

Efficiency Of Algorithms ◽

Knapsack Constraint ◽

Submodular Maximization

Monotone submodular maximization with a knapsack constraint is NP-hard. Various approximation algorithms have been devised to address this optimization problem. In this paper, we revisit the widely known modified greedy algorithm. First, we show that this algorithm can achieve an approximation factor of 0.405, which significantly improves the known factors of 0.357 given by Wolsey and $(1-1/e)/2 \approx 0.316$ given by Khuller et al. More importantly, our analysis closes a gap in Khuller et al.'s proof for the extensively mentioned approximation factor of $(1-1/\sqrt{e}) \approx 0.393$ in the literature to clarify a long-standing misconception on this issue. Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum. We empirically demonstrate the tightness of our upper bound with a real-world application. The bound enables us to obtain a data-dependent ratio typically much higher than 0.405 between the solution value of the modified greedy algorithm and the optimum. It can also be used to significantly improve the efficiency of algorithms such as branch and bound.
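A minimal sketch of one common formulation of the modified greedy algorithm is given below, under stated assumptions: the greedy phase repeatedly takes the item with the best marginal gain per unit cost and skips items that no longer fit, and the final answer is the better of that greedy set and the best single feasible item. The abstract does not spell out these details, so this should be read as an illustration rather than the authors' exact procedure. The coverage objective and all names (`modified_greedy`, `coverage`, `cover`) are hypothetical.

```python
def modified_greedy(items, cost, f, budget):
    """Density greedy plus best-single-item fallback (illustrative sketch).

    items: list of item ids; cost: dict of positive costs;
    f: monotone submodular set function taking a list of items.
    """
    chosen, spent = [], 0.0
    remaining = list(items)
    while remaining:
        base = f(chosen)
        # Item with the largest marginal gain per unit cost.
        best = max(remaining, key=lambda x: (f(chosen + [x]) - base) / cost[x])
        remaining.remove(best)
        if spent + cost[best] <= budget:  # skip items that do not fit
            chosen.append(best)
            spent += cost[best]
    feasible = [x for x in items if cost[x] <= budget]
    best_single = max(feasible, key=lambda x: f([x]), default=None)
    if best_single is not None and f([best_single]) > f(chosen):
        return [best_single]
    return chosen


# Toy coverage objective: f(S) = number of elements covered by S.
cover = {"small": {1}, "big": set(range(2, 11))}
cost = {"small": 1, "big": 10}


def coverage(selection):
    return len(set().union(*(cover[x] for x in selection)))


print(modified_greedy(["small", "big"], cost, coverage, budget=10))  # ['big']
```

The toy instance shows why the fallback matters: plain density greedy picks "small" first and then cannot afford "big", while the single item "big" alone covers nine elements.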

Sorting permutations by fragmentation-weighted operations

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720020500067 ◽

2020 ◽

Vol 18 (02) ◽

pp. 2050006 ◽

Cited By ~ 1

Author(s):

Alexsandro Oliveira Alexandrino ◽

Carla Negri Lintzmayer ◽

Zanoni Dias

Keyword(s):

Approximation Algorithms ◽

Computational Biology ◽

Cost Function ◽

Traditional Approach ◽

Upper Bounds ◽

Evolutionary Distance ◽

Lower And Upper Bounds ◽

Approximation Factor ◽

New Type ◽

The Cost

One of the main problems in Computational Biology is to find the evolutionary distance among species. In most approaches, such distance only involves rearrangements, which are mutations that alter large pieces of the species’ genome. When we represent genomes as permutations, the problem of transforming one genome into another is equivalent to the problem of Sorting Permutations by Rearrangement Operations. The traditional approach is to consider that all rearrangements are equally likely to occur, and so the goal is to find a minimum sequence of operations that sorts the permutation. However, studies have shown that some rearrangements are more likely to occur than others, and so a weighted approach is more realistic. In a weighted approach, the goal is to find a sequence that sorts the permutation such that the cost of that sequence is minimum. This work introduces a new type of cost function, which is related to the amount of fragmentation caused by a rearrangement. We present some results about the lower and upper bounds for the fragmentation-weighted problems and the relation between the unweighted and the fragmentation-weighted approach. Our main results are 2-approximation algorithms for five versions of this problem involving reversals and transpositions. We also give bounds for the diameters concerning these problems and provide an improved approximation factor for simple permutations considering transpositions.

The complete mitochondrial genome of the entomopathogenic fungus Metarhizium anisopliae var. anisopliae: gene order and trn gene clusters reveal a common evolutionary course for all Sordariomycetes, while intergenic regions show variation

Archives of Microbiology ◽

10.1007/s00203-006-0104-x ◽

2006 ◽

Vol 185 (5) ◽

pp. 393-401 ◽

Cited By ~ 41

Author(s):

Dimitri V. Ghikas ◽

Vassili N. Kouvelis ◽

Milton A. Typas

Keyword(s):

Mitochondrial Genome ◽

Gene Order ◽

Metarhizium Anisopliae ◽

Entomopathogenic Fungus ◽

Complete Mitochondrial Genome ◽

Gene Clusters ◽

Intergenic Regions

The histone acetyltransferase HBO1 (KAT7) promotes efficient tip cell sprouting during angiogenesis

Development ◽

10.1242/dev.199581 ◽

2021 ◽

Author(s):

Zoe L. Grant ◽

Peter F. Hickey ◽

Waruni Abeysekera ◽

Lachlan Whitehead ◽

Sabrina M. Lewis ◽

...

Keyword(s):

Histone Acetyltransferase ◽

Dynamic Changes ◽

Cell Behaviour ◽

Tip Cell ◽

Sprouting Angiogenesis ◽

A Genome ◽

Vessel Growth ◽

Intergenic Regions ◽

Tip Cells

Blood vessel growth and remodelling are essential during embryonic development and disease pathogenesis. The diversity of endothelial cells (ECs) is transcriptionally evident and ECs undergo dynamic changes in gene expression during vessel growth and remodelling. Here, we investigated the role of the histone acetyltransferase HBO1 (KAT7), which is important for activating genes during development and for histone H3 lysine 14 acetylation (H3K14ac). Loss of HBO1 and H3K14ac impaired developmental sprouting angiogenesis and reduced pathological EC overgrowth in the retinal endothelium. Single-cell RNA-sequencing of retinal ECs revealed an increased abundance of tip cells in Hbo1 deleted retinas, which led to EC overcrowding in the retinal sprouting front and prevented efficient tip cell migration. We found that H3K14ac was highly abundant in the endothelial genome in both intra- and intergenic regions, suggesting that HBO1 acts as a genome organiser that promotes efficient tip cell behaviour necessary for sprouting angiogenesis.

Approximation Algorithms for a Variant of Discrete Piercing Set Problem for Unit Disks

International Journal of Computational Geometry & Applications ◽

10.1142/s021819591350009x ◽

2013 ◽

Vol 23 (06) ◽

pp. 461-477 ◽

Cited By ~ 7

Author(s):

Minati De ◽

Gautam K. Das ◽

Paz Carmi ◽

Subhas C. Nandy

Keyword(s):

Approximation Algorithms ◽

Simple Algorithm ◽

Constant Factor ◽

Performance Ratio ◽

Approximation Result ◽

Worst Case ◽

Approximation Factor ◽

Minimum Number ◽

Unit Disks ◽

Set Of Points

In this paper, we consider constant factor approximation algorithms for a variant of the discrete piercing set problem for unit disks. Here a set of points P is given; the objective is to choose the minimum number of points in P to pierce the unit disks centered at all the points in P. We first propose a very simple algorithm that produces a 12-approximation result in O(n log n) time. Next, we improve the approximation factor to 4 and then to 3. The worst-case running times of these algorithms are O(n^8 log n) and O(n^15 log n), respectively. Apart from the space required for storing the input, the extra work-space requirement for each of these algorithms is O(1). Finally, we propose a PTAS for the same problem. Given a positive integer k, it can produce a solution with performance ratio [Formula: see text] in n^O(k) time.
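The problem statement can be made concrete with a short sketch. It is not any of the paper's algorithms and carries no approximation guarantee here: it is a simple one-pass heuristic that yields a feasible piercing set, plus a checker for feasibility. It assumes that "unit disk" means a closed disk of radius 1; the names `pierces`, `greedy_piercing_set` and `is_piercing_set` are hypothetical.

```python
import math


def pierces(p, q, radius=1.0):
    """True if point p lies in the closed disk of the given radius centered at q."""
    return math.dist(p, q) <= radius


def greedy_piercing_set(points, radius=1.0):
    """One-pass heuristic: pick a point only if its own disk is not yet pierced."""
    chosen = []
    for p in points:
        if not any(pierces(c, p, radius) for c in chosen):
            chosen.append(p)
    return chosen


def is_piercing_set(chosen, points, radius=1.0):
    """Check that every disk centered at a point of P contains a chosen point."""
    return all(any(pierces(c, p, radius) for c in chosen) for p in points)


P = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.5, 0.5)]
S = greedy_piercing_set(P)
print(S, is_piercing_set(S, P))  # [(0.0, 0.0), (3.0, 0.0)] True
```

The output is always feasible because every skipped point was already pierced when it was examined, and every chosen point pierces its own disk.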

Programmed genome rearrangements in Oxytricha produce transcriptionally active extrachromosomal circular DNA

Nucleic Acids Research ◽

10.1093/nar/gkz725 ◽

2019 ◽

Vol 47 (18) ◽

pp. 9741-9760 ◽

Cited By ~ 3

Author(s):

V Talya Yerlici ◽

Michael W Lu ◽

Carla R Hoge ◽

Richard V Miller ◽

Rafik Neme ◽

...

Keyword(s):

Transposable Elements ◽

Genome Rearrangement ◽

Genome Instability ◽

Genome Rearrangements ◽

Circular Dna ◽

Dna Elimination ◽

Circular Dnas ◽

Bona Fide ◽

Polymerase Chain ◽

2D Gel

Abstract. Extrachromosomal circular DNA (eccDNA) is both a driver of eukaryotic genome instability and a product of programmed genome rearrangements, but its extent had not been surveyed in Oxytricha, a ciliate with elaborate DNA elimination and translocation during development. Here, we captured rearrangement-specific circular DNA molecules across the genome to gain insight into its processes of programmed genome rearrangement. We recovered thousands of circularly excised Tc1/mariner-type transposable elements and high confidence non-repetitive germline-limited loci. We verified their bona fide circular topology using circular DNA deep-sequencing, 2D gel electrophoresis and inverse polymerase chain reaction. In contrast to the precise circular excision of transposable elements, we report widespread heterogeneity in the circular excision of non-repetitive germline-limited loci. We also demonstrate that circular DNAs are transcribed in Oxytricha, producing rearrangement-specific long non-coding RNAs. The programmed formation of thousands of eccDNA molecules makes Oxytricha a model system for studying nucleic acid topology. It also suggests involvement of eccDNA in programmed genome rearrangement.
