Super short operations on both gene order and intergenic sizes

2019 ◽  
Vol 14 (1) ◽  
Author(s):  
Andre R. Oliveira ◽  
Géraldine Jean ◽  
Guillaume Fertin ◽  
Ulisses Dias ◽  
Zanoni Dias

Abstract Background The evolutionary distance between two genomes can be estimated by computing a minimum length sequence of operations, called genome rearrangements, that transform one genome into another. Usually, a genome is modeled as an ordered sequence of genes, and most of the studies in the genome rearrangement literature consist in shaping biological scenarios into mathematical models. Examples include allowing different genome rearrangement operations at the same time, adding constraints to these rearrangements (e.g., each rearrangement can affect at most a given number of genes), and considering that a rearrangement implies a cost depending on its length rather than a unit cost. Most of the works, however, have overlooked some important features inside genomes, such as the presence of sequences of nucleotides between genes, called intergenic regions. Results and conclusions In this work, we investigate the problem of computing the distance between two genomes, taking into account both gene order and intergenic sizes. The genome rearrangement operations we consider here are constrained types of reversals and transpositions, called super short reversals (SSRs) and super short transpositions (SSTs), which affect up to two (consecutive) genes. We denote by super short operations (SSOs) any SSR or SST. We present 3-approximation algorithms for SSRs, SSTs, and SSOs when the orientation of the genes is not considered, and 5-approximation algorithms for SSRs and SSOs when the orientation is considered. We also show that these algorithms improve their approximation factors when the input permutation has a higher number of inversions, where the approximation factor decreases from 3 to either 2 or 1.5, and from 5 to either 3 or 2.
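The abstract above refers to the number of inversions of the input permutation and to operations that affect at most two consecutive genes. The following is a minimal sketch of those two notions only, assuming an unsigned permutation and ignoring intergenic sizes entirely; the function names `count_inversions` and `swap_adjacent` are hypothetical and are not taken from the paper.

```python
def count_inversions(perm):
    """Count pairs (i, j) with i < j and perm[i] > perm[j].

    Quadratic on purpose: clarity over speed.
    """
    n = len(perm)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if perm[i] > perm[j]
    )


def swap_adjacent(perm, i):
    """Return a copy of perm with positions i and i+1 exchanged.

    Illustrative assumption: on an unsigned permutation, an operation
    acting on two consecutive genes exchanges them. Intergenic sizes,
    which the paper also models, are not represented here.
    """
    out = list(perm)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out
```

Swapping two adjacent out-of-order elements removes exactly one inversion, which is why the inversion count is a natural measure of progress in this simplified setting.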

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Klairton L. Brito ◽  
Andre R. Oliveira ◽  
Alexsandro O. Alexandrino ◽  
Ulisses Dias ◽  
Zanoni Dias

Abstract Background In the comparative genomics field, one of the goals is to estimate a sequence of genetic changes capable of transforming a genome into another. Genome rearrangement events are mutations that can alter the genetic content or the arrangement of elements from the genome. Reversal and transposition are two of the most studied genome rearrangement events. A reversal inverts a segment of a genome while a transposition swaps two consecutive segments. Initial studies in the area considered only the order of the genes. Recent works have incorporated other genetic information in the model, in particular the size of intergenic regions, which are structures between each pair of genes and in the extremities of a linear genome. Results and conclusions In this work, we investigate the sorting by intergenic reversals and transpositions problem on genomes sharing the same set of genes, considering the cases where the orientation of genes is known and unknown. We also explore a variant of the problem that generalizes the transposition event. As a result, we present an approximation algorithm that guarantees an approximation factor of 4 for both cases considering the reversal and transposition (classic definition) events, an improvement from the 4.5-approximation previously known for the scenario where the orientation of the genes is unknown. We also present a 3-approximation algorithm by incorporating the generalized transposition event, and we propose a greedy strategy to improve the performance of the algorithms. We performed practical tests on simulated data, which indicated that the algorithms, in both cases, tend to perform better than the best-known algorithms for the problem. Lastly, we conducted experiments using real genomes to demonstrate the applicability of the algorithms.
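The two events defined in the abstract above (a reversal inverts a segment, a transposition swaps two consecutive segments) can be sketched on a plain list of genes. This is an illustration under simplifying assumptions: gene orientation and intergenic region sizes are not modeled, and the function names and index conventions are hypothetical.

```python
def reversal(genome, i, j):
    """Invert the segment genome[i..j] (0-based, both ends inclusive)."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


def transposition(genome, i, j, k):
    """Swap the consecutive segments genome[i:j] and genome[j:k].

    Requires i < j < k so that both segments are non-empty.
    """
    if not i < j < k:
        raise ValueError("expected i < j < k")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]
```

For example, `reversal([1, 2, 3, 4, 5], 1, 3)` inverts the middle three genes, and `transposition([1, 2, 3, 4, 5], 1, 3, 5)` exchanges the blocks `[2, 3]` and `[4, 5]`.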


2019 ◽  
Vol 35 (14) ◽  
pp. i117-i126 ◽  
Author(s):  
Krister M Swenson ◽  
Mathieu Blanchette

Abstract Motivation Genome rearrangements drastically change gene order along great stretches of a chromosome. There has been initial evidence that these apparently non-local events in the 1D sense may have breakpoints that are close in the 3D sense. We harness the power of the Double Cut and Join model of genome rearrangement, along with Hi-C chromosome conformation capture data to test this hypothesis between human and mouse. Results We devise novel statistical tests that show that indeed, rearrangement scenarios that transform the human into the mouse gene order are enriched for pairs of breakpoints that have frequent chromosome interactions. This is observed for both intra-chromosomal breakpoint pairs, as well as for inter-chromosomal pairs. For intra-chromosomal rearrangements, the enrichment exists from close (<20 Mb) to very distant (100 Mb) pairs. Further, the pattern exists across multiple cell lines in Hi-C data produced by different laboratories and at different stages of the cell cycle. We show that similarities in the contact frequencies between these many experiments contribute to the enrichment. We conclude that either (i) rearrangements usually involve breakpoints that are spatially close or (ii) there is selection against rearrangements that act on spatially distant breakpoints. Availability and implementation Our pipeline is freely available at https://bitbucket.org/thekswenson/locality. Supplementary information Supplementary data are available at Bioinformatics online.


2017 ◽  
Vol 15 (01) ◽  
pp. 1750002 ◽  
Author(s):  
Carla Negri Lintzmayer ◽  
Guillaume Fertin ◽  
Zanoni Dias

Some interesting combinatorial problems have been motivated by genome rearrangements, which are mutations that affect large portions of a genome. When we represent genomes as permutations, the goal is to transform a given permutation into the identity permutation with the minimum number of rearrangements. When they affect segments from the beginning (respectively end) of the permutation, they are called prefix (respectively suffix) rearrangements. This paper presents results for rearrangement problems that involve prefix and suffix versions of reversals and transpositions considering unsigned and signed permutations. We give 2-approximation and ([Formula: see text])-approximation algorithms for these problems, where [Formula: see text] is a constant divided by the number of breakpoints (pairs of consecutive elements that should not be consecutive in the identity permutation) in the input permutation. We also give bounds for the diameters concerning these problems and provide ways of improving the practical results of our algorithms.
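The abstract above describes breakpoints as pairs of consecutive elements that should not be consecutive in the identity permutation, and prefix rearrangements as operations acting on the beginning of the permutation. The sketch below illustrates both for unsigned permutations. It assumes one common convention (extending the permutation with 0 and n+1 and treating a pair as adjacent when its values differ by exactly 1); the paper's exact definitions for each problem variant may differ, and the function names are hypothetical.

```python
def count_breakpoints(perm):
    """Count breakpoints of an unsigned permutation of 1..n.

    Assumed convention: extend with 0 at the front and n+1 at the end,
    then count consecutive pairs whose values do not differ by 1.
    """
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def prefix_reversal(perm, k):
    """Reverse the first k elements of perm (a prefix reversal)."""
    return perm[:k][::-1] + perm[k:]
```

Under this convention the identity permutation has no breakpoints, so the breakpoint count gives a simple measure of how far a permutation is from being sorted.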


Author(s):  
Jing Tang ◽  
Xueyan Tang ◽  
Andrew Lim ◽  
Kai Han ◽  
Chongshou Li ◽  
...  

Monotone submodular maximization with a knapsack constraint is NP-hard. Various approximation algorithms have been devised to address this optimization problem. In this paper, we revisit the widely known modified greedy algorithm. First, we show that this algorithm can achieve an approximation factor of 0.405, which significantly improves the known factors of 0.357 given by Wolsey and (1-1/e)/2 \approx 0.316 given by Khuller et al. More importantly, our analysis closes a gap in Khuller et al.'s proof for the extensively mentioned approximation factor of (1-1/\sqrt{e}) \approx 0.393 in the literature to clarify a long-standing misconception on this issue. Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum. We empirically demonstrate the tightness of our upper bound with a real-world application. The bound enables us to obtain a data-dependent ratio typically much higher than 0.405 between the solution value of the modified greedy algorithm and the optimum. It can also be used to significantly improve the efficiency of algorithms such as branch and bound.
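As a rough illustration of the kind of algorithm the abstract above revisits, the sketch below follows one commonly described form of the modified greedy: build a solution by repeatedly taking the element with the best marginal gain per unit cost (skipping elements that no longer fit), then return the better of that solution and the best single feasible element. This is an assumption about the variant, not a transcription of the paper. The objective here is set coverage, chosen only because it is monotone submodular; `coverage`, `modified_greedy`, and the input layout are hypothetical, and all costs are assumed positive.

```python
def coverage(sets, chosen):
    """Number of distinct items covered by the chosen set names."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def modified_greedy(sets, costs, budget):
    """Density greedy plus best-single-element comparison (a sketch)."""
    chosen, spent = [], 0.0
    remaining = sorted(sets)  # sorted for deterministic tie-breaking
    while remaining:
        base = coverage(sets, chosen)
        # Element with the largest marginal gain per unit cost.
        best = max(
            remaining,
            key=lambda s: (coverage(sets, chosen + [s]) - base) / costs[s],
        )
        remaining.remove(best)
        if spent + costs[best] <= budget:  # keep it only if it fits
            chosen.append(best)
            spent += costs[best]
    # Compare against the best single element that fits the budget.
    feasible = [s for s in sorted(sets) if costs[s] <= budget]
    single = max(feasible, key=lambda s: coverage(sets, [s]), default=None)
    if single is not None and coverage(sets, [single]) > coverage(sets, chosen):
        return [single]
    return chosen
```

With `sets = {"a": {1, 2}, "b": {3, 4, 5, 6, 7, 8, 9, 10}}`, `costs = {"a": 1, "b": 10}`, and a budget of 10, the density rule takes the cheap set first and then cannot afford the large one, so the single-element comparison is what recovers the better answer.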


2020 ◽  
Vol 18 (02) ◽  
pp. 2050006 ◽  
Author(s):  
Alexsandro Oliveira Alexandrino ◽  
Carla Negri Lintzmayer ◽  
Zanoni Dias

One of the main problems in Computational Biology is to find the evolutionary distance among species. In most approaches, such distance only involves rearrangements, which are mutations that alter large pieces of the species’ genome. When we represent genomes as permutations, the problem of transforming one genome into another is equivalent to the problem of Sorting Permutations by Rearrangement Operations. The traditional approach is to consider that any rearrangement has the same probability to happen, and so, the goal is to find a minimum sequence of operations which sorts the permutation. However, studies have shown that some rearrangements are more likely to happen than others, and so a weighted approach is more realistic. In a weighted approach, the goal is to find a sequence which sorts the permutations, such that the cost of that sequence is minimum. This work introduces a new type of cost function, which is related to the amount of fragmentation caused by a rearrangement. We present some results about the lower and upper bounds for the fragmentation-weighted problems and the relation between the unweighted and the fragmentation-weighted approach. Our main results are 2-approximation algorithms for five versions of this problem involving reversals and transpositions. We also give bounds for the diameters concerning these problems and provide an improved approximation factor for simple permutations considering transpositions.


Development ◽  
2021 ◽  
Author(s):  
Zoe L. Grant ◽  
Peter F. Hickey ◽  
Waruni Abeysekera ◽  
Lachlan Whitehead ◽  
Sabrina M. Lewis ◽  
...  

Blood vessel growth and remodelling are essential during embryonic development and disease pathogenesis. The diversity of endothelial cells (ECs) is transcriptionally evident and ECs undergo dynamic changes in gene expression during vessel growth and remodelling. Here, we investigated the role of the histone acetyltransferase HBO1 (KAT7), which is important for activating genes during development and histone H3 lysine 14 acetylation (H3K14ac). Loss of HBO1 and H3K14ac impaired developmental sprouting angiogenesis and reduced pathological EC overgrowth in the retinal endothelium. Single-cell RNA-sequencing of retinal ECs revealed an increased abundance of tip cells in Hbo1-deleted retinas, which led to EC overcrowding in the retinal sprouting front and prevented efficient tip cell migration. We found that H3K14ac was highly abundant in the endothelial genome in both intra- and intergenic regions, suggesting that HBO1 acts as a genome organiser that promotes efficient tip cell behaviour necessary for sprouting angiogenesis.


2013 ◽  
Vol 23 (06) ◽  
pp. 461-477 ◽  
Author(s):  
MINATI DE ◽  
GAUTAM K. DAS ◽  
PAZ CARMI ◽  
SUBHAS C. NANDY

In this paper, we consider constant factor approximation algorithms for a variant of the discrete piercing set problem for unit disks. Here a set of points P is given; the objective is to choose the minimum number of points in P to pierce the unit disks centered at all the points in P. We first propose a very simple algorithm that produces a 12-approximation in O(n log n) time. Next, we improve the approximation factor to 4 and then to 3. The worst case running times of these algorithms are O(n^8 log n) and O(n^15 log n) respectively. Apart from the space required for storing the input, the extra work-space requirement for each of these algorithms is O(1). Finally, we propose a PTAS for the same problem. Given a positive integer k, it can produce a solution with performance ratio [Formula: see text] in n^{O(k)} time.


2019 ◽  
Vol 47 (18) ◽  
pp. 9741-9760 ◽  
Author(s):  
V Talya Yerlici ◽  
Michael W Lu ◽  
Carla R Hoge ◽  
Richard V Miller ◽  
Rafik Neme ◽  
...  

Abstract Extrachromosomal circular DNA (eccDNA) is both a driver of eukaryotic genome instability and a product of programmed genome rearrangements, but its extent had not been surveyed in Oxytricha, a ciliate with elaborate DNA elimination and translocation during development. Here, we captured rearrangement-specific circular DNA molecules across the genome to gain insight into its processes of programmed genome rearrangement. We recovered thousands of circularly excised Tc1/mariner-type transposable elements and high confidence non-repetitive germline-limited loci. We verified their bona fide circular topology using circular DNA deep-sequencing, 2D gel electrophoresis and inverse polymerase chain reaction. In contrast to the precise circular excision of transposable elements, we report widespread heterogeneity in the circular excision of non-repetitive germline-limited loci. We also demonstrate that circular DNAs are transcribed in Oxytricha, producing rearrangement-specific long non-coding RNAs. The programmed formation of thousands of eccDNA molecules makes Oxytricha a model system for studying nucleic acid topology. It also suggests involvement of eccDNA in programmed genome rearrangement.

