OPTIMAL, EFFICIENT RECONSTRUCTION OF PHYLOGENETIC NETWORKS WITH CONSTRAINED RECOMBINATION

A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not tree-like. In a seminal paper, Wang et al. [1] studied the problem of constructing a phylogenetic network, allowing recombination between sequences, with the constraint that the resulting cycles must be disjoint. We call such a phylogenetic network a "galled-tree". They gave a polynomial-time algorithm that was intended to determine whether or not a set of sequences could be generated on a galled-tree. Unfortunately, the algorithm by Wang et al. [1] is incomplete and does not constitute a necessary test for the existence of a galled-tree for the data. In this paper, we completely solve the problem. Moreover, we prove that if there is a galled-tree, then the one produced by our algorithm minimizes the number of recombinations over all phylogenetic networks for the data, even allowing multiple-crossover recombinations. We also prove that when there is a galled-tree for the data, the galled-tree minimizing the number of recombinations is "essentially unique". We also note two additional results: first, any set of sequences that can be derived on a galled-tree can be derived on a true tree (without recombination cycles), where at most one back mutation per site is allowed; second, the site compatibility problem (which is NP-hard in general) can be solved in polynomial time for any set of sequences that can be derived on a galled-tree. Perhaps more important than the specific results about galled-trees, we introduce an approach that can be used to study recombination in general phylogenetic networks. This paper greatly extends the conference version that appears in an earlier work [8]. PowerPoint slides of the conference talk can be found at our website [7].
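
The notion of site compatibility mentioned in the abstract can be illustrated with the classical four-gamete test: two binary sites are pairwise compatible when at most three of the four combinations 00, 01, 10 and 11 occur among the sequences. The sketch below shows only that textbook test, not the algorithm of the paper. It assumes binary sequences with no fixed ancestral sequence; the function names and the sample data are hypothetical.

```python
from itertools import combinations


def sites_compatible(sequences, i, j):
    """Four-gamete test: sites i and j conflict only if 00, 01, 10 and 11 all occur."""
    gametes = {(s[i], s[j]) for s in sequences}
    return len(gametes) < 4


def incompatible_pairs(sequences):
    """Return every pair of sites (columns) that fails the four-gamete test."""
    n_sites = len(sequences[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if not sites_compatible(sequences, i, j)
    ]


# Hypothetical data: one string per sequence, one binary character per site.
example = ["000", "010", "100", "110"]
print(incompatible_pairs(example))  # [(0, 1)]
```

A pair reported here cannot be explained by a tree with one mutation per site, which is the situation in which recombination (or back mutation) has to be considered.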

IT IS NL-COMPLETE TO DECIDE WHETHER A HAIRPIN COMPLETION OF REGULAR LANGUAGES IS REGULAR

International Journal of Foundations of Computer Science ◽

10.1142/s0129054111009057 ◽

2011 ◽

Vol 22 (08) ◽

pp. 1813-1828 ◽

Cited By ~ 1

Author(s):

VOLKER DIEKERT ◽

STEFFEN KOPECKI

Keyword(s):

Polynomial Time ◽

Dna Computing ◽

Regular Language ◽

Decision Problem ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Complexity Bound ◽

The One ◽

Context Free ◽

Hairpin Formation

The hairpin completion is an operation on formal languages which is inspired by the hairpin formation in biochemistry. Hairpin formations occur naturally within DNA-computing. It has been known that the hairpin completion of a regular language is linear context-free, but not regular, in general. However, for some time it was open whether the regularity of the hairpin completion of a regular language is decidable. In 2009 this decidability problem was solved positively in [5] by providing a polynomial time algorithm. In this paper we improve the complexity bound by showing that the decision problem is actually NL-complete. This complexity bound holds for both the one-sided and the two-sided hairpin completions.
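
The abstract does not define the operation, so the following is a minimal sketch under the usual definition from the literature: a word of the form γαβᾱ, where ᾱ is the reverse complement of α and |α| = k, is extended on the right to γαβᾱγ̄. The DNA alphabet, the requirement that γ be non-empty, and all function names are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word):
    """Antimorphic involution on DNA words: reverse the word and complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def right_hairpin_completions(word, k):
    """Right-sided hairpin completions of a single word for a binding length k.

    The word is factorized as gamma + alpha + beta + reverse_complement(alpha)
    with len(alpha) == k; each such factorization with non-empty gamma
    (an assumption of this sketch) yields word + reverse_complement(gamma).
    """
    results = set()
    if k < 1 or len(word) < 2 * k:
        return results
    alpha = reverse_complement(word[-k:])  # the suffix of length k plays the role of alpha-bar
    for start in range(1, len(word) - 2 * k + 1):
        if word[start:start + k] == alpha:
            gamma = word[:start]
            results.add(word + reverse_complement(gamma))
    return results


# gamma = "GG", alpha = "AC", beta = "TTT", alpha-bar = "GT"
print(right_hairpin_completions("GGACTTTGT", 2))  # {'GGACTTTGTCC'}
```

Applying the function to every word of a language gives the one-sided completion of that language for the chosen k; the decision problem in the abstract asks whether the resulting language is regular.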

Arc-Completion of 2-Colored Best Match Graphs to Binary-Explainable Best Match Graphs

Algorithms ◽

10.3390/a14040110 ◽

2021 ◽

Vol 14 (4) ◽

pp. 110

Author(s):

David Schaller ◽

Manuela Geiß ◽

Marc Hellmuth ◽

Peter F. Stadler

Keyword(s):

Phylogenetic Tree ◽

Polynomial Time ◽

Binary Tree ◽

A Priori ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Minimum Cardinality ◽

Mathematical Phylogenetics ◽

Special Case

Best match graphs (BMGs) are vertex-colored digraphs that naturally arise in mathematical phylogenetics to formalize the notion of evolutionary closest genes w.r.t. an a priori unknown phylogenetic tree. BMGs are explained by unique least resolved trees. We prove that the property of a rooted, leaf-colored tree to be least resolved for some BMG is preserved by the contraction of inner edges. For the special case of two-colored BMGs, this leads to a characterization of the least resolved trees (LRTs) of binary-explainable trees and a simple, polynomial-time algorithm for the minimum cardinality completion of the arc set of a BMG to reach a BMG that can be explained by a binary tree.

Display Sets of Normal and Tree-Child Networks

The Electronic Journal of Combinatorics ◽

10.37236/9128 ◽

2021 ◽

Vol 28 (1) ◽

Author(s):

Janosch Döcker ◽

Simone Linz ◽

Charles Semple

Keyword(s):

Decision Problem ◽

Phylogenetic Trees ◽

Phylogenetic Network ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Directed Acyclic Graphs ◽

Phylogenetic Networks ◽

Acyclic Graphs ◽

Normal Network ◽

Normal Networks

Phylogenetic networks are leaf-labelled directed acyclic graphs that are used in computational biology to analyse and represent the evolutionary relationships of a set of species or viruses. In contrast to phylogenetic trees, phylogenetic networks have vertices of in-degree at least two that represent reticulation events such as hybridisation, lateral gene transfer, or reassortment. By systematically deleting various combinations of arcs in a phylogenetic network $\mathcal N$, one derives a set of phylogenetic trees that are embedded in $\mathcal N$. We recently showed that the problem of deciding if two binary phylogenetic networks embed the same set of phylogenetic trees is computationally hard; in particular, we showed it to be $\Pi^P_2$-complete. In this paper, we establish a polynomial-time algorithm for this decision problem if the initial two networks consist of a normal network and a tree-child network, two well-studied topologically restricted subclasses of phylogenetic networks, with normal networks being more structurally constrained than tree-child networks. The running time of the algorithm is quadratic in the size of the leaf sets.

RECONSTRUCTING AN ULTRAMETRIC GALLED PHYLOGENETIC NETWORK FROM A DISTANCE MATRIX

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720006002211 ◽

2006 ◽

Vol 04 (04) ◽

pp. 807-832 ◽

Cited By ~ 15

Author(s):

HO-LEUNG CHAN ◽

JESPER JANSSON ◽

TAK-WAH LAM ◽

SIU-MING YIU

Keyword(s):

Phylogenetic Tree ◽

Phylogenetic Network ◽

Distance Matrix ◽

Time Algorithm ◽

Phylogenetic Networks ◽

Np Hard ◽

Reconstruction Problem ◽

Tree Reconstruction ◽

Phylogenetic Tree Reconstruction

Given a distance matrix M that specifies the pairwise evolutionary distances between n species, the phylogenetic tree reconstruction problem asks for an edge-weighted phylogenetic tree that satisfies M, if one exists. We study some extensions of this problem to rooted phylogenetic networks. Our main result is an $O(n^2 \log n)$-time algorithm for determining whether there is an ultrametric galled network that satisfies M, and if so, constructing one. In fact, if such an ultrametric galled network exists, our algorithm is guaranteed to construct one containing the minimum possible number of nodes with more than one parent (hybrid nodes). We also prove that finding a largest possible submatrix M′ of M such that there exists an ultrametric galled network that satisfies M′ is NP-hard. Furthermore, we show that given an incomplete distance matrix (i.e. where some matrix entries are missing), it is also NP-hard to determine whether there exists an ultrametric galled network which satisfies it.
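
For the plain tree case that this problem extends, a distance matrix is satisfied by an ultrametric tree exactly when it obeys the three-point condition: among the three pairwise distances of any three species, the two largest are equal. The sketch below checks only that tree condition; it is not the galled-network algorithm of the paper, and the function name, tolerance parameter and sample matrices are hypothetical.

```python
from itertools import combinations


def is_ultrametric(matrix, tol=1e-9):
    """Check symmetry, zero diagonal and the three-point condition of a distance matrix."""
    n = len(matrix)
    for i in range(n):
        if abs(matrix[i][i]) > tol:
            return False
        for j in range(n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
    for i, j, k in combinations(range(n), 3):
        # Sort the three pairwise distances; the two largest must coincide.
        _, middle, largest = sorted((matrix[i][j], matrix[i][k], matrix[j][k]))
        if largest - middle > tol:
            return False
    return True


tree_like = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
not_tree_like = [[0, 2, 6], [2, 0, 5], [6, 5, 0]]
print(is_ultrametric(tree_like), is_ultrametric(not_tree_like))  # True False
```

The check enumerates all triples and therefore runs in cubic time; a matrix that fails it may still be satisfiable by a galled network, which is the case the abstract addresses.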

A Polynomial-Time Algorithm for Minimizing the Deep Coalescence Cost for Level-1 Species Networks

10.1101/2020.11.04.368845 ◽

2020 ◽

Author(s):

Matthew LeMay ◽

Ran Libeskind-Hadas ◽

Yi-Chieh Wu

Keyword(s):

Polynomial Time ◽

Incomplete Lineage Sorting ◽

Phylogenetic Analyses ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Phylogenetic Networks ◽

Gene Trees ◽

Lineage Sorting ◽

Hybrid Species ◽

Level 1

Phylogenetic analyses commonly assume that the species history can be represented as a tree. However, in the presence of hybridization, the species history is more accurately captured as a network. Despite several advances in modeling phylogenetic networks, there is no known polynomial-time algorithm for parsimoniously reconciling gene trees with species networks while accounting for incomplete lineage sorting. To address this issue, we present a polynomial-time algorithm for the case of level-1 networks, in which no hybrid species is the direct ancestor of another hybrid species. This work enables more efficient reconciliation of gene trees with species networks, which, in turn, enables more efficient reconstruction of species networks.

CONSTRUCTING A MINIMUM PHYLOGENETIC NETWORK FROM A DENSE TRIPLET SET

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720012500138 ◽

2012 ◽

Vol 10 (05) ◽

pp. 1250013 ◽

Cited By ~ 8

Author(s):

MICHEL HABIB ◽

THU-HIEN TO

Keyword(s):

Polynomial Time ◽

Phylogenetic Network ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Complete Answer ◽

Biconnected Components ◽

Polynomial Time Algorithms ◽

Minimum Number

For a given set [Formula: see text] of species and a set [Formula: see text] of triplets on [Formula: see text], we seek to construct a phylogenetic network which is consistent with [Formula: see text], i.e. which represents all triplets of [Formula: see text]. The level of a network is defined as the maximum number of hybrid vertices in its biconnected components. When [Formula: see text] is dense, there exist polynomial time algorithms to construct level-0, 1 and 2 networks (Aho et al., 1981; Jansson, Nguyen and Sung, 2006; Jansson and Sung, 2006; Iersel et al., 2009). For higher levels, partial answers were obtained in the paper by Iersel and Kelk (2008), with a polynomial time algorithm for simple networks. In this paper, we detail the first complete answer for the general case, solving a problem proposed in Jansson and Sung (2006) and Iersel et al. (2009). For any fixed k, it is possible to construct a level-k network having the minimum number of hybrid vertices and consistent with [Formula: see text], if one exists, in time [Formula: see text].
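
The density condition can be made concrete with a small check. Under the usual definition, a triplet set is dense when every three species are covered by at least one triplet over exactly those species. The sketch below assumes a triplet ab|c is encoded as the tuple (a, b, c); that encoding, the function name and the sample data are hypothetical, and the code does not construct any network.

```python
from itertools import combinations


def is_dense(species, triplets):
    """Return True if every 3-subset of species is the leaf set of some triplet."""
    covered = {frozenset(t) for t in triplets}
    return all(frozenset(trio) in covered for trio in combinations(species, 3))


species = ["a", "b", "c", "d"]
# Each tuple (x, y, z) stands for the rooted triplet xy|z.
triplets = [("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d"), ("b", "c", "d")]
print(is_dense(species, triplets))       # True
print(is_dense(species, triplets[:-1]))  # False
```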

Learning Importance of Preferences

10.29007/v68w ◽

2018 ◽

Author(s):

Ying Zhu ◽

Mirek Truszczynski

Keyword(s):

Polynomial Time ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Scoring Rules ◽

Np Hard ◽

Individual Preferences

We study the problem of learning the importance of preferences in preference profiles in two important cases: when individual preferences are aggregated by the ranked Pareto rule, and when they are aggregated by positional scoring rules. For the ranked Pareto rule, we provide a polynomial-time algorithm that finds a ranking of preferences such that the ranked profile correctly decides all the examples, whenever such a ranking exists. We also show that the problem of learning a ranking that maximizes the number of correctly decided examples (also under the ranked Pareto rule) is NP-hard. We obtain similar results for the case of weighted profiles when positional scoring rules are used for aggregation.
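
As a generic illustration of aggregation by a positional scoring rule over a weighted profile, the sketch below scores each outcome by its position in every preference and multiplies by that preference's weight. It shows the aggregation step only, not the learning algorithm of the paper; the Borda score vector, the `importance` weights and all names are assumptions made for illustration.

```python
from collections import defaultdict


def positional_scores(profile, score_vector, importance=None):
    """Aggregate rankings with a positional scoring rule.

    profile: list of rankings, each ordered from most to least preferred.
    score_vector: points awarded for positions 0, 1, 2, ...
    importance: optional per-ranking weights (hypothetical); defaults to 1 each.
    """
    if importance is None:
        importance = [1] * len(profile)
    scores = defaultdict(int)
    for ranking, weight in zip(profile, importance):
        for position, outcome in enumerate(ranking):
            scores[outcome] += weight * score_vector[position]
    return dict(scores)


profile = [("a", "b", "c"), ("b", "a", "c"), ("a", "c", "b")]
borda = (2, 1, 0)
print(positional_scores(profile, borda))             # {'a': 5, 'b': 3, 'c': 1}
print(positional_scores(profile, borda, [1, 3, 1]))  # {'a': 7, 'b': 7, 'c': 1}
```

The second call shows how changing the weights can change the outcome: raising the weight of the second preference turns a clear win for `a` into a tie with `b`.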
