KwARG: Parsimonious Reconstruction of Ancestral Recombination Graphs with Recurrent Mutation

Author(s):  
Anastasia Ignatieva ◽  
Rune B. Lyngsø ◽  
Paul A. Jenkins ◽  
Jotun Hein

Abstract: The reconstruction of possible histories given a sample of genetic data in the presence of recombination and recurrent mutation is a challenging problem, but can provide key insights into the evolution of a population. We present KwARG, which implements a parsimony-based greedy heuristic algorithm for finding plausible genealogical histories (ancestral recombination graphs) that are minimal or near-minimal in the number of posited recombination and mutation events. Given an input dataset of aligned sequences, KwARG outputs a list of candidate solutions, each comprising a list of mutation and recombination events that could have generated the dataset; the relative proportion of recombinations and recurrent mutations in a solution can be controlled by specifying a set of ‘cost’ parameters. We demonstrate that the algorithm performs well when compared against existing methods. The software is made available on GitHub.
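To make concrete why such datasets force recombination or recurrent mutation events, the sketch below applies the standard four-gamete test to binary sequences. It is not KwARG's algorithm or interface; the function name and the sample data are hypothetical. If each site mutated at most once and there were no recombination, no pair of sites could show all four combinations 00, 01, 10 and 11, so every pair flagged here needs at least one recombination or recurrent mutation to explain it.

```python
from itertools import combinations


def incompatible_site_pairs(sequences):
    """Return site pairs (i, j) that display all four gametes 00, 01, 10, 11.

    `sequences` is a list of equal-length strings over the alphabet {'0', '1'}.
    """
    n_sites = len(sequences[0])
    if any(len(seq) != n_sites or set(seq) - {"0", "1"} for seq in sequences):
        raise ValueError("expected equal-length binary strings")
    flagged = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site haplotypes seen in the sample.
        gametes = {(seq[i], seq[j]) for seq in sequences}
        if len(gametes) == 4:
            flagged.append((i, j))
    return flagged


# Hypothetical sample: sites 0-1 and sites 2-3 are pairwise compatible,
# but every pairing across the two blocks shows all four gametes.
sample = ["0000", "1100", "0011", "1111"]
print(incompatible_site_pairs(sample))  # [(0, 2), (0, 3), (1, 2), (1, 3)]
```

The test only says that some event is needed; deciding whether to explain a flagged pair with a recombination or with a recurrent mutation is the trade-off that the cost parameters described in the abstract control.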

2015 ◽  
Vol 2015 ◽  
pp. 1-22 ◽  
Author(s):  
Yu Lin ◽  
Zheyong Bian ◽  
Shujing Sun ◽  
Tianyi Xu

In recent years, logistics systems with multiple suppliers and plants in neighboring regions have been flourishing worldwide. However, high logistics costs remain a problem for such systems due to a lack of information sharing and cooperation. This paper proposes an extended mathematical model that minimizes transportation and pipeline inventory costs via the many-to-many milk-run routing mode. Because the problem is NP-hard, a two-stage heuristic algorithm is developed by comprehensively considering its characteristics. More specifically, an initial satisfactory solution is generated in the first stage through a greedy heuristic algorithm to minimize the total number of vehicle service nodes and the best insertion heuristic algorithm to determine each vehicle’s route. Then, a simulated annealing (SA) algorithm with limited search scope is used to improve the initial satisfactory solution. Thirty numerical examples are employed to test the proposed algorithms. The experimental results demonstrate the effectiveness of this algorithm. Further, the superiority of the many-to-many transportation mode over other modes is demonstrated via two case studies.
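The construct-then-improve pattern described above can be illustrated on a much simpler problem. The sketch below is an assumption-laden toy, not the paper's model: it builds one closed route over hypothetical coordinates with a cheapest-insertion rule, then refines it with simulated annealing using segment reversals. All function names, the cooling schedule and the parameter values are illustrative choices.

```python
import math
import random


def route_length(route, dist):
    """Length of the closed tour that visits `route` in order."""
    n = len(route)
    return sum(dist[route[k]][route[(k + 1) % n]] for k in range(n))


def cheapest_insertion(dist):
    """Stage 1: start at node 0 and repeatedly insert the node and
    position that increase the tour length the least."""
    route = [0]
    remaining = set(range(1, len(dist)))
    while remaining:
        best = None
        for node in remaining:
            for pos in range(len(route)):
                a, b = route[pos], route[(pos + 1) % len(route)]
                delta = dist[a][node] + dist[node][b] - dist[a][b]
                if best is None or delta < best[0]:
                    best = (delta, node, pos + 1)
        _, node, pos = best
        route.insert(pos, node)
        remaining.remove(node)
    return route


def simulated_annealing(route, dist, rng, temp=1.0, cooling=0.995, steps=2000):
    """Stage 2: improve the route by reversing random segments, sometimes
    accepting a worse route with probability exp(-increase / temp)."""
    current, current_len = list(route), route_length(route, dist)
    best, best_len = list(current), current_len
    for _ in range(steps):
        # Keep position 0 fixed so the start node never moves.
        i, j = sorted(rng.sample(range(1, len(current)), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        cand_len = route_length(candidate, dist)
        if cand_len <= current_len or rng.random() < math.exp((current_len - cand_len) / temp):
            current, current_len = candidate, cand_len
            if current_len < best_len:
                best, best_len = list(current), current_len
        temp *= cooling
    return best, best_len


# Hypothetical coordinates for one start node (index 0) and six stops.
points = [(0, 0), (2, 5), (6, 1), (5, 5), (1, 3), (7, 4), (3, 1)]
dist = [[math.dist(p, q) for q in points] for p in points]

initial = cheapest_insertion(dist)
improved, improved_len = simulated_annealing(initial, dist, random.Random(0))
print(route_length(initial, dist), improved_len)
```

Because the annealing stage keeps the best route seen so far, it can never return something longer than the constructed route; the construction stage mainly determines how much work is left for the improvement stage.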


2015 ◽  
Vol 52 (02) ◽  
pp. 519-537 ◽  
Author(s):  
Jere Koskela ◽  
Paul Jenkins ◽  
Dario Spanò

Full likelihood inference under Kingman's coalescent is a computationally challenging problem to which importance sampling (IS) and the product of approximate conditionals (PAC) methods have been applied successfully. Both methods can be expressed in terms of families of intractable conditional sampling distributions (CSDs), and rely on principled approximations for accurate inference. Recently, more general Λ- and Ξ-coalescents have been observed to provide better modelling fits to some genetic data sets. We derive families of approximate CSDs for finite sites Λ- and Ξ-coalescents, and use them to obtain ‘approximately optimal’ IS and PAC algorithms for Λ-coalescents, yielding substantial gains in efficiency over existing methods.
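As a reminder of why the choice of proposal distribution matters so much in importance sampling, the sketch below estimates a small Gaussian tail probability. It is a generic textbook illustration, not the CSD-based proposals derived in the paper; the target, the shifted proposal and the function name are all assumptions made for this example.

```python
import math
import random


def tail_probability_is(threshold, n_samples, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from the
    shifted proposal N(threshold, 1) and reweighting each draw."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Importance weight: target density divided by proposal density,
            # exp(-x^2 / 2) / exp(-(x - t)^2 / 2) = exp(-t * x + t^2 / 2).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


threshold = 4.0
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))
estimate = tail_probability_is(threshold, 50_000, random.Random(0))
print(exact, estimate)
```

Sampling directly from N(0, 1) would hit the event only a few times in 100,000 draws, whereas the shifted proposal places about half of its draws in the region of interest. A proposal that approximates the ideal conditional distribution well plays the same role in the coalescent setting.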


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Yu Lin ◽  
Tianyi Xu ◽  
Zheyong Bian

High frequency and small lot size are characteristics of milk runs and are often used to implement the just-in-time (JIT) strategy in logistical systems. The common frequency problem, which simultaneously involves planning of the route and frequency, has been extensively researched in milk run systems. In addition, vehicle type choice in the milk run system also has a significant influence on the operating cost. Therefore, in this paper, we simultaneously consider vehicle routing planning, frequency planning, and vehicle type choice in order to minimize the total cost of transportation, inventory, and dispatch. To this end, we develop a mathematical model to describe the common frequency problem with vehicle type choice. Since the problem is NP-hard, we develop a two-phase heuristic algorithm to solve the model. More specifically, an initial satisfactory solution is first generated through a greedy heuristic algorithm to maximize the ratio of the superior arc frequency to the inferior arc frequency. Following this, a tabu search (TS) with limited search scope is used to improve the initial satisfactory solution. Numerical examples with different sizes establish the efficacy of our model and our proposed algorithm.


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Ben Quinton ◽  
Neda Aboutorab

Future distributed data networks are expected to be assisted by user cooperation and coding schemes. Given the explosive increase in end-users’ demand for content downloads from servers, this paper considers the implementation of instantly decodable network coding (IDNC) in full-duplex device-to-device (D2D) cooperative fog data networks. In particular, this paper is concerned with designing efficient transmission schemes to offload traffic from the expensive backhaul of network servers by employing IDNC and user cooperation. The generalized framework, where users send requests for multiple packets and the transmissions are subject to erasure, is considered. The optimal problem formulation is presented using the stochastic shortest path (SSP) technique over the IDNC graph with induced subgraphs. However, as finding the optimal solution is NP-hard and therefore intractable, it is not suitable for real-time communications. The complexity of the problem is addressed by presenting a greedy heuristic algorithm used over the proposed graph model. The paper shows that by implementing IDNC in a full-duplex cooperative D2D network model, a significant reduction in the number of downloads required from the servers can be achieved, which will result in offloading of the backhaul servers and thus save valuable server resources. It is also shown that the performance of the proposed heuristic algorithm is very close to the optimal solution with much lower computational complexity.


2021 ◽  
Author(s):  
Elizabeth Hayman ◽  
Anastasia Ignatieva ◽  
Jotun Hein

Recombination is a powerful evolutionary process that shapes the genetic diversity observed in the populations of many species. Reconstructing genealogies in the presence of recombination from sequencing data is a very challenging problem, as this relies on mutations having occurred on the correct lineages in order to detect the recombination and resolve the placement of edges in the local trees. We investigate the probability of recovering the true topology of ancestral recombination graphs (ARGs) under the coalescent with recombination and gene conversion. We explore how sample size and mutation rate affect the inherent uncertainty in reconstructed ARGs; this sheds light on the theoretical limitations of ARG reconstruction methods. We illustrate our results using estimates of evolutionary rates for several biological organisms; in particular, we find that for parameter values that are realistic for SARS-CoV-2, the probability of reconstructing genealogies that are close to the truth is low.

