Model checking software for phylogenetic trees using distribution and database methods

2013 ◽  
Vol 10 (3) ◽  
pp. 16-30 ◽  
Author(s):  
José Ignacio Requeno ◽  
José Manuel Colom

Summary: Model checking, a generic and formal paradigm stemming from computer science based on temporal logics, has been proposed for the study of biological properties that emerge from the labeling of the states defined over the phylogenetic tree. This strategy allows us to use generic software tools already present in the industry. However, the performance of traditional model checking is penalized when scaling the system for large phylogenies. To this end, two strategies are presented here. The first one consists of partitioning the phylogenetic tree into a set of subgraphs, each representing a subproblem to be verified, so as to speed up the computation time and distribute the memory consumption. The second strategy is based on uncoupling the information associated with each state of the phylogenetic tree (mainly, the DNA sequence) and exporting it to an external tool for the management of large information systems. The integration of all these approaches outperforms the results of monolithic model checking and helps us to execute the verification of properties in a real phylogenetic tree.

2021 ◽  
Vol 12 (3) ◽  
pp. 39-52
Author(s):  
Martha Ximena Torres Delgado

Phylogenetics determines the evolutionary relationships between groups of species through a phylogenetic tree. PhyML is among the main programs for the reconstruction of phylogenetic trees. Bootstrap is a statistical method used to measure the confidence of a given data set, which is usually applied in the analysis of inferred phylogenetic trees. In PhyML this method has two MPI parallel implementations: one with point-to-point operations and one with collective operations. The second version is more efficient than the first; however, it limits the number of bootstrap replicates that can be used because of the increase in memory consumption. In order to solve this problem, three proposals were developed. The objectives of this work were to carry out the validation of these versions together with performance tests. The validation showed that the proposed solutions present results equivalent to the point-to-point version. In the performance simulations, two solutions were shown to be superior to the point-to-point version, with the best one achieving gains of 28.46% and 39.64% for 32 and 64 processes, respectively. Therefore, the enhancements allow alternatives to the point-to-point version without limiting memory.
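As a rough illustration of the bootstrap step mentioned above, the following Python sketch resamples alignment columns with replacement to build one replicate. It is not PhyML's implementation and does not use MPI; the function name `bootstrap_replicate` and the toy alignment are assumptions made for this example only.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Build one bootstrap replicate of a sequence alignment.

    alignment: list of equal-length strings, one per taxon (hypothetical input).
    Columns (sites) are sampled with replacement; the same sampled columns
    are used for every sequence so that sites stay aligned.
    """
    n_sites = len(alignment[0])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in cols) for seq in alignment]

# Toy alignment: the second row is the lowercase copy of the first,
# which makes it easy to see that columns are resampled together.
toy_alignment = ["ACGTAC", "acgtac"]
replicate = bootstrap_replicate(toy_alignment, random.Random(0))
```

Each replicate is independent of the others, which is why the replicates can be distributed across processes; how they are distributed and collected is exactly where the point-to-point and collective versions described above differ.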


2021 ◽  
Vol 82 (1-2) ◽  
Author(s):  
Lena Collienne ◽  
Alex Gavryushkin

Many popular algorithms for searching the space of leaf-labelled (phylogenetic) trees are based on tree rearrangement operations. Under any such operation, the problem is reduced to searching a graph where vertices are trees and (undirected) edges are given by pairs of trees connected by one rearrangement operation (sometimes called a move). Most popular are the classical nearest neighbour interchange, subtree prune and regraft, and tree bisection and reconnection moves. The problem of computing distances, however, is $\mathbf{NP}$-hard in each of these graphs, making tree inference and comparison algorithms challenging to design in practice. Although ranked phylogenetic trees are one of the central objects of interest in applications such as cancer research, immunology, and epidemiology, the computational complexity of the shortest path problem for these trees remained unsolved for decades. In this paper, we settle this problem for the ranked nearest neighbour interchange operation by establishing that the complexity depends on the weight difference between the two types of tree rearrangements (rank moves and edge moves), and varies from quadratic, which is the lowest possible complexity for this problem, to $\mathbf{NP}$-hard, which is the highest. In particular, our result provides the first example of a phylogenetic tree rearrangement operation for which shortest paths, and hence the distance, can be computed efficiently. Specifically, our algorithm scales to trees with tens of thousands of leaves (and likely hundreds of thousands if implemented efficiently).


1980 ◽  
Vol 187 (1) ◽  
pp. 65-74 ◽  
Author(s):  
D Penny ◽  
M D Hendy ◽  
L R Foulds

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally, we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.


Jurnal INKOM ◽  
2014 ◽  
Vol 8 (1) ◽  
pp. 29 ◽  
Author(s):  
Arnida Lailatul Latifah ◽  
Adi Nurhadiyatna

This paper proposes parallel algorithms for the precipitation input of flood modelling, applied in particular to spatial rainfall distribution. As an important input in flood modelling, the spatial distribution of rainfall is always needed as a pre-conditioned model. In this paper two interpolation methods, inverse distance weighting (IDW) and ordinary kriging (OK), are discussed. Both are developed as parallel algorithms in order to reduce the computational time. To measure the computational efficiency, the performance of the parallel algorithms is compared to that of the serial algorithms for both methods. Findings indicate that: (1) the computation time of the OK algorithm is up to 23% longer than that of IDW; (2) the computation time of the OK and IDW algorithms increases linearly with the number of cells/points; (3) the computation time of the parallel algorithms for both methods decays exponentially with the number of processors, with a decay factor of 0.52 for IDW and 0.53 for OK; (4) the parallel algorithms perform at near-ideal speed-up.
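To make the IDW method named above concrete, here is a minimal serial Python sketch of inverse distance weighting for a single query point. It is an illustration under assumptions, not the authors' code: the function name `idw`, the power parameter value of 2, and the two made-up gauge readings are all hypothetical.

```python
import math

def idw(points, query, power=2.0):
    """Inverse distance weighting estimate at one query location.

    points: list of (x, y, value) observations (hypothetical rain gauges).
    query:  (x, y) location of the grid cell to estimate.
    power:  distance exponent; 2.0 is a common choice, assumed here.
    """
    num = 0.0
    den = 0.0
    for x, y, value in points:
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return value  # query coincides with an observation
        w = 1.0 / d ** power  # closer observations get larger weights
        num += w * value
        den += w
    return num / den

# Two made-up gauges; the query point is equidistant from both,
# so the estimate is the plain average of the two values.
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
estimate = idw(gauges, (1.0, 0.0))
```

Because the estimate for one cell does not depend on the estimate for any other cell, the grid can be split across processors, which is consistent with the near-ideal speed-up reported above.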


Quantum ◽  
2021 ◽  
Vol 5 ◽  
pp. 410
Author(s):  
Johnnie Gray ◽  
Stefanos Kourtis

Tensor networks represent the state-of-the-art in computational methods across many disciplines, including the classical simulation of quantum many-body systems and quantum circuits. Several applications of current interest give rise to tensor networks with irregular geometries. Finding the best possible contraction path for such networks is a central problem, with an exponential effect on computation time and memory footprint. In this work, we implement new randomized protocols that find very high quality contraction paths for arbitrary and large tensor networks. We test our methods on a variety of benchmarks, including the random quantum circuit instances recently implemented on Google quantum chips. We find that the paths obtained can be very close to optimal, and often many orders of magnitude better than the most established approaches. As different underlying geometries suit different methods, we also introduce a hyper-optimization approach, where both the method applied and its algorithmic parameters are tuned during the path finding. The increase in quality of contraction schemes found has significant practical implications for the simulation of quantum many-body systems and particularly for the benchmarking of new quantum chips. Concretely, we estimate a speed-up of over 10,000× compared to the original expectation for the classical simulation of the Sycamore `supremacy' circuits.
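The effect of contraction order on cost can be seen in the simplest possible network, a chain of matrices. The Python sketch below only counts scalar multiplications for two fixed orders; it is not the randomized or hyper-optimized path search described above, and the helper names and the dimensions are assumptions chosen for illustration.

```python
def matmul_cost(m, k, n):
    """Scalar multiplications for an (m x k) by (k x n) matrix product."""
    return m * k * n

def chain_cost_left_to_right(dims):
    """Cost of ((M1 M2) M3) ... where Mi has shape dims[i-1] x dims[i]."""
    cost = 0
    rows = dims[0]  # the running product always keeps dims[0] rows
    for i in range(1, len(dims) - 1):
        cost += matmul_cost(rows, dims[i], dims[i + 1])
    return cost

def chain_cost_right_to_left(dims):
    """Cost of M1 (M2 (... Mn)) for the same chain."""
    cost = 0
    cols = dims[-1]  # the running product always keeps dims[-1] columns
    for i in range(len(dims) - 2, 0, -1):
        cost += matmul_cost(dims[i - 1], dims[i], cols)
    return cost

# A (1000 x 1000), B (1000 x 1000), v (1000 x 1): hypothetical sizes.
dims = [1000, 1000, 1000, 1]
left = chain_cost_left_to_right(dims)    # (A B) v
right = chain_cost_right_to_left(dims)   # A (B v)
```

For this three-factor chain the two orders already differ by a factor of about 500 in multiplication count; for large networks with irregular geometry the gap between good and bad contraction paths grows much faster, which is why path finding matters.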


Author(s):  
Mochammad Rajasa Mukti Negara ◽  
Ita Krissanti ◽  
Gita Widya Pradini

BACKGROUND Nucleocapsid (N) protein is one of four structural proteins of SARS-CoV-2, which is known to be more conserved than spike protein and is highly immunogenic. This study aimed to analyze the variation of the SARS-CoV-2 N protein sequences in ASEAN countries, including Indonesia. METHODS Complete sequences of SARS-CoV-2 N protein from each ASEAN country were obtained from Global Initiative on Sharing All Influenza Data (GISAID), while the reference sequence was obtained from GenBank. All sequences collected from December 2019 to March 2021 were grouped into clades according to GISAID, and two representative isolates were chosen from each clade for the analysis. The sequences were aligned by MUSCLE, and phylogenetic trees were built using MEGA-X software based on the nucleotide and translated AA sequences. RESULTS 98 isolates of complete N protein genes from ASEAN countries were analyzed. The nucleotides of all isolates were 97.5% conserved. Of 31 nucleotide changes, 22 led to amino acid (AA) substitutions; thus, the AA sequences were 94.5% conserved. The phylogenetic trees of nucleotide and AA sequences show similar branches. Nucleotide variations in clade O (C28311T); clade GR (28881–28883 GGG>AAC); and clade GRY (28881–28883 GGG>AAC and C28977T) lead to specific branches corresponding to the clade within both trees. CONCLUSIONS The N protein sequences of SARS-CoV-2 across ASEAN countries are highly conserved. Most isolates were closely related to the reference sequence originating from China, except the isolates representing clades O, GR, and GRY, which formed specific branches in the phylogenetic tree.


Phytotaxa ◽  
2016 ◽  
Vol 253 (3) ◽  
pp. 179 ◽  
Author(s):  
DAN ZHU ◽  
ZONG-LONG LUO ◽  
DARBHE JAYARAMA BAHT ◽  
ERIC.H.C. MCKENZIE ◽  
ALI H. BAHKALI ◽  
...  

Helminthosporium species from submerged wood in streams in Yunnan Province, China were studied based on morphology and DNA sequence data. Descriptions and illustrations of Helminthosporium velutinum and a new species H. aquaticum are provided. A combined phylogenetic tree, based on SSU, ITS and LSU sequence data, places the species in Massarinaceae, Pleosporales. The polyphyletic nature of Helminthosporium species within Massarinaceae is shown based on ITS sequence data available in GenBank.


Author(s):  
Gianpiero Cabodi ◽  
Paolo Camurati ◽  
Marco Palena ◽  
Paolo Pasini ◽  
Danilo Vendraminetto

Author(s):  
Ning Yang ◽  
Shiaaulir Wang ◽  
Paul Schonfeld

A Parallel Genetic Algorithm (PGA) is used for a simulation-based optimization of waterway project schedules. This PGA is designed to distribute a Genetic Algorithm application over multiple processors in order to speed up the solution search procedure for a very large combinational problem. The proposed PGA is based on a global parallel model, which is also called a master-slave model. A Message-Passing Interface (MPI) is used in developing the parallel computing program. A case study is presented, whose results show how the adaptation of a simulation-based optimization algorithm to parallel computing can greatly reduce computation time. Additional techniques that are found to further improve the PGA performance include: (1) choosing an appropriate task distribution method, (2) distributing simulation replications instead of different solutions, (3) avoiding the simulation of duplicate solutions, (4) avoiding running multiple simulations simultaneously in shared-memory processors, and (5) avoiding using multiple processors that belong to different clusters (physical sub-networks).

