Maximum likelihood reconstruction of ancestral networks by integer linear programming

Author(s):  
Vaibhav Rajan ◽  
Ziqi Zhang ◽  
Carl Kingsford ◽  
Xiuwei Zhang

Abstract Motivation The study of the evolutionary history of biological networks enables deep functional understanding of various bio-molecular processes. Network growth models, such as the Duplication–Mutation with Complementarity (DMC) model, provide a principled approach to characterizing the evolution of protein–protein interactions (PPIs) based on duplication and divergence. Current methods for model-based ancestral network reconstruction primarily use greedy heuristics and yield sub-optimal solutions. Results We present a new Integer Linear Programming (ILP) formulation for maximum likelihood reconstruction of ancestral PPI networks using the DMC model. We prove the correctness of our formulation, which is designed to find the optimal solution. It can also use efficient heuristics from general-purpose ILP solvers to obtain multiple optimal and near-optimal solutions that may be useful in many applications. Experiments on synthetic data show that our ILP obtains solutions with higher likelihood than those from previous methods, and is robust to noise and model mismatch. We evaluate our algorithm on two real PPI networks, with proteins from the families of bZIP transcription factors and the Commander complex. On both networks, solutions from our ILP have higher likelihood and are in better agreement with independent biological evidence from other studies. Availability and implementation A Python implementation is available at https://bitbucket.org/cdal/network-reconstruction. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
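As a rough illustration of the growth model that the reconstruction inverts, the sketch below applies one forward DMC step as the model is commonly described: duplicate a node, let each shared neighbour lose at most one of its two edges, then optionally link the node to its copy. The function name `dmc_step`, the adjacency-dictionary representation, and the parameter names `q_mod` and `q_con` are assumptions made for this example; they are not taken from the authors' implementation.

```python
import random


def dmc_step(adj, q_mod, q_con, rng):
    """Apply one forward DMC growth step to an undirected graph, in place.

    adj maps each integer node label to the set of its neighbours
    (no self-loops). Returns (anchor, duplicate).
    """
    anchor = rng.choice(sorted(adj))
    duplicate = max(adj) + 1  # assumption: nodes are labelled with integers

    # Duplication: the copy starts with the same neighbours as the anchor.
    adj[duplicate] = set(adj[anchor])
    for w in adj[duplicate]:
        adj[w].add(duplicate)

    # Mutation with complementarity: for each shared neighbour, with
    # probability q_mod remove exactly one of the two edges, chosen at
    # random, so at least one of them always survives.
    for w in sorted(adj[anchor]):
        if rng.random() < q_mod:
            loser = anchor if rng.random() < 0.5 else duplicate
            adj[loser].discard(w)
            adj[w].discard(loser)

    # Connect the anchor and its copy with probability q_con.
    if rng.random() < q_con:
        adj[anchor].add(duplicate)
        adj[duplicate].add(anchor)
    return anchor, duplicate


if __name__ == "__main__":
    rng = random.Random(0)
    graph = {0: {1}, 1: {0}}  # seed network: a single interaction
    for _ in range(5):
        dmc_step(graph, q_mod=0.4, q_con=0.7, rng=rng)
    print(len(graph), "nodes after 5 steps")
```

Ancestral reconstruction runs this process in reverse: given the final network, it searches for the sequence of duplicated pairs that makes the observed network most likely under the model.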

2019 ◽  
Author(s):  
Vaibhav Rajan ◽  
Carl Kingsford ◽  
Xiuwei Zhang

Abstract Motivation The study of the evolutionary history of biological networks enables deep functional understanding of various bio-molecular processes. Network growth models, such as the Duplication-Mutation with Complementarity (DMC) model, provide a principled approach to characterizing the evolution of protein-protein interactions (PPI) based on duplication and divergence. Current methods for model-based ancestral network reconstruction primarily use greedy heuristics and yield sub-optimal solutions. Results We present a new Integer Linear Programming (ILP) solution for maximum likelihood reconstruction of ancestral PPI networks using the DMC model. We prove the correctness of our solution that is designed to find the optimal solution. It can also use efficient heuristics from general-purpose ILP solvers to obtain multiple optimal and near-optimal solutions that may be useful in many applications. Experiments on synthetic data show that our ILP obtains solutions with higher likelihood than those from previous methods, and is robust to noise and model mismatch. We evaluate our algorithm on two real PPI networks, with proteins from the families of bZIP transcription factors and the Commander complex. On both the networks, solutions from our ILP have higher likelihood and are in better agreement with independent biological evidence from other studies. Availability A Python implementation is available at https://bitbucket.org/cdal/[email protected]


2020 ◽  
Vol 61 (5) ◽  
pp. 1977-1999
Author(s):  
H. Fairclough ◽  
M. Gilbert

Abstract Traditional truss layout optimization employing the ground structure method will often generate layouts that are too complex to fabricate in practice. To address this, mixed integer linear programming can be used to enforce buildability constraints, leading to simplified truss forms. Limits on the number of joints in the structure and/or the minimum angle between connected members can be imposed, with the joints arising from crossover of pairs of members accounted for. However, in layout optimization, the number of constraints arising from ‘crossover joints’ increases rapidly with problem size, along with computational expense. To address this, crossover constraints are here dynamically generated and added at runtime only as required (so-called lazy constraints); speedups of more than 20 times are observed whilst ensuring that there is no loss of solution quality. Also, results from the layout optimization step are shown to provide a suitable starting point for a non-linear geometry optimization step, enabling results to be obtained that are in agreement with literature solutions. It is also shown that symmetric problems may not have symmetric optimal solutions, and that multiple distinct and equally optimal solutions may be found.
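The lazy-constraint idea mentioned in this abstract can be sketched on a toy problem: solve without the pairwise constraints, check which ones the solution violates, add only those, and repeat. Everything below is hypothetical and for illustration only. The brute-force `solve` function stands in for a real MILP solver, and the "conflict" pairs stand in for crossover constraints; none of it reflects the authors' truss formulation.

```python
from itertools import product


def solve(values, constraints):
    """Stand-in for an ILP solver: brute-force the best 0/1 vector.

    Maximises sum(values[i] * x[i]) subject to x[i] + x[j] <= 1 for every
    pair (i, j) in `constraints`.
    """
    best, best_val = None, float("-inf")
    for x in product((0, 1), repeat=len(values)):
        if all(x[i] + x[j] <= 1 for i, j in constraints):
            val = sum(v * xi for v, xi in zip(values, x))
            if val > best_val:
                best, best_val = x, val
    return best, best_val


def solve_lazily(values, conflicts):
    """Add pairwise constraints only when the current solution violates them."""
    active = set()
    while True:
        x, val = solve(values, active)
        violated = {(i, j) for i, j in conflicts if x[i] + x[j] > 1}
        if not violated:
            return x, val, active
        active |= violated


if __name__ == "__main__":
    values = [5, 4, -1, -2]
    conflicts = [(0, 1), (2, 3), (0, 2)]
    x, val, generated = solve_lazily(values, conflicts)
    print(x, val, f"{len(generated)} of {len(conflicts)} constraints generated")
```

The loop stops only when no constraint in the full set is violated, so the result is feasible for the full problem and has the same objective value as solving with every constraint present from the start. In this toy instance only one of the three constraints is ever generated.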


Author(s):  
Meet Barot ◽  
Vladimir Gligorijević ◽  
Kyunghyun Cho ◽  
Richard Bonneau

Abstract Motivation Transferring knowledge between species is challenging: different species contain distinct proteomes and cellular architectures, which cause their proteins to carry out different functions via different interaction networks. Many approaches to protein functional annotation use sequence similarity to transfer knowledge between species. These approaches cannot produce accurate predictions for proteins without homologues of known function, as many functions require cellular context for meaningful prediction. To supply this context, network-based methods use protein-protein interaction (PPI) networks as a source of information for inferring protein function and have demonstrated promising results in function prediction. However, most of these methods are tied to a network for a single species, and many species lack biological networks. Results In this work, we integrate sequence and network information across multiple species by computing IsoRank similarity scores to create a meta-network profile of the proteins of multiple species. We use this integrated multispecies meta-network as input to train a maxout neural network with Gene Ontology terms as target labels. Our multispecies approach takes advantage of more training examples, and consequently leads to significant improvements in function prediction performance compared to two network-based methods, a deep learning sequence-based method, and the BLAST annotation method used in the Critical Assessment of Functional Annotation. We demonstrate that our approach performs well even in cases where a species has no network information available: when an organism’s PPI network is left out, we can use our multispecies method to make predictions for the left-out organism with good performance. Availability The code is freely available at https://github.com/nowittynamesleft/NetQuilt Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Anbang Liu ◽  
Peter Luh ◽  
Bing Yan ◽  
Mikhail Bragin

Job-shop scheduling is an important but difficult problem arising in low-volume high-variety manufacturing. It is usually solved at the beginning of each shift with strict computational time requirements. To obtain near-optimal solutions with quantifiable quality within strict time limits, one direction is to formulate the problem in an Integer Linear Programming (ILP) form so as to take advantage of widely available ILP methods such as Branch-and-Cut (B&C). Nevertheless, computational requirements for ILP methods on existing ILP formulations are high. In this paper, a novel ILP formulation for minimizing total weighted tardiness is presented. The new formulation has far fewer decision variables and constraints, and is proven to be tighter than our previous formulation. For fast resolution of large problems, our recent decomposition-and-coordination method “Surrogate Absolute-Value Lagrangian Relaxation” (SAVLR) is enhanced by using a 3-segment piecewise linear penalty function, which approximates a quadratic penalty function more accurately than an absolute-value function does. Testing results demonstrate that our new formulation drastically reduces the computational requirements of B&C compared to our previous formulation. For large problems where B&C has difficulties, near-optimal solutions are efficiently obtained by using the enhanced SAVLR under the new formulation.
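To see why a 3-segment piecewise linear penalty can track a quadratic penalty more closely than an absolute-value penalty, the sketch below interpolates x² with chords on equal-width segments and measures the worst-case gap. The abstract does not give the breakpoints or coefficients used in SAVLR, so the equal-width chord construction, the function names, and the interval radius are all assumptions made for this example. With one segment per side the interpolant reduces to radius·|x|, an absolute-value function.

```python
def chord_penalty(x, radius, segments):
    """Piecewise-linear interpolation of x**2 in |x|.

    Uses `segments` equal-width pieces on [0, radius]; the last piece is
    extended linearly beyond radius.
    """
    t = abs(x)
    h = radius / segments
    k = min(int(t / h), segments - 1)
    lo, hi = k * h, (k + 1) * h
    # The chord through (lo, lo**2) and (hi, hi**2) has slope lo + hi.
    return lo * lo + (lo + hi) * (t - lo)


def max_gap(radius, segments, samples=3001):
    """Largest |interpolant - x**2| over an even grid on [0, radius]."""
    worst = 0.0
    for i in range(samples):
        x = radius * i / (samples - 1)
        worst = max(worst, abs(chord_penalty(x, radius, segments) - x * x))
    return worst


if __name__ == "__main__":
    radius = 3.0
    print("1 segment (absolute value):", max_gap(radius, 1))
    print("3 segments:", max_gap(radius, 3))
```

For chord interpolation of x² the worst gap on a segment of width h is h²/4, attained at the segment midpoint, so moving from one segment to three equal segments reduces the worst-case gap by a factor of nine on the same interval.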


10.29007/sghd ◽  
2018 ◽  
Author(s):  
James Cussens

Pedigrees are ‘family trees’ relating groups of individuals which can usefully be seen as Bayesian networks. The problem of finding a maximum likelihood pedigree from genotypic data is encoded as an integer linear programming problem. Two methods of ensuring that pedigrees are acyclic are considered. Results on obtaining maximum likelihood pedigrees relating 20, 46 and 59 individuals are presented. Running times for larger pedigrees depend strongly on the data used but generally compare well with those in the literature. Solving is particularly fast when allele frequency is uniform.


Author(s):  
Mengzhou Li ◽  
Sujoy Sikdar ◽  
Lirong Xia ◽  
Ge Wang

Cheating prevention in online exams is often hard and costly to tackle with proctoring, and it sometimes raises privacy issues, especially under social distancing during the COVID-19 pandemic. Here we propose a low-cost and privacy-preserving anti-cheating scheme that programmatically minimizes the cheating gain. The scheme theoretically ensures that the cheating gain of all students can be kept below a desired level, aided by prior knowledge of students’ abilities and a proper assignment of question sequences. Furthermore, a heuristic greedy algorithm we developed can refine an assignment of questions from a cyclic pool of question sequences to efficiently reduce the cheating gain. Compared to the integer linear programming and min-max matching methods in a small-scale simulation, our heuristic algorithm provides results close to the optimal solutions offered by the two standard discrete optimization methods. Hence, our anti-cheating approach could potentially be a cost-effective solution to the well-known cheating problem even without proctoring.


2012 ◽  
Vol 37 (1) ◽  
pp. 69-83 ◽  
Author(s):  
James Cussens ◽  
Mark Bartlett ◽  
Elinor M. Jones ◽  
Nuala A. Sheehan
