scholarly journals Computing Simplicial Depth by Using Importance Sampling Algorithm and Its Application

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Fanyu Meng ◽  
Wei Shao ◽  
Yuxia Su

Simplicial depth (SD) plays an important role in discriminant analysis, hypothesis testing, machine learning, and engineering computations. However, the computation of simplicial depth is hugely challenging because the exact algorithm is an NP problem with dimension d and sample size n as input arguments. The approximate algorithm for simplicial depth computation has extremely low efficiency, especially in high-dimensional cases. In this study, we design an importance sampling algorithm for the computation of simplicial depth. As an advanced Monte Carlo method, the proposed algorithm outperforms other approximate and exact algorithms in accuracy and efficiency, as shown by simulated and real data experiments. Furthermore, we illustrate the robustness of simplicial depth in regression analysis through a concrete physical data experiment.

2000 ◽  
Vol 13 ◽  
pp. 155-188 ◽  
Author(s):  
J. Cheng ◽  
M. J. Druzdzel

Stochastic sampling algorithms, while an attractive alternative to exact algorithms in very large Bayesian network models, have been observed to perform poorly in evidential reasoning with extremely unlikely evidence. To address this problem, we propose an adaptive importance sampling algorithm, AIS-BN, that shows promising convergence rates even under extreme conditions and seems to outperform the existing sampling algorithms consistently. Three sources of this performance improvement are (1) two heuristics for initialization of the importance function that are based on the theoretical properties of importance sampling in finite-dimensional integrals and the structural advantages of Bayesian networks, (2) a smooth learning method for the importance function, and (3) a dynamic weighting function for combining samples from different stages of the algorithm. We tested the performance of the AIS-BN algorithm along with two state of the art general purpose sampling algorithms, likelihood weighting (Fung & Chang, 1989; Shachter & Peot, 1989) and self-importance sampling (Shachter & Peot, 1989). We used in our tests three large real Bayesian network models available to the scientific community: the CPCS network (Pradhan et al., 1994), the PathFinder network (Heckerman, Horvitz, & Nathwani, 1990), and the ANDES network (Conati, Gertner, VanLehn, & Druzdzel, 1997), with evidence as unlikely as 10^-41. While the AIS-BN algorithm always performed better than the other two algorithms, in the majority of the test cases it achieved orders of magnitude improvement in precision of the results. Improvement in speed given a desired precision is even more dramatic, although we are unable to report numerical results here, as the other algorithms almost never achieved the precision reached even by the first few iterations of the AIS-BN algorithm.


2015 ◽  
Vol 6 (1) ◽  
pp. 35-46 ◽  
Author(s):  
Yong Wang

Traveling salesman problem (TSP) is a classic combinatorial optimization problem. The time complexity of the exact algorithms is generally an exponential function of the scale of TSP. This work gives an approximate algorithm with a four-vertex-three-line inequality for the triangle TSP. The time complexity is O(n2) and it can generate an approximation less than 2 times of the optimal solution. The paper designs a simple algorithm with the inequality. The algorithm is compared with the double-nearest neighbor algorithm. The experimental results illustrate the algorithm find the better approximations than the double-nearest neighbor algorithm for most TSP instances.


2019 ◽  
Vol 35 (14) ◽  
pp. i408-i416 ◽  
Author(s):  
Nuraini Aguse ◽  
Yuanyuan Qi ◽  
Mohammed El-Kebir

Abstract Motivation Cancer phylogenies are key to studying tumorigenesis and have clinical implications. Due to the heterogeneous nature of cancer and limitations in current sequencing technology, current cancer phylogeny inference methods identify a large solution space of plausible phylogenies. To facilitate further downstream analyses, methods that accurately summarize such a set T of cancer phylogenies are imperative. However, current summary methods are limited to a single consensus tree or graph and may miss important topological features that are present in different subsets of candidate trees. Results We introduce the Multiple Consensus Tree (MCT) problem to simultaneously cluster T and infer a consensus tree for each cluster. We show that MCT is NP-hard, and present an exact algorithm based on mixed integer linear programming (MILP). In addition, we introduce a heuristic algorithm that efficiently identifies high-quality consensus trees, recovering all optimal solutions identified by the MILP in simulated data at a fraction of the time. We demonstrate the applicability of our methods on both simulated and real data, showing that our approach selects the number of clusters depending on the complexity of the solution space T. Availability and implementation https://github.com/elkebir-group/MCT. Supplementary information Supplementary data are available at Bioinformatics online.


Sign in / Sign up

Export Citation Format

Share Document