sampling algorithms
Recently Published Documents


Total documents: 252 (five years: 76)
H-index: 26 (five years: 3)

2021
Author(s): Kazuhiro Yamaguchi, Jihong Zhang

This study proposed efficient Gibbs sampling algorithms for variable selection in a latent regression model under a unidimensional two-parameter logistic item response theory model. Three types of shrinkage priors were employed to obtain shrinkage estimates: double-exponential (i.e., Laplace), horseshoe, and horseshoe+ priors. These shrinkage priors were compared to a uniform prior case in both simulation and real data analysis. The simulation study revealed that the two types of horseshoe priors had smaller root mean square errors and shorter 95% credible interval lengths than the double-exponential or uniform priors. In addition, the horseshoe+ prior was slightly more stable than the horseshoe prior. The real data example demonstrated the utility of the horseshoe and horseshoe+ priors in selecting effective predictive covariates for math achievement. In the final section, we discuss the benefits and limitations of the three types of Bayesian variable selection methods.
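For orientation, the three shrinkage priors named above are commonly written as follows for a regression coefficient. This is a sketch using textbook forms; the symbols (beta_j for a coefficient, lambda, tau and eta_j for scale parameters) are assumptions and the study's exact parameterization may differ. C+(0, s) denotes a half-Cauchy distribution with scale s.

```latex
% Textbook forms only; the symbols are illustrative and not taken from the study.
\begin{align*}
\text{Laplace:} \quad & p(\beta_j \mid \lambda) = \frac{\lambda}{2}\exp\bigl(-\lambda\lvert\beta_j\rvert\bigr) \\
\text{Horseshoe:} \quad & \beta_j \mid \lambda_j, \tau \sim \mathcal{N}\bigl(0, \lambda_j^2\tau^2\bigr), \quad \lambda_j \sim \mathrm{C}^{+}(0, 1) \\
\text{Horseshoe+:} \quad & \beta_j \mid \lambda_j, \tau \sim \mathcal{N}\bigl(0, \lambda_j^2\tau^2\bigr), \quad \lambda_j \mid \eta_j \sim \mathrm{C}^{+}(0, \eta_j), \quad \eta_j \sim \mathrm{C}^{+}(0, 1)
\end{align*}
```

The horseshoe+ form adds one more half-Cauchy layer to the local scale, which is what distinguishes it from the horseshoe in this notation.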


Author(s): Xiao-Kan Guo

In this paper, we study the construction of classical geometry from the quantum entanglement structure by using information geometry. In the information geometry of classical spacetime, the Fisher information metric is related to a blurred metric of a classical physical space. We first show that a local information metric can be obtained from the entanglement contour in a local subregion. This local information metric measures the fine structure of entanglement spectra inside the subregion, which suggests a quantum origin of the information-geometric blurred space. We study both the continuous and the classical limits of the quantum-originated blurred space by using techniques from statistical sampling algorithms, the sampling theory of spacetime and the projective limit. A scheme for going from a blurred space with quantum features to a classical geometry is also explored.


PeerJ, 2021, Vol 9, pp. e12438
Author(s): Sebastian Höhna, Michael J. Landis, John P. Huelsenbeck

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling and stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Although our primary focus in this study is on molecular substitution models, our parallelization strategy can easily be applied to any statistical model. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction in the runtime of marginal likelihood estimations. Hence, our parallelization strategy allows marginal likelihood estimations that previously needed days, weeks or even months to complete in a feasible amount of time. The methods described here are implemented in our open-source software RevBayes, which is available from http://www.RevBayes.com.
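A minimal sketch of why stepping-stone estimation parallelizes well: each "stone" (pair of adjacent powers) contributes an independent term, so the terms can be computed concurrently and summed. This is not the RevBayes implementation. The function names, the thread pool, and the assumption that log-likelihood draws from each power posterior are already available are all illustrative choices.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stone_log_ratio(task):
    """Log ratio of normalizing constants between two adjacent powers.

    `logliks` are log-likelihoods of draws from the power posterior at
    `beta_lo` (assumed to come from an MCMC run not shown here).
    """
    beta_lo, beta_hi, logliks = task
    return log_mean_exp([(beta_hi - beta_lo) * ll for ll in logliks])


def stepping_stone(betas, logliks_per_beta, workers=2):
    """Sum the per-stone terms; `betas` ascend from 0.0 to 1.0."""
    tasks = [
        (betas[k], betas[k + 1], logliks_per_beta[k])
        for k in range(len(betas) - 1)
    ]
    # Threads only illustrate the independence of the stones; a real
    # CPU-bound run would use separate processes or MPI ranks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(stone_log_ratio, tasks))
```

In this sketch only the final combination step is shown; in the study the expensive part that gets distributed is the power posterior MCMC itself.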


Author(s): Rakhi Mishra, Prem Shankar Mishra, Rupa Mazumder, Avijit Mazumder, Anurag Chaudhary

Computational and experimental techniques are two complementary approaches that play important roles in drug discovery and development. The time and cost of bringing a new drug to market are a concern: figures of seven to twelve years and $1.2 billion are often cited. Furthermore, only five out of forty thousand compounds tested in animals reach human testing, and only one of five compounds reaching clinical studies is approved. This represents a large investment of time, money, and human and other resources. Therefore, new approaches are needed to facilitate, expedite and streamline drug discovery and development and to save time, money and resources. Among the many computational tools, molecular docking is one of the important means that can be used in drug discovery. It provides information on the binding affinities between small molecules (ligands) and macromolecular receptor targets (proteins). Various approaches and methodologies are described in the literature for assessing the cost, time and success of drug discovery. This review introduces the available molecular docking methods, presents a simple docking methodology and examples of drug design and discovery through computational docking, and emphasizes examples of sampling algorithms and scoring functions with their relevant characteristics, along with a summary of the types of ligand binding with receptors.


2021, Vol 23
Author(s): Caijun Qin

This paper proposes a novel, exploration-based network sampling algorithm called caterpillar quota walk sampling (CQWS), inspired by the caterpillar tree. Network sampling identifies a subset of nodes and edges from a network, creating an induced graph. Beginning from an initial node, exploration-based sampling algorithms grow the induced set by traversing and tracking unvisited neighboring nodes from the original network. Tunable and trainable parameters allow CQWS to maximize the sum of the degrees of the induced graph from multiple trials when sampling dense networks. A network spread model is useful in various applications, including tracking the spread of epidemics, visualizing information transmission through social media, and modeling the cell-to-cell spread of neurodegenerative diseases. CQWS generates a spread model as its sample by visiting the highest-degree neighbors of previously visited nodes. For each previously visited node, a top proportion of the highest-degree neighbors fulfills a quota and branches into a new caterpillar tree. Sampling more high-degree nodes is an objective in various applications. Many exploration-based sampling algorithms suffer drawbacks that limit the sum of degrees of visited nodes and thus the number of high-degree nodes visited. Furthermore, a strategy may not be adaptable to volatile degree frequencies throughout the original network architecture, which influences how deep into the original network an algorithm can sample. This paper analyzes CQWS in comparison to four other exploration-based network sampling algorithms in tackling these two problems by sampling sparse and dense randomly generated networks.
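The idea of following only a top proportion of high-degree neighbors can be sketched as below. This is a simplified illustration, not the published CQWS algorithm: the breadth-first frontier, the `proportion` and `max_nodes` parameters, and the tie-breaking rule are all assumptions made for the example.

```python
import math
from collections import deque


def quota_walk_sample(adj, start, proportion=0.5, max_nodes=10):
    """Grow a node sample from `start`, following only the top `proportion`
    of each visited node's unvisited neighbors, ranked by degree.

    `adj` maps each node to a list of its neighbors (undirected graph).
    """
    visited = [start]
    seen = {start}
    frontier = deque([start])
    while frontier and len(visited) < max_nodes:
        node = frontier.popleft()
        # Unvisited neighbors, highest degree first (ties broken by label).
        candidates = sorted(
            (n for n in adj[node] if n not in seen),
            key=lambda n: (-len(adj[n]), n),
        )
        # Hypothetical quota rule: a fixed fraction of the candidates.
        quota = math.ceil(proportion * len(candidates))
        for n in candidates[:quota]:
            if len(visited) >= max_nodes:
                break
            seen.add(n)
            visited.append(n)
            frontier.append(n)
    return visited


def induced_edges(adj, nodes):
    """Edges of the original graph with both endpoints in `nodes`."""
    keep = set(nodes)
    return {(u, v) for u in keep for v in adj[u] if v in keep and u < v}
```

A usage example would pass a small adjacency dictionary such as `{0: [1, 2], 1: [0, 2], 2: [0, 1]}` and compare the degree sum of the induced graph across different `proportion` values.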


2021, Vol 10 (5), pp. 2789-2795
Author(s): Seyyed Mohammad Javadi Moghaddam, Asadollah Noroozi

Data classification performance suffers when the data distribution is imbalanced, because classifiers tend toward the majority class, which has most of the instances. One popular approach is to balance the dataset using over- and under-sampling methods. This paper presents a novel pre-processing technique that applies both over- and under-sampling algorithms to an imbalanced dataset. The proposed method uses the SMOTE algorithm to increase the minority class. Moreover, a cluster-based approach is used to decrease the majority class, taking into consideration the new size of the minority class. Experimental results on 10 imbalanced datasets show that the suggested algorithm performs better than previous approaches.
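The combination of over- and under-sampling can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's method: the over-sampling step is a bare-bones SMOTE-style interpolation, and plain random under-sampling stands in for the cluster-based step. All function and parameter names are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, rng=None):
    """Create `n_new` synthetic minority points by interpolating between a
    randomly chosen minority point and one of its k nearest minority neighbors.
    Requires at least two minority points."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def rebalance(minority, majority, n_new, rng=None):
    """Over-sample the minority class, then shrink the majority class to the
    new minority size."""
    rng = rng or random.Random(0)
    new_minority = list(minority) + smote_like(minority, n_new, rng=rng)
    # Assumption: random under-sampling replaces the cluster-based reduction.
    target = min(len(majority), len(new_minority))
    new_majority = rng.sample(list(majority), target)
    return new_minority, new_majority
```

The point the sketch preserves is the ordering: the majority class is reduced only after the minority class has grown, so its target size depends on the new minority size.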


PeerJ, 2021, Vol 9, pp. e11927
Author(s): Aditya A. Shastri, Kapil Ahuja, Milind B. Ratnaparkhe, Yann Busnel

Phenotypic characteristics of a plant species refer to its physical properties as cataloged by plant biologists at different research centers around the world. Clustering species based upon their phenotypic characteristics is used to obtain diverse sets of parents that are useful in breeding programs. The Hierarchical Clustering (HC) algorithm is the current standard for clustering phenotypic data. This algorithm suffers from low accuracy and high computational complexity. To address the accuracy challenge, we propose the use of the Spectral Clustering (SC) algorithm. To make the algorithm computationally cheap, we propose using sampling, specifically Pivotal Sampling, which is probability based. Since the application of sampling to phenotypic data has not been explored much, another sampling technique called Vector Quantization (VQ) is also adapted for this data to allow an effective comparison. VQ has recently generated promising results for genotypic data. The novelty of our SC with Pivotal Sampling algorithm is in constructing the crucial similarity matrix for the clustering algorithm and defining probabilities for the sampling technique. Although our algorithm can be applied to any plant species, we tested it on the phenotypic data obtained from about 2,400 Soybean species. SC with Pivotal Sampling achieves substantially higher accuracy (in terms of Silhouette Values) than all the other proposed competitive clustering with sampling algorithms (i.e., SC with VQ, HC with Pivotal Sampling, and HC with VQ). The complexities of our SC with Pivotal Sampling algorithm and these three variants are almost the same because of the sampling involved. In addition, SC with Pivotal Sampling outperforms the standard HC algorithm in both accuracy and computational complexity. We show experimentally that our algorithm is up to 45% more accurate than HC in terms of clustering accuracy. The computational complexity of our algorithm is more than an order of magnitude less than that of HC.
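A minimal sketch of the sequential pivotal method, assuming the inclusion probabilities are already given and sum to an integer; how the paper defines those probabilities is not shown here. The function name, the `eps` tolerance, and the default seed are hypothetical.

```python
import random


def pivotal_sample(probs, rng=None, eps=1e-12):
    """Sequential pivotal sampling.

    Units "duel" in pairs: after each duel one unit's probability becomes
    0 or 1, while each unit's expected inclusion probability is preserved.
    Returns the indices of the selected units.
    """
    rng = rng or random.Random(0)
    p = list(probs)

    def decided(v):
        return v <= eps or v >= 1.0 - eps

    i = 0  # index of the unit currently in play
    for j in range(1, len(p)):
        if decided(p[i]):
            i = j
            continue
        if decided(p[j]):
            continue
        a, b = p[i], p[j]
        s = a + b
        if s < 1.0:
            # One unit is rejected; the other inherits the combined mass.
            if rng.random() < b / s:
                p[i], p[j] = 0.0, s
            else:
                p[i], p[j] = s, 0.0
        else:
            # One unit is selected; the other keeps the leftover mass.
            if rng.random() < (1.0 - b) / (2.0 - s):
                p[i], p[j] = 1.0, s - 1.0
            else:
                p[i], p[j] = s - 1.0, 1.0
        if decided(p[i]):
            i = j
    return [k for k, v in enumerate(p) if v > 0.5]
```

When the probabilities sum to an integer n, the method returns exactly n units, which is why it suits reducing a dataset to a fixed size before clustering.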


2021, Vol 127 (10)
Author(s): Dominik S. Wild, Dries Sels, Hannes Pichler, Cristian Zanoci, Mikhail D. Lukin

2021, Vol 104 (3)
Author(s): Dominik S. Wild, Dries Sels, Hannes Pichler, Cristian Zanoci, Mikhail D. Lukin

2021, Vol 31 (5)
Author(s): Jonas Latz, Juan P. Madrigal-Cianci, Fabio Nobile, Raúl Tempone

In the current work we present two generalizations of the Parallel Tempering algorithm in the context of discrete-time Markov chain Monte Carlo methods for Bayesian inverse problems. These generalizations use state-dependent swapping rates, inspired by the so-called continuous time Infinite Swapping algorithm presented in Plattner et al. (J Chem Phys 135(13):134111, 2011). We analyze the reversibility and ergodicity properties of our generalized PT algorithms. Numerical results on sampling from different target distributions show that the proposed methods significantly improve sampling efficiency over more traditional sampling algorithms such as Random Walk Metropolis, preconditioned Crank–Nicolson, and (standard) Parallel Tempering.
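For reference, the standard Parallel Tempering baseline mentioned above can be sketched as follows. This is the textbook scheme with random adjacent swaps, not the generalized state-dependent variants proposed in the paper. The bimodal target, step size, and all names are assumptions chosen for the example.

```python
import math
import random


def log_target(x):
    """Hypothetical unnormalized target: equal mixture of N(-3, 0.5^2)
    and N(3, 0.5^2)."""
    a = -0.5 * ((x + 3.0) / 0.5) ** 2
    b = -0.5 * ((x - 3.0) / 0.5) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(lp_i, lp_j, beta_i, beta_j):
    """Log Metropolis ratio for swapping the states of chains i and j,
    where lp_i and lp_j are the log target values of their current states."""
    return (beta_i - beta_j) * (lp_j - lp_i)


def parallel_tempering(n_iter, betas, step=1.0, seed=0):
    """Return samples from the cold chain; betas[0] is assumed to be 1.0
    and at least two inverse temperatures are required."""
    rng = random.Random(seed)
    states = [0.0] * len(betas)
    cold_samples = []
    for _ in range(n_iter):
        # Random walk Metropolis within each tempered chain.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            log_acc = beta * (log_target(proposal) - log_target(states[k]))
            if rng.random() < math.exp(min(0.0, log_acc)):
                states[k] = proposal
        # Propose swapping one randomly chosen adjacent pair of chains.
        k = rng.randrange(len(betas) - 1)
        log_acc = swap_log_ratio(
            log_target(states[k]), log_target(states[k + 1]),
            betas[k], betas[k + 1],
        )
        if rng.random() < math.exp(min(0.0, log_acc)):
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples
```

In this baseline the swap proposal ignores the current states and only the acceptance step looks at them; the generalizations described in the abstract instead make the swapping rates themselves state dependent.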

