sampling algorithms Latest Research Papers

Fully Gibbs Sampling Algorithms for Bayesian Variable Selection in Latent Regression Models

10.31234/osf.io/dfrxj ◽

2021 ◽

Author(s):

Kazuhiro Yamaguchi ◽

Jihong Zhang

Keyword(s):

Variable Selection ◽

Gibbs Sampling ◽

Real Data ◽

Bayesian Variable Selection ◽

Theory Model ◽

Double Exponential ◽

Sampling Algorithms ◽

Shrinkage Priors ◽

Mean Square Errors ◽

Latent Regression

This study proposed efficient Gibbs sampling algorithms for variable selection in a latent regression model under a unidimensional two-parameter logistic item response theory model. Three types of shrinkage priors were employed to obtain shrinkage estimates: double-exponential (i.e., Laplace), horseshoe, and horseshoe+ priors. These shrinkage priors were compared to a uniform prior case in both simulation and real data analysis. The simulation study revealed that the two types of horseshoe priors had smaller root mean square errors and shorter 95% credible interval lengths than the double-exponential or uniform priors. In addition, the horseshoe+ prior was slightly more stable than the horseshoe prior. The real data example demonstrated the utility of the horseshoe and horseshoe+ priors in selecting effective predictive covariates for math achievement. In the final section, we discuss the benefits and limitations of the three types of Bayesian variable selection methods.

Space from entanglement: An information-geometric perspective

International Journal of Geometric Methods in Modern Physics ◽

10.1142/s0219887822500098 ◽

2021 ◽

Author(s):

Xiao-Kan Guo

Keyword(s):

Physical Space ◽

Information Geometry ◽

Local Information ◽

Projective Limit ◽

Statistical Sampling ◽

Classical Geometry ◽

Fisher Information Metric ◽

Sampling Algorithms ◽

Information Metric ◽

Entanglement Structure

In this paper, we study the construction of classical geometry from the quantum entanglement structure by using information geometry. In the information geometry of classical spacetime, the Fisher information metric is related to a blurred metric of a classical physical space. We first show that a local information metric can be obtained from the entanglement contour in a local subregion. This local information metric measures the fine structure of entanglement spectra inside the subregion, which suggests a quantum origin of the information-geometric blurred space. We study both the continuous and the classical limits of the quantum-originated blurred space using techniques from statistical sampling algorithms, the sampling theory of spacetime, and the projective limit. A scheme for going from a blurred space with quantum features to a classical geometry is also explored.

Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics

PeerJ ◽

10.7717/peerj.12438 ◽

2021 ◽

Vol 9 ◽

pp. e12438

Author(s):

Sebastian Höhna ◽

Michael J. Landis ◽

John P. Huelsenbeck

Keyword(s):

Open Source Software ◽

Marginal Likelihood ◽

Substantial Reduction ◽

Likelihood Estimation ◽

Marginal Likelihoods ◽

Parallelization Strategy ◽

Sampling Algorithms ◽

Primary Focus ◽

Bayesian Phylogenetic Inference ◽

Marginal Likelihood Estimation

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling and stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Although this study focuses primarily on molecular substitution models, the strategy can easily be applied to any statistical model. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction in the runtime of marginal likelihood estimations. Hence, our parallelization strategy allows marginal likelihood estimations that previously needed days, weeks or even months to complete in a feasible amount of time. The methods described here are implemented in our open-source software RevBayes, which is available from http://www.RevBayes.com.
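As a rough illustration of the stepping-stone idea mentioned in this abstract, the following Python sketch estimates a marginal likelihood for a toy model (normal data with unit variance and a normal prior on the mean), not a phylogenetic one. It is not the RevBayes implementation. The toy model, the cubic spacing of the powers, and every function name are assumptions made for this example, and exact draws from each power posterior stand in for the MCMC runs the paper describes. Each stone uses its own random generator, so the stones are independent pieces of work that could in principle be handed to separate CPUs.

```python
import math
import random


def log_likelihood(mu, data):
    """Log-likelihood of data under N(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in data)


def sample_power_posterior(beta, data, prior_var, rng):
    """Exact draw from likelihood**beta * prior (conjugate toy model, assumption)."""
    precision = 1.0 / prior_var + beta * len(data)
    mean = beta * sum(data) / precision
    return rng.gauss(mean, math.sqrt(1.0 / precision))


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stone_log_ratio(k, betas, data, prior_var, n_draws, seed):
    """Log ratio of normalising constants between powers betas[k] and betas[k + 1]."""
    rng = random.Random(seed + k)  # independent generator per stone
    step = betas[k + 1] - betas[k]
    draws = [sample_power_posterior(betas[k], data, prior_var, rng)
             for _ in range(n_draws)]
    return log_mean_exp([step * log_likelihood(mu, data) for mu in draws])


def stepping_stone(data, prior_var=1.0, n_stones=32, n_draws=2000, seed=1):
    """Stepping-stone estimate of the log marginal likelihood."""
    # Powers run from 0 (prior) to 1 (posterior), packed more densely near 0.
    betas = [(k / n_stones) ** 3 for k in range(n_stones + 1)]
    return sum(stone_log_ratio(k, betas, data, prior_var, n_draws, seed)
               for k in range(n_stones))


def exact_log_marginal(data, prior_var=1.0):
    """Closed-form log marginal likelihood of the toy model, for comparison."""
    n, s, ss = len(data), sum(data), sum(y * y for y in data)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n * prior_var)
            - 0.5 * (ss - prior_var * s * s / (1 + n * prior_var)))


data = [0.8, 1.2, 0.5, 1.9, 1.1]  # made-up observations
estimate = stepping_stone(data)
exact = exact_log_marginal(data)
print(f"stepping-stone: {estimate:.3f}  exact: {exact:.3f}")
```

Because the toy model has a closed-form answer, the estimate can be checked directly against it, which is not possible for realistic phylogenetic models.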

Computational Docking Technique for Drug Discovery: A Review

Research Journal of Pharmacy and Technology ◽

10.52711/0974-360x.2021.00968 ◽

2021 ◽

pp. 5558-5562

Author(s):

Rakhi Mishra ◽

Prem Shankar Mishra ◽

Rupa Mazumder ◽

Avijit Mazumder ◽

Anurag Chaudhary

Keyword(s):

Molecular Docking ◽

Drug Discovery ◽

Scoring Functions ◽

Binding Affinities ◽

Computational Docking ◽

Drug Discovery And Development ◽

Sampling Algorithms ◽

Important Means ◽

The Cost ◽

Receptor Targets

Computational and experimental techniques are two complementary approaches that play important roles in drug discovery and development. The time and cost of bringing a new drug to market are a concern: seven to twelve years and $1.2 billion are often cited. Furthermore, only five out of forty thousand compounds tested in animals reach human testing, and only one of five compounds reaching clinical studies is approved. This represents a large investment of time, money, and human and other resources. Therefore, new approaches are needed to facilitate, expedite and streamline drug discovery and development and to save time, money and resources. Among the many computational tools, molecular docking is one of the important means that can be used in drug discovery. It provides information about the binding affinities between small molecules (ligands) and macromolecular receptor targets (proteins). Various approaches and methodologies are cited in the literature to describe how cost and time relate to the success of drug discovery. This review introduces the available molecular docking methods, outlines a simple docking methodology, and discusses examples of drug design and discovery through computational docking, with emphasis on sampling algorithms and scoring functions, their relevant characteristics, and a summary of the types of ligand binding to receptors.

Predictive Sampling Method for Spread Models in Networks

UF Journal of Undergraduate Research ◽

10.32473/ufjur.v23i.128429 ◽

2021 ◽

Vol 23 ◽

Author(s):

Caijun Qin

Keyword(s):

Network Architecture ◽

Original Network ◽

Network Sampling ◽

Initial Node ◽

Spread Model ◽

Multiple Trials ◽

Sampling Algorithms ◽

Induced Graph ◽

Effective Use ◽

High Degree

This paper proposes a novel, exploration-based network sampling algorithm called caterpillar quota walk sampling (CQWS), inspired by the caterpillar tree. Network sampling identifies a subset of nodes and edges from a network, creating an induced graph. Beginning from an initial node, exploration-based sampling algorithms grow the induced set by traversing and tracking unvisited neighboring nodes from the original network. Tunable and trainable parameters allow CQWS to maximize the sum of the degrees of the induced graph from multiple trials when sampling dense networks. A network spread model is useful in various applications, including tracking the spread of epidemics, visualizing information transmission through social media, and modeling cell-to-cell spread of neurodegenerative diseases. CQWS generates a spread model as its sample by visiting the highest-degree neighbors of previously visited nodes. For each previously visited node, a top proportion of the highest-degree neighbors fulfills a quota and branches into a new caterpillar tree. Sampling more high-degree nodes is an objective in many applications. Many exploration-based sampling algorithms suffer drawbacks that limit the sum of degrees of visited nodes and thus the number of high-degree nodes visited. Furthermore, a strategy may not be adaptable to volatile degree frequencies throughout the original network architecture, which influences how deep into the original network an algorithm can sample. This paper analyzes CQWS in comparison to four other exploration-based network sampling algorithms in tackling these two problems by sampling sparse and dense randomly generated networks.
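The quota step described in this abstract (for each visited node, follow a top proportion of its highest-degree unvisited neighbors) can be sketched in Python as below. This is a simplified illustration, not the published CQWS: it leaves out the tunable and trainable parameters and the caterpillar-tree bookkeeping, and the function name, the `quota` and `max_nodes` parameters, and the small example graph are all assumptions made here.

```python
import math
from collections import deque


def quota_walk_sample(graph, start, quota=0.5, max_nodes=10):
    """Grow a sample from `start`, following the top `quota` share of each
    visited node's unvisited neighbours, ranked by degree in the original graph.

    `graph` maps each node to a list of neighbours (undirected, hypothetical input).
    Returns the visited nodes in visiting order and the induced edge set.
    """
    degree = {node: len(nbrs) for node, nbrs in graph.items()}
    visited = [start]
    seen = {start}
    frontier = deque([start])
    while frontier and len(visited) < max_nodes:
        node = frontier.popleft()
        # Unvisited neighbours, highest degree first; ties broken by node label.
        candidates = sorted((n for n in graph[node] if n not in seen),
                            key=lambda n: (-degree[n], n))
        take = math.ceil(quota * len(candidates))
        for nbr in candidates[:take]:
            if len(visited) >= max_nodes:
                break
            seen.add(nbr)
            visited.append(nbr)
            frontier.append(nbr)
    # Induced graph: every original edge whose two endpoints were both sampled.
    induced_edges = {(u, v) for u in visited for v in graph[u]
                     if v in seen and u < v}
    return visited, induced_edges


# Made-up undirected graph for illustration.
example_graph = {
    0: [1, 2, 3],
    1: [0, 2, 4, 5],
    2: [0, 1],
    3: [0],
    4: [1, 5],
    5: [1, 4],
}
nodes, edges = quota_walk_sample(example_graph, start=0, quota=0.5, max_nodes=4)
print(nodes, sorted(edges))
```

Ranking by degree in the original network is what biases the sample toward high-degree nodes. A uniform random choice among neighbours would not have that property.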

A novel imbalanced data classification approach using both under and over sampling

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i5.2785 ◽

2021 ◽

Vol 10 (5) ◽

pp. 2789-2795

Author(s):

Seyyed Mohammad Javadi Moghaddam ◽

Asadollah Noroozi

Keyword(s):

Sampling Methods ◽

Data Distribution ◽

Imbalanced Data ◽

Data Classification ◽

Processing Technique ◽

Imbalanced Dataset ◽

Minority Class ◽

Imbalanced Data Classification ◽

Under Sampling ◽

Sampling Algorithms

The performance of data classification suffers when the data distribution is imbalanced, because classifiers tend toward the majority class, which has most of the instances. One popular approach is to balance the dataset using over- and under-sampling methods. This paper presents a novel pre-processing technique that applies both over- and under-sampling algorithms to an imbalanced dataset. The proposed method uses the SMOTE algorithm to enlarge the minority class. Moreover, a cluster-based approach is used to shrink the majority class, taking into account the new size of the minority class. Experimental results on 10 imbalanced datasets show that the suggested algorithm performs better than previous approaches.
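The over-sampling half of this approach relies on SMOTE, which creates synthetic minority points by interpolating between a minority point and one of its nearest minority neighbours. The Python sketch below shows only that interpolation step in a minimal form. The function name, its parameters, and the example points are assumptions made here, and the cluster-based under-sampling step is not shown because the abstract does not specify it.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate `n_synthetic` points by SMOTE-style interpolation.

    `minority` is a list of equal-length numeric tuples with at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Made-up minority class: the four corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_synthetic=6)
print(new_points)
```

Every synthetic point lies on a segment between two existing minority points, so the minority class grows without duplicating instances exactly.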

Probabilistically sampled and spectrally clustered plant species using phenotypic characteristics

PeerJ ◽

10.7717/peerj.11927 ◽

2021 ◽

Vol 9 ◽

pp. e11927

Author(s):

Aditya A. Shastri ◽

Kapil Ahuja ◽

Milind B. Ratnaparkhe ◽

Yann Busnel

Keyword(s):

Computational Complexity ◽

Plant Species ◽

Clustering Algorithm ◽

Sampling Technique ◽

Phenotypic Characteristics ◽

Current Standard ◽

Phenotypic Data ◽

Breeding Programs ◽

Sampling Algorithm ◽

Sampling Algorithms

Phenotypic characteristics of a plant species refer to its physical properties as cataloged by plant biologists at different research centers around the world. Clustering species based upon their phenotypic characteristics is used to obtain diverse sets of parents that are useful in breeding programs. The Hierarchical Clustering (HC) algorithm is the current standard in clustering of phenotypic data. This algorithm suffers from low accuracy and high computational complexity. To address the accuracy challenge, we propose the use of the Spectral Clustering (SC) algorithm. To make the algorithm computationally cheap, we propose using sampling, specifically Pivotal Sampling, which is probability based. Since the application of sampling to phenotypic data has not been explored much, for effective comparison, another sampling technique called Vector Quantization (VQ) is adapted for this data as well. VQ has recently generated promising results for genotypic data. The novelty of our SC with Pivotal Sampling algorithm is in constructing the crucial similarity matrix for the clustering algorithm and defining probabilities for the sampling technique. Although our algorithm can be applied to any plant species, we tested it on the phenotypic data obtained from about 2,400 Soybean species. SC with Pivotal Sampling achieves substantially higher accuracy (in terms of Silhouette Values) than all the other proposed competing clustering-with-sampling algorithms (i.e., SC with VQ, HC with Pivotal Sampling, and HC with VQ). The complexities of our SC with Pivotal Sampling algorithm and these three variants are almost the same because of the sampling involved. In addition, SC with Pivotal Sampling outperforms the standard HC algorithm in both accuracy and computational complexity. We experimentally show that we are up to 45% more accurate than HC in terms of clustering accuracy. The computational complexity of our algorithm is more than an order of magnitude lower than that of HC.
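For readers unfamiliar with pivotal sampling, the following Python sketch shows the generic sequential pivotal method for unequal-probability sampling. It is not the paper's algorithm: how the authors define the inclusion probabilities from phenotypic data is not given in the abstract, so the probabilities below are made up, and the function name is hypothetical. The sketch assumes every inclusion probability lies strictly between 0 and 1.

```python
import random


def pivotal_sample(probs, seed=0):
    """Sequential pivotal sampling.

    `probs` are inclusion probabilities, each strictly between 0 and 1.
    Returns the indices of the selected units. Units "duel" in pairs: each duel
    settles one unit at probability 0 or 1 and carries the rest forward.
    """
    rng = random.Random(seed)
    pi = list(probs)
    pivot = 0  # index of the unit whose fate is still undecided
    for j in range(1, len(pi)):
        a, b = pi[pivot], pi[j]
        total = a + b
        if total < 1.0:
            # One unit is rejected; the other carries the combined probability.
            if rng.random() < b / total:
                pi[pivot], pi[j] = 0.0, total
                pivot = j
            else:
                pi[pivot], pi[j] = total, 0.0
        else:
            # One unit is selected; the other carries the remainder.
            if rng.random() < (1.0 - b) / (2.0 - total):
                pi[pivot], pi[j] = 1.0, total - 1.0
                pivot = j
            else:
                pi[pivot], pi[j] = total - 1.0, 1.0
    # Settle the last undecided unit with a final Bernoulli draw.
    pi[pivot] = 1.0 if rng.random() < pi[pivot] else 0.0
    return [k for k, p in enumerate(pi) if p == 1.0]


inclusion_probs = [0.2, 0.5, 0.3, 0.6, 0.4]  # made-up values summing to 2
print(pivotal_sample(inclusion_probs, seed=3))
```

When the probabilities sum to an integer, as here, the sample size is fixed at that integer up to floating-point rounding, while each unit is still included with its own probability.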

Quantum Sampling Algorithms for Near-Term Devices

Physical Review Letters ◽

10.1103/physrevlett.127.100504 ◽

2021 ◽

Vol 127 (10) ◽

Cited By ~ 1

Author(s):

Dominik S. Wild ◽

Dries Sels ◽

Hannes Pichler ◽

Cristian Zanoci ◽

Mikhail D. Lukin

Keyword(s):

Sampling Algorithms ◽

Near Term

Quantum sampling algorithms, phase transitions, and computational complexity

Physical Review A ◽

10.1103/physreva.104.032602 ◽

2021 ◽

Vol 104 (3) ◽

Author(s):

Dominik S. Wild ◽

Dries Sels ◽

Hannes Pichler ◽

Cristian Zanoci ◽

Mikhail D. Lukin

Keyword(s):

Phase Transitions ◽

Computational Complexity ◽

Sampling Algorithms

Generalized parallel tempering on Bayesian inverse problems

Statistics and Computing ◽

10.1007/s11222-021-10042-6 ◽

2021 ◽

Vol 31 (5) ◽

Author(s):

Jonas Latz ◽

Juan P. Madrigal-Cianci ◽

Fabio Nobile ◽

Raúl Tempone

Keyword(s):

Inverse Problems ◽

Continuous Time ◽

Chem Phys ◽

Parallel Tempering ◽

Sampling Efficiency ◽

Discrete Time Markov Chain ◽

Bayesian Inverse Problems ◽

State Dependent ◽

Random Walk Metropolis ◽

Sampling Algorithms

In the current work we present two generalizations of the Parallel Tempering algorithm in the context of discrete-time Markov chain Monte Carlo methods for Bayesian inverse problems. These generalizations use state-dependent swapping rates, inspired by the so-called continuous time Infinite Swapping algorithm presented in Plattner et al. (J Chem Phys 135(13):134111, 2011). We analyze the reversibility and ergodicity properties of our generalized PT algorithms. Numerical results on sampling from different target distributions show that the proposed methods significantly improve sampling efficiency over more traditional sampling algorithms such as Random Walk Metropolis, preconditioned Crank–Nicolson, and (standard) Parallel Tempering.
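For context, the baseline that this paper generalizes, standard Parallel Tempering with Random Walk Metropolis moves inside each chain, can be sketched in Python as follows. This is the plain algorithm with a fixed swap rule, not the state-dependent variants proposed by the authors. The bimodal one-dimensional target, the temperature ladder, the proposal scale, and all names are assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalised log density: equal mixture of N(-3, 0.7^2) and N(3, 0.7^2)."""
    a = -0.5 * ((x + 3.0) / 0.7) ** 2
    b = -0.5 * ((x - 3.0) / 0.7) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(rng, log_ratio):
    """Metropolis accept/reject for a given log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(n_iter=20000, temps=(1.0, 3.0, 9.0, 27.0), seed=0):
    """Return the samples of the coldest chain (temperature 1)."""
    rng = random.Random(seed)
    states = [-3.0] * len(temps)  # every chain starts in the left mode
    cold_samples = []
    for _ in range(n_iter):
        # Random Walk Metropolis update of each chain on its tempered target.
        for i, t in enumerate(temps):
            proposal = states[i] + rng.gauss(0.0, math.sqrt(t))
            if accept(rng, (log_target(proposal) - log_target(states[i])) / t):
                states[i] = proposal
        # Propose swapping the states of one random pair of adjacent chains.
        i = rng.randrange(len(temps) - 1)
        j = i + 1
        log_swap = ((1.0 / temps[i] - 1.0 / temps[j])
                    * (log_target(states[j]) - log_target(states[i])))
        if accept(rng, log_swap):
            states[i], states[j] = states[j], states[i]
        cold_samples.append(states[0])
    return cold_samples


samples = parallel_tempering()
right_share = sum(x > 0 for x in samples) / len(samples)
print(f"share of cold-chain samples in the right mode: {right_share:.2f}")
```

The hot chains cross between the modes easily, and swaps pass those crossings down to the cold chain. A single Random Walk Metropolis chain started in one mode would rarely leave it.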
