Geometric Variational Inference

Entropy ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. 853
Author(s):  
Philipp Frank ◽  
Reimar Leike ◽  
Torsten A. Enßlin

Efficiently accessing the information contained in non-linear and high-dimensional probability distributions remains a core challenge in modern statistics. Traditionally, estimators that go beyond point estimates are either categorized as Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC) techniques. While MCMC methods that utilize the geometric properties of continuous probability distributions to increase their efficiency have been proposed, VI methods rarely use the geometry. This work aims to fill this gap and proposes geometric Variational Inference (geoVI), a method based on Riemannian geometry and the Fisher information metric. The metric is used to construct a coordinate transformation that relates the Riemannian manifold associated with the metric to Euclidean space. The distribution, expressed in the coordinate system induced by the transformation, takes a particularly simple form that allows for an accurate variational approximation by a normal distribution. Furthermore, the algorithmic structure allows for an efficient implementation of geoVI, which is demonstrated on multiple examples, ranging from low-dimensional illustrative ones to non-linear, hierarchical Bayesian inverse problems in thousands of dimensions.
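
The core construction, a coordinate transformation that maps the metric's Riemannian manifold to Euclidean space so the transformed density becomes approximately Gaussian, can be illustrated in one dimension. The sketch below is not the geoVI algorithm itself; the lognormal target, the metric M(x) = 1/x², and the resulting transform y = log x are hypothetical choices for which the idea works out in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: lognormal(0, 0.5), clearly non-Gaussian in the original coordinates.
x = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)

def transform(x):
    # With the (hypothetical) metric M(x) = 1/x**2, the coordinate
    # transformation y(x) = integral of sqrt(M(t)) dt equals log(x),
    # and the transformed density is exactly Gaussian.
    return np.log(x)

y = transform(x)

def skew(z):
    # Sample skewness: zero for a Gaussian.
    z = (z - z.mean()) / z.std()
    return float(np.mean(z**3))

print(f"skewness before: {skew(x):+.2f}")  # strongly positive
print(f"skewness after:  {skew(y):+.2f}")  # near zero: a normal fit is accurate
```

In more than one dimension the transform has no closed form, which is where the algorithmic machinery of the paper comes in.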


2010 ◽  
Vol 58 (1) ◽  
pp. 183-195 ◽  
Author(s):  
S. Amari ◽  
A. Cichocki

Information geometry of divergence functions

Measures of divergence between two points play a key role in many engineering problems. One such measure is a distance function, but there are many important measures which do not satisfy the properties of a distance. The Bregman divergence, Kullback-Leibler divergence and f-divergence are such measures. In the present article, we study the differential-geometrical structure of a manifold induced by a divergence function. It consists of a Riemannian metric and a pair of dually coupled affine connections, which are studied in information geometry. The class of Bregman divergences is characterized by a dually flat structure, which originates from the Legendre duality. A dually flat space admits a generalized Pythagorean theorem. The class of f-divergences, defined on a manifold of probability distributions, is characterized by information monotonicity, and the Kullback-Leibler divergence belongs to the intersection of both classes. The f-divergence always gives the α-geometry, which consists of the Fisher information metric and a dual pair of ±α-connections. The α-divergence is a special class of f-divergences. It is unique, sitting at the intersection of the f-divergence and Bregman divergence classes in a manifold of positive measures. The geometry derived from the Tsallis q-entropy and related divergences is also addressed.
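
The claim that the Kullback-Leibler divergence sits in the intersection of the Bregman and f-divergence classes can be checked numerically. In this sketch the probability vectors are illustrative; the two constructions use f(t) = t log t and the negative-entropy generator φ(p) = Σᵢ pᵢ log pᵢ.

```python
import numpy as np

def kl_as_f_divergence(p, q):
    # D_f(p||q) = sum_i q_i * f(p_i / q_i) with f(t) = t * log(t)
    t = p / q
    return float(np.sum(q * t * np.log(t)))

def kl_as_bregman(p, q):
    # Bregman divergence of phi(p) = sum_i p_i log p_i (negative entropy):
    # D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>
    phi = lambda r: np.sum(r * np.log(r))
    grad = np.log(q) + 1.0
    return float(phi(p) - phi(q) - grad @ (p - q))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])
print(kl_as_f_divergence(p, q), kl_as_bregman(p, q))  # the two values agree
```

The agreement is exact (up to floating point), since both expressions reduce algebraically to Σᵢ pᵢ log(pᵢ/qᵢ).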


Author(s):  
Michael J. Rothenberger ◽  
Hosam K. Fathy

This paper examines the challenge of shaping a battery’s input trajectory to (i) maximize its Fisher parameter identifiability while (ii) achieving robustness to parameter uncertainties. The paper is motivated by earlier research showing that the speed and accuracy with which battery parameters can be estimated both improve significantly when battery inputs are optimized for Fisher identifiability. Previous research performs this trajectory optimization for a known nominal parameter set. This creates a tautology where accurate parameter identification is a prerequisite for Fisher identifiability optimization. In contrast, this paper presents an iterative scheme that: (i) uses prior parameter probability distributions to create a weighted Fisher metric; (ii) optimizes the battery input trajectory for this metric using a genetic algorithm; (iii) applies the resulting input trajectory to the battery; (iv) estimates battery parameters using a Bayesian particle filter; (v) re-computes the weighted Fisher information metric using the resulting posterior parameter distribution; and (vi) repeats this process until convergence. This approach builds on well-established ideas from the estimation literature, and applies them to the battery domain for the first time. Simulation studies highlight the ability of this iterative algorithm to converge quickly towards the correct battery parameter values, despite large initial parameter uncertainties.
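
Steps (i)-(ii) of the loop can be sketched on a stand-in model. The first-order system, the prior, the candidate input frequencies, and the grid search (in place of the paper's genetic algorithm) below are all illustrative assumptions, not the battery model from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, u, dt=0.1):
    # First-order stand-in for the battery model: x' = -a*x + b*u(t)
    a, b = theta
    x, out = 0.0, []
    for uk in u:
        x += dt * (-a * x + b * uk)
        out.append(x)
    return np.array(out)

def fisher(theta, u, sigma=0.05, eps=1e-5):
    # Fisher information for additive Gaussian output noise:
    # F = S^T S / sigma^2, where S holds the output sensitivities
    # to the parameters (computed here by finite differences).
    y0 = simulate(theta, u)
    S = np.stack([(simulate(theta + eps * np.eye(2)[i], u) - y0) / eps
                  for i in range(2)], axis=1)
    return S.T @ S / sigma**2

t = np.arange(0.0, 10.0, 0.1)

# Step (i): weight the Fisher metric by samples from the prior over (a, b).
prior = rng.normal([1.0, 0.5], 0.1, size=(20, 2))

# Step (ii): the paper optimizes the input with a genetic algorithm;
# a grid over sinusoid frequencies stands in for it here.
candidates = [0.2, 0.5, 1.0, 2.0]

def score(omega):
    u = np.sin(omega * t)
    return np.mean([np.linalg.slogdet(fisher(th, u))[1] for th in prior])

best = max(candidates, key=score)
print("most informative input frequency:", best)
```

Steps (iii)-(vi) would then apply the chosen input, update the parameter posterior with a particle filter, and recompute the weights.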


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e8272 ◽  
Author(s):  
Mathieu Fourment ◽  
Aaron E. Darling

Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes–Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation.
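
The flavor of black box variational inference used by Stan's engine can be illustrated on a toy target. The Normal(2, 0.5) "posterior", learning rate, and sample counts below are hypothetical; a real phylogenetic model would only change the log-density whose gradient is plugged in.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unnormalized "posterior": log p(x) = -(x - 2)^2 / (2 * 0.5^2) + const,
# so the optimal mean-field fit is exactly Normal(2, 0.5).
logp_grad = lambda x: -(x - 2.0) / 0.25

mu, log_sigma = 0.0, 0.0             # variational parameters of q
lr, S = 0.05, 32                     # step size, Monte Carlo samples per step
for _ in range(2000):
    eps = rng.normal(size=S)
    x = mu + np.exp(log_sigma) * eps # reparameterized draws from q
    g = logp_grad(x)
    mu += lr * g.mean()              # ELBO gradient w.r.t. mu
    # ELBO gradient w.r.t. log sigma: reparameterization term + entropy term
    log_sigma += lr * ((g * eps).mean() * np.exp(log_sigma) + 1.0)

print(f"mu={mu:.2f} sigma={np.exp(log_sigma):.2f}")  # close to 2.0 and 0.5
```

Only stochastic gradients of the log-density are needed, which is what makes the method "black box".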


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5966
Author(s):  
Ke Wang ◽  
Gong Zhang

The challenge of small data has emerged in synthetic aperture radar automatic target recognition (SAR-ATR) problems. Most SAR-ATR methods are data-driven and require large amounts of training data that are expensive to collect. To address this challenge, we propose a recognition model that incorporates meta-learning and amortized variational inference (AVI). Specifically, the model consists of global parameters and task-specific parameters. The global parameters, trained by meta-learning, construct a common feature extractor shared between all recognition tasks. The task-specific parameters, modeled by probability distributions, can adapt to new tasks with a small amount of training data. To reduce the computation and storage cost, the task-specific parameters are inferred by AVI implemented with set-to-set functions. Extensive experiments were conducted on a real SAR dataset to evaluate the effectiveness of the model. Compared with the latest SAR-ATR methods, the proposed model shows superior performance, especially on recognition tasks with limited data.


Author(s):  
David Barber

Finding clusters of well-connected nodes in a graph is a problem common to many domains, including social networks, the Internet and bioinformatics. From a computational viewpoint, finding these clusters or graph communities is a difficult problem. We use a clique matrix decomposition based on a statistical description that encourages clusters to be well connected and few in number. The formal intractability of inferring the clusters is addressed using a variational approximation inspired by mean-field theories in statistical mechanics. Clique matrices also play a natural role in parametrizing positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parametrize all positive definite matrices restricted according to a decomposable graph and form a structured factor analysis approximation in the non-decomposable case. Extensions to conjugate Bayesian covariance priors and more general non-Gaussian independence models are briefly discussed.


2018 ◽  
Vol 25 (3) ◽  
pp. 565-587 ◽  
Author(s):  
Mohamed Jardak ◽  
Olivier Talagrand

Abstract. Data assimilation is considered as a problem in Bayesian estimation, viz. determine the probability distribution for the state of the observed system, conditioned by the available data. In the linear and additive Gaussian case, a Monte Carlo sample of the Bayesian probability distribution (which is Gaussian and known explicitly) can be obtained by a simple procedure: perturb the data according to the probability distribution of their own errors, and perform an assimilation on the perturbed data. The performance of that approach, called here ensemble variational assimilation (EnsVAR), also known as ensemble of data assimilations (EDA), is studied in this two-part paper on the non-linear low-dimensional Lorenz-96 chaotic system, with the assimilation being performed by the standard variational procedure. In this first part, EnsVAR is implemented first, for reference, in a linear and Gaussian case, and then in a weakly non-linear case (assimilation over 5 days of the system). The performances of the algorithm, considered either as a probabilistic or a deterministic estimator, are very similar in the two cases. Additional comparison shows that the performance of EnsVAR is better, both in the assimilation and forecast phases, than that of standard algorithms for the ensemble Kalman filter (EnKF) and particle filter (PF), although at a higher cost. Globally similar results are obtained with the Kuramoto–Sivashinsky (K–S) equation.
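
In the linear Gaussian case described above, each variational analysis has a closed form, so the perturb-and-assimilate recipe can be verified directly: the analysis ensemble should be a sample from the exact Bayesian posterior. The dimensions, covariances, and observation operator in this sketch are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear Gaussian toy problem (all sizes and covariances are illustrative)
n, m, N = 3, 2, 50_000               # state dim, obs dim, ensemble size
H = rng.normal(size=(m, n))          # observation operator
B = np.diag([1.0, 0.5, 2.0])         # background-error covariance
R = 0.2 * np.eye(m)                  # observation-error covariance
xb = np.zeros(n)                     # background state
y = H @ np.array([1.0, -1.0, 0.5]) + rng.multivariate_normal(np.zeros(m), R)

# Each variational analysis has the closed form xa = xb + K (y - H xb)
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

# EnsVAR/EDA: perturb background and observations by their own errors,
# assimilate each perturbed pair, and collect the analyses.
Xb = rng.multivariate_normal(xb, B, size=N)
Y = y + rng.multivariate_normal(np.zeros(m), R, size=N)
Xa = Xb + (Y - Xb @ H.T) @ K.T

# The ensemble covariance matches the exact Bayesian posterior (I - KH) B.
A_exact = (np.eye(n) - K @ H) @ B
print(np.allclose(np.cov(Xa.T), A_exact, atol=0.05))  # True
```

The identity behind the check is (I - KH) B (I - KH)ᵀ + K R Kᵀ = (I - KH) B, which holds for the Kalman gain K.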


2000 ◽  
Author(s):  
Paulo B. Gonçalves ◽  
Zenón J. G. N. Del Prado

Abstract This paper discusses the dynamic instability of circular cylindrical shells subjected to time-dependent axial edge loads of the form P(t) = P0 + P1(t), where the dynamic component P1(t) is periodic in time and P0 is a uniform compressive load. In the present paper a low-dimensional model, which retains the essential non-linear terms, is used to study the non-linear oscillations and instabilities of the shell. For this, Donnell’s shallow shell equations are used together with the Galerkin method to derive a set of coupled non-linear ordinary differential equations of motion, which are, in turn, solved by the Runge-Kutta method. To study the non-linear behavior of the shell, several numerical strategies were used to obtain Poincaré maps, stable and unstable fixed points, bifurcation diagrams and basins of attraction. Particular attention is paid to two dynamic instability phenomena that may arise under these loading conditions: parametric instability and escape from the pre-buckling potential well. The numerical results obtained from this investigation clarify the conditions that control whether or not instability occurs. This may help in establishing proper design criteria for these shells under dynamic loads, a topic practically unexplored in the literature.


Author(s):  
Diana Mateus ◽  
Christian Wachinger ◽  
Selen Atasoy ◽  
Loren Schwarz ◽  
Nassir Navab

Computer aided diagnosis is often confronted with processing and analyzing high-dimensional data. One alternative to deal with such data is dimensionality reduction. This chapter focuses on manifold learning methods to create low-dimensional data representations adapted to a given application. From pairwise non-linear relations between neighboring data-points, manifold learning algorithms first approximate the low-dimensional manifold where the data live with a graph; then, they find a non-linear map to embed this graph into a low-dimensional space. Since the explicit pairwise relations and the neighborhood system can be designed according to the application, manifold learning methods are very flexible and allow easy incorporation of domain knowledge. The authors describe different assumptions and design elements that are crucial to building successful low-dimensional data representations with manifold learning for a variety of applications. In particular, they discuss examples for visualization, clustering, classification, registration, and human-motion modeling.
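
The two-step recipe (approximate the manifold with a graph, then embed the graph non-linearly) can be sketched with Laplacian eigenmaps as one concrete instance. The noisy-circle data, neighborhood size, and Gaussian edge weights are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data on a 1-D manifold (a noisy circle) embedded in 2-D.
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.01 * rng.normal(size=(200, 2))

# Step 1: approximate the manifold with a k-nearest-neighbour graph
# whose edges carry Gaussian weights built from pairwise distances.
D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
k = 8
W = np.zeros_like(D)
for i in range(len(X)):
    for j in np.argsort(D[i])[1:k + 1]:   # skip index 0 (the point itself)
        W[i, j] = W[j, i] = np.exp(-D[i, j] ** 2)

# Step 2: embed the graph using the eigenvectors of the graph Laplacian
# with the smallest non-zero eigenvalues.
L = np.diag(W.sum(axis=1)) - W
vals, vecs = np.linalg.eigh(L)
embedding = vecs[:, 1:3]   # 2-D embedding (skip the constant eigenvector)
print(embedding.shape)     # (200, 2)
```

Swapping the weights, the neighborhood system, or the embedding criterion yields the other members of the manifold-learning family the chapter surveys.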

