Probabilistic programming in Python using PyMC3

2016 ◽  
Vol 2 ◽  
pp. e55 ◽  
Author(s):  
John Salvatier ◽  
Thomas V. Wiecki ◽  
Christopher Fonnesbeck

Probabilistic programming allows for automatic Bayesian inference on user-defined probabilistic models. Recent advances in Markov chain Monte Carlo (MCMC) sampling allow inference on increasingly complex models. This class of MCMC, known as Hamiltonian Monte Carlo, requires gradient information which is often not readily available. PyMC3 is a new open-source probabilistic programming framework written in Python that uses Theano to compute gradients via automatic differentiation, as well as to compile probabilistic programs on the fly to C for increased speed. In contrast to other probabilistic programming languages, PyMC3 allows model specification directly in Python code. The lack of a domain-specific language allows for great flexibility and direct interaction with the model. This paper is a tutorial-style introduction to this software package.
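To make the model-specification style concrete, here is a minimal, hypothetical PyMC3 sketch (not from the paper itself): a Gaussian model with unknown mean and scale, specified directly in Python and sampled with a Hamiltonian Monte Carlo variant whose gradients Theano supplies automatically.

```python
import numpy as np
import pymc3 as pm

# Hypothetical data for illustration only
data = np.random.randn(100) + 0.5

with pm.Model() as model:
    # Priors, specified directly in Python code
    mu = pm.Normal('mu', mu=0, sd=10)
    sigma = pm.HalfNormal('sigma', sd=1)
    # Likelihood, conditioned on the observed data
    obs = pm.Normal('obs', mu=mu, sd=sigma, observed=data)
    # pm.sample defaults to NUTS, a Hamiltonian Monte Carlo variant;
    # Theano computes the required gradients via automatic differentiation
    trace = pm.sample(1000)
```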


Author(s):  
Daniel Lundén ◽  
Johannes Borgström ◽  
David Broman

Probabilistic programming is an approach to reasoning under uncertainty by encoding inference problems as programs. In order to solve these inference problems, probabilistic programming languages (PPLs) employ different inference algorithms, such as sequential Monte Carlo (SMC), Markov chain Monte Carlo (MCMC), or variational methods. Existing research on such algorithms mainly concerns their implementation and efficiency, rather than the correctness of the algorithms themselves when applied in the context of expressive PPLs. To remedy this, we give a correctness proof for SMC methods in the context of an expressive PPL calculus, representative of popular PPLs such as WebPPL, Anglican, and Birch. Previous work has studied the correctness of MCMC using an operational semantics, and the correctness of SMC and MCMC in a denotational setting without term recursion. However, for SMC inference, one of the most commonly used algorithms in PPLs today, no formal correctness proof exists in an operational setting. In particular, an open question is whether the resample locations in a probabilistic program affect the correctness of SMC. We solve this fundamental problem and make four novel contributions: (i) we extend an untyped PPL lambda calculus and operational semantics to include explicit resample terms, expressing synchronization points in SMC inference; (ii) we prove, for the first time, that subject to mild restrictions, any placement of the explicit resample terms is valid for a generic form of SMC inference; (iii) as a result of (ii), our calculus benefits from classic results from the SMC literature: a law of large numbers and an unbiased estimate of the model evidence; and (iv) we formalize the bootstrap particle filter for the calculus and discuss how our results can be further extended to other SMC algorithms.
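To ground the construct being formalized, here is a small Python sketch of a bootstrap particle filter under toy assumptions of my own (a Gaussian random-walk state-space model, not the authors' calculus), with the explicit resample step marked. The paper's result is that, under mild restrictions, where such resample points are placed does not affect correctness.

```python
import numpy as np

def bootstrap_particle_filter(ys, n_particles=1000, seed=0):
    """Bootstrap particle filter for a toy Gaussian random-walk
    state-space model (a hypothetical stand-in for a probabilistic
    program containing explicit resample terms)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)  # draws from the prior
    log_evidence = 0.0
    for y in ys:
        # Propagate each particle through the transition kernel
        particles = particles + rng.normal(0.0, 0.5, n_particles)
        # Weight by the observation likelihood N(y | x, 1)
        logw = -0.5 * (y - particles) ** 2 - 0.5 * np.log(2 * np.pi)
        # Accumulate the (unbiased) estimate of the model evidence
        m = logw.max()
        log_evidence += m + np.log(np.mean(np.exp(logw - m)))
        # Explicit resample step: the synchronization point whose
        # placement the paper proves does not affect correctness
        w = np.exp(logw - m)
        particles = particles[rng.choice(n_particles, n_particles, p=w / w.sum())]
    return particles, log_evidence

# Example: filter a short synthetic observation sequence
_, logZ = bootstrap_particle_filter(np.array([0.2, -0.1, 0.4]))
print(logZ)
```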


2022 ◽  
Vol 44 (1) ◽  
pp. 1-54
Author(s):  
Maria I. Gorinova ◽  
Andrew D. Gordon ◽  
Charles Sutton ◽  
Matthijs Vákár

A central goal of probabilistic programming languages (PPLs) is to separate modelling from inference. However, this goal is hard to achieve in practice. Users are often forced to re-write their models to improve efficiency of inference or to meet restrictions imposed by the PPL. Conditional independence (CI) relationships among parameters are a crucial aspect of probabilistic models: they capture a qualitative summary of the specified model and can facilitate more efficient inference. We present an information flow type system for probabilistic programming that captures CI relationships and show that, for a well-typed program in our system, the distribution it implements is guaranteed to have certain CI relationships. Further, by using type inference, we can statically deduce which CI properties are present in a specified model. As a practical application, we consider the problem of how to perform inference on models with mixed discrete and continuous parameters. Inference on such models is challenging in many existing PPLs, but can be improved through a workaround in which the discrete parameters are used implicitly, at the expense of manual model re-writing. We present a source-to-source, semantics-preserving transformation that uses our CI type system to automate this workaround by eliminating the discrete parameters from a probabilistic program. The resulting program can be seen as a hybrid inference algorithm on the original program, where continuous parameters can be drawn using efficient gradient-based inference methods, while the discrete parameters are inferred using variable elimination. We implement our CI type system and its example application in SlicStan, a compositional variant of Stan.
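As a toy illustration of the workaround being automated (my sketch, not SlicStan's output), here is a two-component Gaussian mixture whose discrete component indicator z is summed out by variable elimination, leaving a log-density that is differentiable in the continuous parameters and thus amenable to gradient-based inference:

```python
import numpy as np

def mixture_logp(y, mu0, mu1, w):
    """Log-density of a two-component Gaussian mixture (unit variance)
    with the discrete indicator z eliminated:
        log p(y) = log( w * N(y | mu0, 1) + (1 - w) * N(y | mu1, 1) )
    Computed with log-sum-exp for numerical stability."""
    logp0 = np.log(w)       - 0.5 * (y - mu0) ** 2 - 0.5 * np.log(2 * np.pi)
    logp1 = np.log(1.0 - w) - 0.5 * (y - mu1) ** 2 - 0.5 * np.log(2 * np.pi)
    m = np.maximum(logp0, logp1)
    return np.sum(m + np.log(np.exp(logp0 - m) + np.exp(logp1 - m)))

# The eliminated form has no discrete parameters left, so gradient-based
# samplers (e.g. Hamiltonian Monte Carlo) can target mu0, mu1, and w directly.
y = np.array([-1.9, -2.1, 2.0, 1.8])
print(mixture_logp(y, mu0=-2.0, mu1=2.0, w=0.5))
```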


2020 ◽  
Vol 2 ◽  
Author(s):  
Evdoxia Taka ◽  
Sebastian Stein ◽  
John H. Williamson

Bayesian probabilistic modeling is supported by powerful computational tools like probabilistic programming and efficient Markov chain Monte Carlo (MCMC) sampling. However, the results of Bayesian inference are challenging for users to interpret in tasks like decision-making under uncertainty or model refinement. Decision-makers need simultaneous insight into both the model's structure and its predictions, including uncertainty in inferred parameters. This enables better assessment of risk over all possible outcomes compatible with the observations, and thus more informed decisions. To support this, we see a need for visualization tools that make probabilistic programs interpretable by revealing the interdependencies in probabilistic models and their inherent uncertainty. We propose the automatic transformation of Bayesian probabilistic models, expressed in a probabilistic programming language, into an interactive graphical representation of the model's structure at varying levels of granularity, with seamless integration of uncertainty visualization. This interactive graphical representation supports exploration of the prior and posterior distributions of MCMC samples. The interpretability of Bayesian probabilistic programming models is enhanced through these interactive graphical representations, which provide human users with more informative, transparent, and explainable probabilistic models. We present a concrete implementation that translates probabilistic programs to interactive graphical representations and show illustrative examples for a variety of Bayesian probabilistic models.
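A minimal sketch of the underlying idea (my illustration, not the authors' implementation): recover the dependency structure of a simple model as a directed graph, which an interactive front end could then render at varying levels of granularity:

```python
import networkx as nx

# Hypothetical parent lists for a small Bayesian model; a real tool
# would extract these from the probabilistic program automatically.
parents = {
    'mu':    [],               # prior
    'sigma': [],               # prior
    'y':     ['mu', 'sigma'],  # likelihood depends on both parameters
}

g = nx.DiGraph()
for var, ps in parents.items():
    g.add_node(var)
    g.add_edges_from((p, var) for p in ps)

# A topological order mirrors the model's generative story
print(list(nx.topological_sort(g)))  # e.g. ['mu', 'sigma', 'y']
```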


2010 ◽  
Vol 39 ◽  
pp. 436-440
Author(s):  
Zhi Ming Qu

In recent years, much research has been devoted to the refinement of IPv6; on the other hand, few studies have investigated the confusing unification of interrupts and Internet QoS. This position paper demonstrates the emulation of interrupts. To overcome this quagmire, a novel system is presented for the intuitive unification of expert systems and massive multiplayer online role-playing games. It is concluded that erasure coding can be made heterogeneous, interposable, and event-driven, and that this approach is applicable in practice.


2019 ◽  
Author(s):  
Mathieu Fourment ◽  
Aaron E. Darling

Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models, including the general time reversible (GTR) substitution model, rate heterogeneity among sites, and a range of coalescent models, can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black-box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black-box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes-Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general-purpose probabilistic programming implementation.
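For reference, the Jukes-Cantor model the authors use for their specialized variational implementation has a closed-form transition probability matrix; a small numpy sketch (mine, not the paper's Stan code):

```python
import numpy as np

def jukes_cantor_P(d):
    """Transition probability matrix P(d) of the Jukes-Cantor (JC69)
    substitution model for branch length d, measured in expected
    substitutions per site. Closed form; an illustrative sketch."""
    e = np.exp(-4.0 * d / 3.0)
    same = 0.25 + 0.75 * e   # probability a base is unchanged
    diff = 0.25 - 0.25 * e   # probability of each specific change
    P = np.full((4, 4), diff)
    np.fill_diagonal(P, same)
    return P

print(jukes_cantor_P(0.1).sum(axis=1))  # each row sums to 1
```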

