Probabilistic Programming Languages

Probabilistic programming is an approach to reasoning under uncertainty by encoding inference problems as programs. In order to solve these inference problems, probabilistic programming languages (PPLs) employ different inference algorithms, such as sequential Monte Carlo (SMC), Markov chain Monte Carlo (MCMC), or variational methods. Existing research on such algorithms mainly concerns their implementation and efficiency, rather than the correctness of the algorithms themselves when applied in the context of expressive PPLs. To remedy this, we give a correctness proof for SMC methods in the context of an expressive PPL calculus, representative of popular PPLs such as WebPPL, Anglican, and Birch. Previous work has studied correctness of MCMC using an operational semantics, and correctness of SMC and MCMC in a denotational setting without term recursion. However, for SMC inference, one of the most commonly used algorithms in PPLs as of today, no formal correctness proof exists in an operational setting. In particular, an open question is whether the resample locations in a probabilistic program affect the correctness of SMC. We solve this fundamental problem, and make four novel contributions: (i) we extend an untyped PPL lambda calculus and operational semantics to include explicit resample terms, expressing synchronization points in SMC inference; (ii) we prove, for the first time, that subject to mild restrictions, any placement of the explicit resample terms is valid for a generic form of SMC inference; (iii) as a result of (ii), our calculus benefits from classic results from the SMC literature: a law of large numbers and an unbiased estimate of the model evidence; and (iv) we formalize the bootstrap particle filter for the calculus and discuss how our results can be further extended to other SMC algorithms.
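The role of resample points and the evidence estimate mentioned in the abstract above can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions: the Gaussian random-walk model, the function names, and the choice to resample after every observation are all hypothetical and are not taken from the paper or its calculus.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def log_mean_exp(log_ws):
    """Numerically stable log of the mean of exp(log_ws)."""
    m = max(log_ws)
    return m + math.log(sum(math.exp(lw - m) for lw in log_ws) / len(log_ws))


def bootstrap_particle_filter(observations, n_particles, rng):
    """Bootstrap particle filter for an assumed toy model:
    x_0 ~ N(0, 1), x_t = x_{t-1} + N(0, 1), y_t ~ N(x_t, 1).
    Returns (log of the evidence estimate, mean of the final particles)."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_evidence = 0.0
    for y in observations:
        # Propagate each particle through the transition model (the proposal).
        particles = [x + rng.gauss(0.0, 1.0) for x in particles]
        # Weight each particle by the likelihood of the observation.
        log_ws = [normal_logpdf(y, x, 1.0) for x in particles]
        # Accumulate the evidence estimate: product of average weights.
        log_evidence += log_mean_exp(log_ws)
        # Resample point: all particles synchronize here.
        m = max(log_ws)
        weights = [math.exp(lw - m) for lw in log_ws]
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return log_evidence, sum(particles) / n_particles


if __name__ == "__main__":
    log_z, mean = bootstrap_particle_filter([0.5, 1.0, 1.5], 1000, random.Random(0))
    print("log evidence estimate:", log_z)
    print("filtered mean:", mean)
```

Note that it is the evidence estimate itself (the exponential of the returned value) that is unbiased in the classic SMC results; its logarithm is not.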

Conditional Independence by Typing

ACM Transactions on Programming Languages and Systems ◽

10.1145/3490421 ◽

2022 ◽

Vol 44 (1) ◽

pp. 1-54

Author(s):

Maria I. Gorinova ◽

Andrew D. Gordon ◽

Charles Sutton ◽

Matthijs Vákár

Keyword(s):

Programming Languages ◽

Conditional Independence ◽

Probabilistic Models ◽

Type System ◽

Type Inference ◽

Practical Application ◽

Probabilistic Programming ◽

Variable Elimination ◽

Gradient Based ◽

Inference Methods

A central goal of probabilistic programming languages (PPLs) is to separate modelling from inference. However, this goal is hard to achieve in practice. Users are often forced to re-write their models to improve efficiency of inference or meet restrictions imposed by the PPL. Conditional independence (CI) relationships among parameters are a crucial aspect of probabilistic models that capture a qualitative summary of the specified model and can facilitate more efficient inference. We present an information flow type system for probabilistic programming that captures conditional independence (CI) relationships and show that, for a well-typed program in our system, the distribution it implements is guaranteed to have certain CI-relationships. Further, by using type inference, we can statically deduce which CI-properties are present in a specified model. As a practical application, we consider the problem of how to perform inference on models with mixed discrete and continuous parameters. Inference on such models is challenging in many existing PPLs, but can be improved through a workaround, where the discrete parameters are used implicitly, at the expense of manual model re-writing. We present a source-to-source semantics-preserving transformation, which uses our CI-type system to automate this workaround by eliminating the discrete parameters from a probabilistic program. The resulting program can be seen as a hybrid inference algorithm on the original program, where continuous parameters can be drawn using efficient gradient-based inference methods, while the discrete parameters are inferred using variable elimination. We implement our CI-type system and its example application in SlicStan: a compositional variant of Stan.
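The manual workaround that the abstract above describes, summing a discrete parameter out of the log density so that only continuous parameters remain, can be sketched for the simplest case. The mixture model, parameter names, and functions below are assumptions chosen for illustration; they are not SlicStan code and do not reproduce the paper's transformation or its variable elimination procedure.

```python
import math


def normal_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_log_likelihood(ys, theta, mus, sigma):
    """Assumed model: z_i ~ Categorical(theta), y_i ~ N(mus[z_i], sigma).
    Each discrete z_i is summed out, so the result depends only on the
    continuous parameters (theta, mus, sigma)."""
    total = 0.0
    for y in ys:
        terms = [math.log(p) + normal_logpdf(y, mu, sigma) for p, mu in zip(theta, mus)]
        total += log_sum_exp(terms)
    return total


def component_posterior(y, theta, mus, sigma):
    """Recover p(z = k | y, continuous parameters) after marginalization."""
    terms = [math.log(p) + normal_logpdf(y, mu, sigma) for p, mu in zip(theta, mus)]
    norm = log_sum_exp(terms)
    return [math.exp(t - norm) for t in terms]


if __name__ == "__main__":
    theta, mus, sigma = [0.5, 0.5], [-3.0, 3.0], 1.0
    print(marginal_log_likelihood([2.8, -3.1, 3.3], theta, mus, sigma))
    print(component_posterior(2.8, theta, mus, sigma))
```

The marginal log-likelihood is a smooth function of the continuous parameters, which is what a gradient-based sampler requires; the discrete assignments are recovered afterwards from `component_posterior`.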

MCMC Estimation of Conditional Probabilities in Probabilistic Programming Languages

Lecture Notes in Computer Science - Symbolic and Quantitative Approaches to Reasoning with Uncertainty ◽

10.1007/978-3-642-39091-3_37 ◽

2013 ◽

pp. 436-448 ◽

Cited By ~ 2

Author(s):

Bogdan Moldovan ◽

Ingo Thon ◽

Jesse Davis ◽

Luc de Raedt

Keyword(s):

Programming Languages ◽

Conditional Probabilities ◽

Probabilistic Programming ◽

MCMC Estimation

Universal probabilistic programming offers a powerful approach to statistical phylogenetics

Communications Biology ◽

10.1038/s42003-021-01753-7 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Fredrik Ronquist ◽

Jan Kudlicka ◽

Viktor Senderov ◽

Johannes Borgström ◽

Nicolas Lartillot ◽

...

Keyword(s):

Programming Languages ◽

Graphical Models ◽

Sequential Monte Carlo ◽

Full Range ◽

Efficient Estimation ◽

Probabilistic Programming ◽

Automated Generation ◽

Inference Algorithms ◽

Powerful Approach ◽

Inference Strategy

Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.

Universal probabilistic programming offers a powerful approach to statistical phylogenetics

10.1101/2020.06.16.154443 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fredrik Ronquist ◽

Jan Kudlicka ◽

Viktor Senderov ◽

Johannes Borgström ◽

Nicolas Lartillot ◽

...

Keyword(s):

Programming Languages ◽

Graphical Models ◽

Sequential Monte Carlo ◽

Full Range ◽

Efficient Estimation ◽

Probabilistic Programming ◽

Automated Generation ◽

Inference Algorithms ◽

Powerful Approach ◽

Inference Strategy

Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.

Comparison of probabilistic programming languages, using the solution of clustering problems and the distinguishing of attributes as an example

Journal of Optical Technology ◽

10.1364/jot.82.000542 ◽

2015 ◽

Vol 82 (8) ◽

pp. 542

Author(s):

V. I. Filatov ◽

A. S. Potapov

Keyword(s):

Programming Languages ◽

Probabilistic Programming ◽

Clustering Problems

Implementing a Library for Probabilistic Programming Using Non-strict Non-determinism

Theory and Practice of Logic Programming ◽

10.1017/s1471068419000085 ◽

2019 ◽

Vol 20 (1) ◽

pp. 147-175 ◽

Cited By ~ 1

Author(s):

Sandra Dylus ◽

Jan Christiansen ◽

Finn Teegen

Keyword(s):

Logic Programming ◽

Programming Languages ◽

Programming Language ◽

Probabilistic Choice ◽

Probabilistic Programming ◽

Logic Programming Language ◽

Functional Logic Programming ◽

Functional Logic ◽

Language Characteristics ◽

Standard List

This paper presents PFLP, a library for probabilistic programming in the functional logic programming language Curry. It demonstrates how the concepts of a functional logic programming language support the implementation of a library for probabilistic programming. In fact, the paradigms of functional logic and probabilistic programming are closely connected. That is, language characteristics from one area exist in the other and vice versa. For example, the concepts of non-deterministic choice and call-time choice as known from functional logic programming are related to and coincide with stochastic memoization and probabilistic choice in probabilistic programming, respectively. We will further see that an implementation based on the concepts of functional logic programming can have benefits with respect to performance compared to a standard list-based implementation and can even compete with full-blown probabilistic programming languages, which we illustrate by several benchmarks.
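For readers unfamiliar with the "standard list-based implementation" that the abstract above uses as its baseline, the general idea can be sketched as a distribution represented by an explicit list of (value, probability) pairs. The sketch below is in Python rather than Curry, and every name in it is hypothetical; it does not reproduce PFLP's API or the benchmarked baseline, only the common list-of-pairs technique.

```python
from fractions import Fraction


def uniform(values):
    """Uniform distribution as an explicit list of (value, probability) pairs."""
    values = list(values)
    p = Fraction(1, len(values))
    return [(v, p) for v in values]


def certainly(value):
    """Distribution that yields a single value with probability 1."""
    return [(value, Fraction(1))]


def bind(dist, f):
    """Sequence a distribution with a function returning a distribution.
    Every combination of outcomes is enumerated eagerly, which is the
    cost a list-based representation pays."""
    return [(w, p * q) for v, p in dist for w, q in f(v)]


def probability(dist, predicate):
    """Total probability mass of the outcomes satisfying predicate."""
    return sum((p for v, p in dist if predicate(v)), Fraction(0))


# Sum of two fair dice, built by enumerating all 36 outcome pairs.
die = uniform(range(1, 7))
two_dice = bind(die, lambda a: bind(die, lambda b: certainly(a + b)))
p_seven = probability(two_dice, lambda total: total == 7)

if __name__ == "__main__":
    print(p_seven)  # prints 1/6
```

Because `bind` materializes every branch before any query is asked, the list grows multiplicatively with each choice, which gives some intuition for why an approach that explores choices non-strictly might perform better.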
