Demographic inference from multiple whole genomes using a particle filter for continuous Markov jump processes

PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0247647
Author(s):  
Donna Henderson ◽  
Sha (Joe) Zhu ◽  
Christopher B. Cole ◽  
Gerton Lunter

Demographic events shape a population’s genetic diversity, a process described by the coalescent-with-recombination model that relates demography and genetics by an unobserved sequence of genealogies along the genome. As the space of genealogies over genomes is large and complex, inference under this model is challenging. Formulating the coalescent-with-recombination model as a continuous-time and -space Markov jump process, we develop a particle filter for such processes, and use waypoints that, under appropriate conditions, allow the problem to be reduced to the discrete-time case. To improve inference, we generalise the Auxiliary Particle Filter for discrete-time models, and use Variational Bayes to model the uncertainty in parameter estimates for rare events, avoiding biases seen with Expectation Maximization. Using real and simulated genomes, we show that past population sizes can be accurately inferred over a larger range of epochs than was previously possible, opening the possibility of jointly analyzing multiple genomes under complex demographic models. Code is available at https://github.com/luntergroup/smcsmc.
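The particle filter above operates on the full coalescent-with-recombination state space; as a much smaller illustration of the same machinery, here is a minimal bootstrap particle filter for a toy linear-Gaussian state-space model. All names and parameters below are illustrative, not part of smcsmc:

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_particle_filter(obs, n_particles=500, sigma_x=1.0, sigma_y=0.5):
    """Bootstrap particle filter for the toy model
       x_t = x_{t-1} + N(0, sigma_x^2),  y_t = x_t + N(0, sigma_y^2).
    Returns filtering means and an estimate of the log marginal likelihood."""
    x = rng.normal(0.0, sigma_x, n_particles)            # initial particle cloud
    log_lik, means = 0.0, []
    for y in obs:
        x = x + rng.normal(0.0, sigma_x, n_particles)    # propagate through the prior
        logw = (-0.5 * ((y - x) / sigma_y) ** 2
                - 0.5 * np.log(2 * np.pi * sigma_y ** 2))  # observation log-density
        m = logw.max()
        log_lik += m + np.log(np.mean(np.exp(logw - m)))   # log p(y_t | y_{1:t-1})
        w = np.exp(logw - m)
        w /= w.sum()
        means.append(np.sum(w * x))
        x = x[rng.choice(n_particles, n_particles, p=w)]   # multinomial resampling
    return np.array(means), log_lik
```

The Auxiliary Particle Filter generalised in the paper differs in that particles are pre-weighted by a lookahead to the next observation before propagation; the resampling and reweighting skeleton is the same.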

2018 ◽  
Author(s):  
Donna Henderson ◽  
Sha (Joe) Zhu ◽  
Chris Cole ◽  
Gerton Lunter

Abstract
Demographic events shape a population’s genetic diversity, a process described by the coalescent-with-recombination (CwR) model that relates demography and genetics by an unobserved sequence of genealogies. The space of genealogies over genomes is large and complex, making inference under this model challenging. We approximate the CwR with a continuous-time and -space Markov jump process. We develop a particle filter for such processes, using waypoints to reduce the problem to the discrete-time case, and generalising the Auxiliary Particle Filter for discrete-time models. We use Variational Bayes for parameter inference to model the uncertainty in parameter estimates for rare events, avoiding biases seen with Expectation Maximization. Using real and simulated genomes, we show that past population sizes can be accurately inferred over a larger range of epochs than was previously possible, opening the possibility of jointly analyzing multiple genomes under complex demographic models. Code is available at https://github.com/luntergroup/smcsmc. MSC 2010 subject classifications: primary 60G55, 62M05, 62M20, 62F15; secondary 92D25.


2014 ◽  
Vol 51 (3) ◽  
pp. 741-755
Author(s):  
Adam W. Grace ◽  
Dirk P. Kroese ◽  
Werner Sandmann

Many complex systems can be modeled via Markov jump processes. Applications include chemical reactions, population dynamics, and telecommunication networks. Rare-event estimation for such models can be difficult and is often computationally expensive, because typically many (or very long) paths of the Markov jump process need to be simulated in order to observe the rare event. We present a state-dependent importance sampling approach to this problem that is adaptive and uses Markov chain Monte Carlo to sample from the zero-variance importance sampling distribution. The method is applicable to a wide range of Markov jump processes and achieves high accuracy, while requiring only a small sample to obtain the importance parameters. We demonstrate its efficiency through benchmark examples in queueing theory and stochastic chemical kinetics.
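The adaptive, MCMC-driven change of measure described above is beyond a short sketch, but the flavour of importance sampling for rare events in a Markov jump process can be shown on a classic queueing example: estimating the probability that an M/M/1 queue, started with one customer, reaches a high level before emptying. The sketch below uses the well-known static change of measure that swaps the arrival and service rates, not the authors' adaptive state-dependent scheme; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

def ruin_prob_is(lam=1.0, mu=2.0, level=15, n_samples=5000):
    """Estimate P(queue reaches `level` before emptying | one initial customer)
    for an M/M/1 queue with arrival rate lam and service rate mu, by
    importance sampling on the embedded jump chain with swapped rates."""
    p = lam / (lam + mu)     # prob. that the next jump is an arrival (up step)
    p_is = mu / (lam + mu)   # proposal: swap arrival and service rates
    total = 0.0
    for _ in range(n_samples):
        x, w = 1, 1.0
        while 0 < x < level:
            if rng.random() < p_is:          # up step under the proposal
                x += 1
                w *= p / p_is                # likelihood-ratio correction
            else:                            # down step under the proposal
                x -= 1
                w *= (1 - p) / (1 - p_is)
        if x == level:                       # rare event observed
            total += w
    return total / n_samples
```

Under the swapped-rate proposal the rare event is no longer rare, and the likelihood-ratio weight on every successful path is the same, so the estimator has very low variance; naive simulation would need millions of paths to see the event even once.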


1973 ◽  
Vol 5 (2) ◽  
pp. 287-307 ◽  
Author(s):  
Sidney I. Resnick ◽  
Michael Rubinovitch

An extremal-F process {Y(t); t ≧ 0} is defined as the continuous time analogue of sample sequences of maxima of i.i.d. r.v.'s distributed like F, in the same way that processes with stationary independent increments (s.i.i.) are the continuous time analogue of sample sums of i.i.d. r.v.'s with an infinitely divisible distribution. Extremal-F processes are stochastically continuous Markov jump processes which traverse the interval of concentration of F. Most extremal processes of interest are broad-sense equivalent to the largest positive jump of a suitable s.i.i. process, and this, together with known results from the theory of record values, enables one to conclude that the number of jumps of Y(t) in (t1, t2] follows a Poisson distribution with parameter log(t2/t1). The time transformation t → e^t gives a new jump process whose jumps occur according to a homogeneous Poisson process of rate 1. This fact leads to information about the jump times and the inter-jump times. When F is an extreme value distribution the Y-process has special properties. The most important is that if F(x) = exp{−e^(−x)} then Y(t) has an additive structure. This structure, plus non-parametric techniques, permits a variety of conclusions about the limiting behaviour of Y(t) and its jump times.
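The claim that the number of jumps of Y(t) in (t1, t2] is Poisson with parameter log(t2/t1) has a discrete shadow in record values: among i.i.d. draws, the number of new maxima with index in (n1, n2] has mean approximately log(n2/n1) (exactly, the harmonic-sum difference H_n2 − H_n1), and is asymptotically Poisson. A quick simulation checks this; the sketch is illustrative and not from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def record_counts(n1=100, n2=10_000, trials=1000):
    """Count record (new-maximum) events among i.i.d. draws with index in
    (n1, n2]; theory predicts an approximately Poisson count with mean
    close to log(n2 / n1)."""
    counts = np.empty(trials)
    for t in range(trials):
        x = rng.random(n2)                       # i.i.d. Uniform(0, 1) draws
        running_max = np.maximum.accumulate(x)
        is_record = np.concatenate(
            ([True], running_max[1:] > running_max[:-1]))
        counts[t] = is_record[n1:].sum()         # records with index in (n1, n2]
    return counts
```

Both the sample mean and the sample variance of the counts should be near log(100) ≈ 4.6, as expected for a Poisson distribution, where mean and variance coincide.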



2019 ◽  
Author(s):  
Colin S. Gillespie ◽  
Andrew Golightly

Abstract
Rare event probabilities play an important role in the understanding of the behaviour of biochemical systems. Due to the intractability of the most natural Markov jump process representation of a system of interest, rare event probabilities are typically estimated using importance sampling. While the resulting algorithm is reasonably well developed, the problem of choosing a suitable importance density is far from straightforward. We therefore leverage recent developments on simulation of conditioned jump processes to propose an importance density that is simple to implement and requires no tuning. Our results demonstrate superior performance over some existing approaches.
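The simplest jump process one can condition exactly is the Poisson process itself: given that a unit-rate process has exactly n jumps in [0, T], the jump times are distributed as the order statistics of n independent Uniform(0, T) draws. A sketch of this textbook baseline follows; it is not the importance density proposed above, which handles general reaction networks:

```python
import numpy as np

rng = np.random.default_rng(11)

def conditioned_poisson_bridge(T, n):
    """Sample the jump times of a unit-rate Poisson process on [0, T],
    conditioned on exactly n jumps: conditionally, the jump times are the
    order statistics of n i.i.d. Uniform(0, T) draws."""
    return np.sort(rng.uniform(0.0, T, size=n))
```

For general stochastic kinetic models no such closed-form conditioning exists, which is why importance densities that approximate the conditioned hazard are needed.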



1994 ◽  
Vol 46 (06) ◽  
pp. 1238-1262 ◽  
Author(s):  
I. Iscoe ◽  
D. Mcdonald ◽  
K. Qian

Abstract
We approximate the exit distribution of a Markov jump process into a set of forbidden states and we apply these general results to an ATM multiplexor. In this case the forbidden states represent an overloaded multiplexor. Statistics for this overload or busy period are difficult to obtain since this is such a rare event. Starting from the approximate exit distribution, one may simulate the busy period without wasting simulation time waiting for the overload to occur.


Author(s):  
Andrew Golightly ◽  
Darren J. Wilkinson

Abstract
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in


2020 ◽  
Author(s):  
Zeliha Kilic ◽  
Ioannis Sgouralis ◽  
Steve Pressé

Abstract
The hidden Markov model (HMM) is a framework for time series analysis widely applied to single molecule experiments. It has traditionally been used to interpret signals generated by systems, such as single molecules, evolving in a discrete state space observed at discrete time levels dictated by the data acquisition rate. Within the HMM framework, originally developed for applications outside the Natural Sciences, such as speech recognition, transitions between states, such as molecular conformational states, are modeled as occurring at the end of each data acquisition period and are described using transition probabilities. Yet, while measurements are often performed at discrete time levels in the Natural Sciences, physical systems evolve in continuous time according to transition rates. It then follows that the modeling assumptions underlying the HMM are justified if the transition rates of a physical process from state to state are small as compared to the data acquisition rate. In other words, HMMs apply to slow kinetics. The problem is, as the transition rates are unknown in principle, it is unclear, a priori, whether the HMM applies to a particular system. For this reason, we must generalize HMMs for physical systems, such as single molecules, as these switch between discrete states in continuous time. We do so by exploiting recent mathematical tools developed in the context of inferring Markov jump processes and propose the hidden Markov jump process (HMJP). We explicitly show in what limit the HMJP reduces to the HMM. Resolving the discrete time discrepancy of the HMM has clear implications: we no longer need to assume that processes, such as molecular events, must occur on timescales slower than data acquisition and can learn transition rates even if these are on the same timescale or otherwise exceed data acquisition rates.

