A Markov chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates

2008 ◽  
Vol 17 (1) ◽  
pp. 138-162 ◽  
Author(s):  
Asger Hobolth
2018 ◽  
Vol 12 (3) ◽  
pp. 253-272 ◽  
Author(s):  
Chanseok Park

The expectation–maximization algorithm is a powerful computational technique for finding the maximum likelihood estimates for parametric models when the data are not fully observed. The expectation–maximization is best suited for situations where the expectation in each E-step and the maximization in each M-step are straightforward. A difficulty with the implementation of the expectation–maximization algorithm is that each E-step requires the integration of the log-likelihood function in closed form. The explicit integration can be avoided by using what is known as the Monte Carlo expectation–maximization algorithm. The Monte Carlo expectation–maximization uses a random sample to estimate the integral at each E-step. But the problem with the Monte Carlo expectation–maximization is that it often converges to the integral quite slowly and the convergence behavior can also be unstable, which causes computational burden. In this paper, we propose what we refer to as the quantile variant of the expectation–maximization algorithm. We prove that the proposed method has an accuracy of [Formula: see text], while the Monte Carlo expectation–maximization method has an accuracy of [Formula: see text]. Thus, the proposed method possesses faster and more stable convergence properties when compared with the Monte Carlo expectation–maximization algorithm. The improved performance is illustrated through the numerical studies. Several practical examples illustrating its use in interval-censored data problems are also provided.


Genetics ◽  
2001 ◽  
Vol 158 (2) ◽  
pp. 885-896 ◽  
Author(s):  
Rasmus Nielsen ◽  
John Wakeley

Abstract A Markov chain Monte Carlo method for estimating the relative effects of migration and isolation on genetic diversity in a pair of populations from DNA sequence data is developed and tested using simulations. The two populations are assumed to be descended from a panmictic ancestral population at some time in the past and may (or may not) after that be connected by migration. The use of a Markov chain Monte Carlo method allows the joint estimation of multiple demographic parameters in either a Bayesian or a likelihood framework. The parameters estimated include the migration rate for each population, the time since the two populations diverged from a common ancestral population, and the relative size of each of the two current populations and of the common ancestral population. The results show that even a single nonrecombining genetic locus can provide substantial power to test the hypothesis of no ongoing migration and/or to test models of symmetric migration between the two populations. The use of the method is illustrated in an application to mitochondrial DNA sequence data from a fish species: the threespine stickleback (Gasterosteus aculeatus).


Sign in / Sign up

Export Citation Format

Share Document