Estimating the relative proportions of SARS-CoV-2 strains from wastewater samples

Author(s):  
Lenore Pipes ◽  
Zihao Chen ◽  
Svetlana Afanaseva ◽  
Rasmus Nielsen

Wastewater surveillance has become essential for monitoring the spread of SARS-CoV-2. The quantification of SARS-CoV-2 RNA in wastewater correlates with the Covid-19 caseload in a community. However, estimating the proportions of different SARS-CoV-2 strains has remained technically difficult. We present a method for estimating the relative proportions of SARS-CoV-2 strains from wastewater samples. The method uses an initial step to remove unlikely strains, imputation of missing nucleotides using the global SARS-CoV-2 phylogeny, and an Expectation-Maximization (EM) algorithm for obtaining maximum likelihood estimates of the proportions of different strains in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions accurately reflect the true proportions given sufficiently high sequencing depth and that the phylogenetic imputation is highly accurate and substantially improves the reference database.
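The EM step for mixture proportions can be illustrated with a minimal sketch. It assumes the per-read likelihoods under each candidate strain are already known; the name `em_mixture_proportions` and the matrix `lik` are hypothetical. It is not the authors' implementation, which also includes strain filtering and phylogenetic imputation.

```python
def em_mixture_proportions(lik, n_iter=200, tol=1e-10):
    """Estimate mixture proportions by EM.

    lik[i][k] is an assumed, precomputed P(read i | strain k).
    Every row must have at least one non-zero entry.
    """
    n, k = len(lik), len(lik[0])
    p = [1.0 / k] * k  # start from uniform proportions
    for _ in range(n_iter):
        counts = [0.0] * k
        for row in lik:
            # E-step: posterior probability that this read came from each strain
            weighted = [p[j] * row[j] for j in range(k)]
            total = sum(weighted)
            for j in range(k):
                counts[j] += weighted[j] / total
        # M-step: new proportions are the average posterior memberships
        new_p = [c / n for c in counts]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Three reads that match only strain 0, one read that matches only strain 1.
example = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
proportions = em_mixture_proportions(example)
```

When reads discriminate perfectly between strains, as in the toy input, the estimate is just the fraction of reads assigned to each strain; ambiguous reads are split according to the current proportions.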

2016 ◽  
Vol 16 (2) ◽  
pp. 16-34 ◽  
Author(s):  
D. Raja Kishor ◽  
N. B. Venkateswarlu

The present work proposes a hybridization of the Expectation-Maximization (EM) and K-means techniques as an attempt to speed up the clustering process. Even though K-means and EM address different problems, K-means can be viewed as an approximate way to obtain maximum likelihood estimates for the means. Along with the proposed hybrid algorithm, the present work also experiments with the standard EM algorithm. Six datasets, three of which are synthetic, are used for the experiments. Clustering fitness and Sum of Squared Errors (SSE) are computed to measure clustering performance. In all the experiments, the proposed hybrid of EM and K-means consistently takes less execution time than the standard EM algorithm, with an acceptable clustering fitness value and a lower SSE. The proposed algorithm is also observed to produce better clustering results than the Cluster package of Purdue University.
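The relationship between K-means and EM can be sketched in one dimension: K-means assigns each point to a single nearest center, while EM spreads each point across components by posterior weight. The sketch below assumes equal mixture weights and a fixed, shared standard deviation `sigma`; the function names are hypothetical. It uses K-means only to initialize EM and is not the hybrid algorithm proposed in the paper.

```python
import math


def kmeans_1d(xs, centers, n_iter=20):
    """Hard assignment: each point goes to its nearest center (Lloyd's algorithm)."""
    centers = list(centers)
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for x in xs:
            j = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            groups[j].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def em_means_1d(xs, means, sigma=1.0, n_iter=50):
    """Soft assignment: EM for the means only (equal weights, fixed shared sigma)."""
    means = list(means)
    for _ in range(n_iter):
        num = [0.0] * len(means)
        den = [0.0] * len(means)
        for x in xs:
            # E-step: responsibilities proportional to the Gaussian density
            w = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in means]
            total = sum(w)
            for j in range(len(means)):
                r = w[j] / total
                num[j] += r * x
                den[j] += r
        # M-step: responsibility-weighted average of the data
        means = [a / b for a, b in zip(num, den)]
    return means


data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
start = kmeans_1d(data, centers=[0.0, 1.0])  # cheap initialization
refined = em_means_1d(data, start)           # EM refinement from that start
```

Starting EM from the K-means centers is one common way to cut the number of EM iterations, since the hard-assignment solution is often already close to a maximum of the likelihood.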


2018 ◽  
Vol 41 (1) ◽  
pp. 75-86
Author(s):  
Taciana Shimizu ◽  
Francisco Louzada ◽  
Adriano Suzuki

In this paper, we evaluate the efficiency of volleyball players according to their performance in attack, block and serve, while taking into account the compositional structure of the data related to these fundamentals. The finite mixture of regression models fitted the data better than the usual regression model. The maximum likelihood estimates are obtained via an EM algorithm. A simulation study reveals that the estimates are close to the true values and that the estimators are asymptotically unbiased for the parameters. A real Brazilian volleyball dataset related to the efficiency of the players is considered for the analysis.


2021 ◽  
Author(s):  
Masahiro Kuroda

Mixture models have become increasingly popular due to their modeling flexibility and are applied to the clustering and classification of heterogeneous data. The EM algorithm is widely used for maximum likelihood estimation of mixture models because it converges stably and is simple to implement. Despite these advantages, its main drawbacks are that it converges only to a local maximum and that convergence is slow. To avoid poor local maxima, the algorithm is usually run multiple times from several different initial values, which may require a large number of iterations and a long computation time to find the maximum likelihood estimates. Speeding up the computation of the EM algorithm addresses these problems. We present algorithms that accelerate the convergence of the EM algorithm and apply them to mixture model estimation. Numerical experiments examine the performance of the acceleration algorithms in terms of the number of iterations and computation time.
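The multiple-run strategy mentioned in the abstract can be sketched as follows: run EM from several random starting points and keep the fit with the highest log-likelihood. The sketch assumes a one-dimensional Gaussian mixture with equal weights and a fixed `sigma`; all function names are hypothetical. It illustrates only the restart strategy, not the acceleration algorithms the chapter proposes.

```python
import math
import random


def log_likelihood(xs, means, sigma=1.0):
    """Log-likelihood of an equal-weight Gaussian mixture with shared sigma."""
    norm = len(means) * sigma * math.sqrt(2.0 * math.pi)
    return sum(
        math.log(sum(math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in means) / norm)
        for x in xs
    )


def em_fit(xs, means, sigma=1.0, n_iter=100):
    """Plain EM updates for the component means only."""
    means = list(means)
    for _ in range(n_iter):
        num = [0.0] * len(means)
        den = [0.0] * len(means)
        for x in xs:
            w = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in means]
            total = sum(w)
            for j in range(len(means)):
                num[j] += w[j] / total * x
                den[j] += w[j] / total
        means = [a / b for a, b in zip(num, den)]
    return means


def multi_start_em(xs, k, n_starts=5, seed=0):
    """Run EM from n_starts random initializations; return (best_means, best_loglik)."""
    rng = random.Random(seed)
    best_means, best_ll = None, -math.inf
    for _ in range(n_starts):
        init = rng.sample(xs, k)  # initialize the means at distinct data points
        fitted = em_fit(xs, init)
        ll = log_likelihood(xs, fitted)
        if ll > best_ll:
            best_means, best_ll = fitted, ll
    return best_means, best_ll


data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
means, ll = multi_start_em(data, k=2)
```

Each restart costs a full EM run, which is why the total computation time grows quickly and why faster convergence per run matters.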


2014 ◽  
Vol 1049-1050 ◽  
pp. 1343-1346
Author(s):  
Yong Li

The EM algorithm is very popular in missing data analysis. However, the variance of the estimator obtained from EM is intractable. In this paper, we propose the supplemented EM algorithm for computing the variance, which does not require computation and inversion of the information matrix.


2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Qihong Duan ◽  
Xiang Chen ◽  
Dengfu Zhao ◽  
Zheng Zhao

We study a multistate model for an aging piece of equipment under condition-based maintenance and apply an expectation maximization algorithm to obtain maximum likelihood estimates of the model parameters. Because of the monitoring discontinuity, we cannot observe any state's duration. The observation consists of the equipment's state at an inspection or right after a repair. Based on a proper construction of stochastic processes involved in the model, calculation of some probabilities and expectations becomes tractable. Using these probabilities and expectations, we can apply an expectation maximization algorithm to estimate the parameters in the model. We carry out simulation studies to test the accuracy and the efficiency of the algorithm.


2020 ◽  
Vol 72 (2) ◽  
pp. 122-132
Author(s):  
Junfeng Liu ◽  
Xiaoxia Zhang

For efficiently estimating the normal mean ([Formula: see text]) under right censoring (threshold =[Formula: see text], [Formula: see text] is known), we compare two approaches within the maximum likelihood estimation (MLE) framework. Approach I is a hierarchical MLE for which only the empirical censoring probability is utilized. Approach II is the direct MLE for which the expectation-maximization (EM) algorithm is applied to all individual observations. We use a discrete approximation to explain that the asymptotic variance of the Approach II estimate equals the inverse Fisher information calculated from the full log-likelihood. We prove that Approach II gives a uniformly smaller asymptotic variance than Approach I and that the variance ratio is a decreasing function of [Formula: see text]. We further prove some supportive results and graphically demonstrate that the EM algorithm monotonically converges to the unique MLE.
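A direct EM for a right-censored normal mean can be sketched under stated assumptions: the standard deviation `sigma` and the threshold `c` are known, and each censored observation is known only to exceed `c`. The symbol names and the function `em_censored_mean` are hypothetical, since the abstract's own notation is not reproduced in this listing. The E-step replaces each censored value by its conditional expectation, and the M-step averages.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def em_censored_mean(observed, n_censored, c, sigma, mu0=0.0, n_iter=500, tol=1e-12):
    """EM estimate of a normal mean when n_censored values are only known to exceed c."""
    n = len(observed) + n_censored
    mu = mu0
    for _ in range(n_iter):
        a = (c - mu) / sigma
        # E-step: E[X | X > c] = mu + sigma * pdf(a) / (1 - cdf(a))
        imputed = mu + sigma * normal_pdf(a) / (1.0 - normal_cdf(a))
        # M-step: average the observed values and the imputed expectations
        new_mu = (sum(observed) + n_censored * imputed) / n
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


uncensored = [-1.0, 0.0, 0.5]
mu_hat = em_censored_mean(uncensored, n_censored=2, c=1.0, sigma=1.0)
```

The estimate lies above the mean of the uncensored values because the censored observations are known to fall to the right of the threshold. The sketch does not guard against `1 - cdf(a)` underflowing to zero when the threshold is far above the current mean.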


Author(s):  
Asger Hobolth ◽  
Jens Ledet Jensen

We describe statistical inference in continuous time Markov processes of DNA sequences related by a phylogenetic tree. The maximum likelihood estimator can be found by the expectation maximization (EM) algorithm and an expression for the information matrix is also derived. We provide explicit analytical solutions for the EM algorithm and information matrix.


2002 ◽  
Vol 14 (6) ◽  
pp. 1261-1266 ◽  
Author(s):  
Akihiro Minagawa ◽  
Norio Tagawa ◽  
Toshiyuki Tanaka

The expectation-maximization (EM) algorithm with split-and-merge operations (SMEM algorithm) proposed by Ueda, Nakano, Ghahramani, and Hinton (2000) is a nonlocal searching method, applicable to mixture models, for relaxing the local optimum property of the EM algorithm. In this article, we point out that the SMEM algorithm uses the acceptance-rejection evaluation method, which may pick up a distribution with smaller likelihood, and demonstrate that an increase in likelihood can then be guaranteed only by comparing log likelihoods.

