Automatic determination of the number of components in the EM algorithm of restoration of a mixture of normal distributions

2010 ◽  
Vol 50 (4) ◽  
pp. 733-746 ◽  
Author(s):  
D. P. Vetrov ◽  
D. A. Kropotov ◽  
A. A. Osokin
2002 ◽  
Vol 66 (3) ◽  
pp. 183-193 ◽  
Author(s):  
Y. KITAMURA ◽  
M. MORIGUCHI ◽  
H. KANEKO ◽  
H. MORISAKI ◽  
T. MORISAKI ◽  
...  

2017 ◽  
Vol 27 (12) ◽  
pp. 3835-3838
Author(s):  
Iain L MacDonald

I comment here on a recent paper in this journal, on the fitting of truncated normal distributions by the EM algorithm. I show that the fitting of such distributions by direct numerical maximization of likelihood (rather than EM) is straightforward, contrary to an assertion made by the authors of that paper.
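As an illustration of what direct numerical maximization of the likelihood can look like for a normal distribution truncated to a known interval [a, b], here is a minimal sketch. It is not the code from the paper: the function names, the sample data, and the choice of a simple compass (pattern) search as a stand-in for a general-purpose optimizer are all assumptions made for this example.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def trunc_normal_loglik(mu, log_sigma, data, a, b):
    """Log-likelihood of a normal(mu, sigma) truncated to [a, b]."""
    sigma = math.exp(log_sigma)  # optimise log(sigma) so sigma stays positive
    mass = norm_cdf((b - mu) / sigma) - norm_cdf((a - mu) / sigma)
    if mass <= 0.0:
        return -math.inf  # numerically no probability mass inside [a, b]
    ll = 0.0
    for x in data:
        z = (x - mu) / sigma
        ll += -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)
    return ll - len(data) * math.log(mass)

def fit_truncated_normal(data, a, b, tol=1e-8, max_iter=10000):
    """Maximise the log-likelihood directly with a compass search."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    params = [mean, 0.5 * math.log(var)]  # start from the untruncated estimates
    best = trunc_normal_loglik(params[0], params[1], data, a, b)
    step = 1.0
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(2):               # coordinate 0 is mu, 1 is log(sigma)
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = trunc_normal_loglik(trial[0], trial[1], data, a, b)
                if value > best:
                    params, best, improved = trial, value, True
        if not improved:
            step *= 0.5                  # shrink the step when no move helps
    return params[0], math.exp(params[1]), best

# Hypothetical sample assumed to be observed only inside [0, 4].
data = [1.2, 1.5, 1.7, 1.9, 2.0, 2.1, 2.3, 2.5, 2.8]
mu_hat, sigma_hat, loglik = fit_truncated_normal(data, 0.0, 4.0)
print(mu_hat, sigma_hat, loglik)
```

In practice a library optimizer would normally replace the hand-written search; the point of the sketch is only that the truncated-normal log-likelihood is an ordinary function of two parameters. For some samples the maximum may not exist at finite parameter values, which is why the loop carries an iteration cap.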


2016 ◽  
Vol 144 (10) ◽  
pp. 3783-3798 ◽  
Author(s):  
Kenneth R. Knapp ◽  
Jessica L. Matthews ◽  
James P. Kossin ◽  
Christopher C. Hennon

The Cyclone Center project maintains a website that allows visitors to answer questions based on tropical cyclone satellite imagery. The goal is to provide a reanalysis of satellite-derived tropical cyclone characteristics from a homogeneous historical database composed of satellite imagery with a common spatial resolution for use in long-term, global analyses. The determination of the cyclone “type” (curved band, eye, shear, etc.) is a starting point for this process. This analysis shows how multiple classifications of a single image are combined to provide probabilities of a particular image’s type using an expectation–maximization (EM) algorithm. Analysis suggests that the project needs about 10 classifications of an image to adequately determine the storm type. The algorithm is capable of characterizing classifiers with varying levels of expertise, though the project needs about 200 classifications to quantify an individual’s precision. The EM classifications are compared with an objective algorithm, satellite fix data, and the classifications of a known classifier. The EM classifications compare well, with best agreement for eye and embedded center storm types and less agreement for shear and when convection is too weak (termed no-storm images). Both the EM algorithm and the known classifier showed similar tendencies when compared against an objective algorithm. The EM algorithm also fared well when compared to tropical cyclone fix datasets, having higher agreement with embedded centers and less agreement for eye images. The results were used to show the distribution of storm types versus wind speed during a storm’s lifetime.
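The idea of combining several classifications of one image into class probabilities, while also estimating how reliable each classifier is, can be sketched as follows. The abstract does not say which formulation the project uses; this example assumes a Dawid-Skene-style model (one confusion matrix per classifier), and the function name, the smoothing constant, and the toy votes are hypothetical.

```python
import math

def em_label_aggregation(votes, n_classes, n_iter=50, smoothing=0.01):
    """votes: list of (item, classifier, label) triples.
    Returns per-item class posteriors and per-classifier confusion matrices."""
    items = sorted({i for i, _, _ in votes})
    workers = sorted({w for _, w, _ in votes})
    # Initialise posteriors with vote fractions (a soft majority vote).
    post = {i: [0.0] * n_classes for i in items}
    for i, _, label in votes:
        post[i][label] += 1.0
    for i in items:
        total = sum(post[i])
        post[i] = [p / total for p in post[i]]
    conf = {}
    for _ in range(n_iter):
        # M-step: class priors and confusion matrices conf[w][true][reported].
        prior = [sum(post[i][k] for i in items) / len(items)
                 for k in range(n_classes)]
        conf = {w: [[smoothing] * n_classes for _ in range(n_classes)]
                for w in workers}
        for i, w, label in votes:
            for k in range(n_classes):
                conf[w][k][label] += post[i][k]
        for w in workers:
            for k in range(n_classes):
                row_sum = sum(conf[w][k])
                conf[w][k] = [c / row_sum for c in conf[w][k]]
        # E-step: posterior over the true class of each item (in log space).
        logp = {i: [math.log(max(prior[k], 1e-12)) for k in range(n_classes)]
                for i in items}
        for i, w, label in votes:
            for k in range(n_classes):
                logp[i][k] += math.log(conf[w][k][label])
        for i in items:
            m = max(logp[i])
            exps = [math.exp(v - m) for v in logp[i]]
            s = sum(exps)
            post[i] = [e / s for e in exps]
    return post, conf

# Toy data: classifiers A, B, C report the assumed true type; D always says 0.
truth = [0, 1, 0, 1, 1, 0]
votes = []
for item, t in enumerate(truth):
    votes += [(item, "A", t), (item, "B", t), (item, "C", t), (item, "D", 0)]
# Item 6 has only two conflicting votes, so a plain majority vote is a tie.
votes += [(6, "A", 1), (6, "D", 0)]

posteriors, confusion = em_label_aggregation(votes, n_classes=2)
print(posteriors[6], confusion["D"])
```

On the tied item the sketch leans towards classifier A, because the estimated confusion matrix for D shows that D reports class 0 regardless of the true class.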


2021 ◽  
Vol 14 (2) ◽  
pp. 4-12
Author(s):  
Svetlana Evdokimova ◽  
Aleksandr Zhuravlev ◽  
Tatyana Novikova

This paper uses clustering methods to analyze the buyers of the BigCar store, which sells spare parts for trucks. The k-means, g-means, and EM algorithms and the construction of Kohonen networks are considered, all implemented on the Loginom Community analytical platform. Based on 3 years of sales data, buyers are divided into 3 clusters using the k-means and EM algorithms and a self-organizing Kohonen network. The EM algorithm with automatic determination of the number of clusters and the g-means algorithm were also run, dividing buyers into 9 and 10 clusters. Analysis of the resulting clusters showed that the k-means and Kohonen results are better suited to increasing sales efficiency.


2010 ◽  
Vol 39 ◽  
pp. 151-156
Author(s):  
Jing Hua Bai ◽  
Kan Li ◽  
Xiao Xian Zhang

We propose a genetic-based, deterministic annealing expectation-maximization (GA&DA-EM) algorithm for learning Dirichlet mixture models from multivariate data. The algorithm selects the number of components of the model using the minimum description length (MDL) criterion. Our approach combines the properties of genetic algorithms and deterministic annealing in a single procedure. The population-based stochastic search of GA&DA explores the search space more thoroughly than the EM method. As a result, the algorithm is less sensitive to its initialization and can escape local optima. The GA&DA-EM algorithm is elitist, which maintains the monotonic convergence property of the EM algorithm. We conducted experiments on the WebKB and 20NEWSGROUPS datasets. The results show that 1) GA&DA-EM outperforms the EM method, identifying the number of components used to generate the underlying data more often than the EM algorithm, and 2) the algorithm is an alternative to EM that overcomes the challenge of local maxima.
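For reference, the MDL criterion for choosing the number of mixture components is commonly written in the two-part form below. The abstract does not give the exact expression used in the paper, so the symbols are assumptions: X is a sample of n observations, \hat{\theta}_M is the maximum-likelihood estimate for a mixture with M components, N_M is its number of free parameters, and d is the data dimension.

```latex
\mathrm{MDL}(M) = -\log p\left(X \mid \hat{\theta}_M\right) + \frac{N_M}{2}\,\log n,
\qquad
\hat{M} = \arg\min_{M} \mathrm{MDL}(M)
% For a mixture of M Dirichlet components in d dimensions
% (d parameters per component plus M - 1 free mixing weights):
% N_M = M d + (M - 1) = M (d + 1) - 1
```

The first term rewards fit and the second penalizes model size, so adding a component lowers the score only when the gain in log-likelihood exceeds the cost of its extra d + 1 parameters.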

