Hierarchical Mixtures of Experts and the EM Algorithm

We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
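
To make the tree structure concrete, the following is a minimal sketch of the forward pass of a two-level hierarchical mixture with softmax gating networks and linear experts. The function names (`softmax`, `hme_predict`), the parameter layout, and the numeric values are illustrative assumptions rather than anything taken from the paper, whose GLIM formulation is more general and whose EM fitting procedure is not shown here.

```python
import math

def softmax(scores):
    """Normalized exponential: maps real scores to probabilities summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def hme_predict(x, top_gate, low_gates, experts):
    """Two-level mixture of experts with a scalar output (hypothetical layout).

    top_gate:  list of weight vectors, one per branch i
    low_gates: low_gates[i] is a list of weight vectors, one per expert j in branch i
    experts:   experts[i][j] is the weight vector of linear expert (i, j)
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g_top):
        g_low = softmax([dot(v, x) for v in low_gates[i]])
        for j, g_ij in enumerate(g_low):
            # Each expert's prediction is weighted by the product of the
            # gating probabilities along its path from the root.
            y += g_i * g_ij * dot(experts[i][j], x)
    return y

# Illustrative parameters (assumed, not fitted).
x = [1.0, 0.5]
top_gate = [[0.2, -0.1], [-0.3, 0.4]]
low_gates = [[[0.1, 0.0], [0.0, 0.1]], [[0.5, -0.5], [-0.5, 0.5]]]
experts = [[[1.0, 2.0], [0.0, 1.0]], [[-1.0, 0.5], [2.0, 2.0]]]
y = hme_predict(x, top_gate, low_gates, experts)
```

Because the path weights are non-negative and sum to one, the output is a convex combination of the individual expert predictions.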

Smooth On-Line Learning Algorithms for Hidden Markov Models

Neural Computation ◽

10.1162/neco.1994.6.2.307 ◽

1994 ◽

Vol 6 (2) ◽

pp. 307-318 ◽

Cited By ~ 56

Author(s):

Pierre Baldi ◽

Yves Chauvin

Keyword(s):

Hidden Markov Models ◽

Expectation Maximization ◽

Markov Models ◽

Learning Algorithm ◽

Hidden Markov ◽

Batch Mode ◽

On Line ◽

Maximization Algorithms ◽

On Line Learning ◽

Entropy Functions

A simple learning algorithm for Hidden Markov Models (HMMs) is presented together with a number of variations. Unlike other classical algorithms such as the Baum-Welch algorithm, the algorithms described are smooth and can be used on-line (after each example presentation) or in batch mode, with or without the usual Viterbi most likely path approximation. The algorithms have simple expressions that result from using a normalized-exponential representation for the HMM parameters. All the algorithms presented are proved to be exact or approximate gradient optimization algorithms with respect to likelihood, log-likelihood, or cross-entropy functions, and as such are usually convergent. These algorithms can also be cast in the more general EM (Expectation-Maximization) framework where they can be viewed as exact or approximate GEM (Generalized Expectation-Maximization) algorithms. The mathematical properties of the algorithms are derived in the appendix.
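
A normalized-exponential (softmax) representation writes each probability in a row as a_j = exp(w_j) / sum_k exp(w_k), so unconstrained gradient steps on the weights w always yield a valid distribution. The sketch below illustrates this on a single row of transition probabilities, using fixed counts as a simplified, fully observed stand-in for the statistics an HMM would obtain from its hidden paths. The function names, counts, and learning rate are assumptions for illustration, not the paper's notation.

```python
import math

def softmax(scores):
    """Normalized exponential of a list of real-valued weights."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def update_row(w_row, counts, lr):
    """One gradient-ascent step on sum_j counts[j] * log a[j], with a = softmax(w_row).

    The gradient with respect to w_j is counts[j] - n * a[j], where n = sum(counts).
    """
    a = softmax(w_row)
    n = sum(counts)
    return [w + lr * (c - n * p) for w, c, p in zip(w_row, counts, a)]

w = [0.0, 0.0, 0.0]          # unconstrained weights for one row
counts = [6.0, 3.0, 1.0]     # assumed transition counts out of one state
for _ in range(500):
    w = update_row(w, counts, lr=0.05)
a = softmax(w)               # approaches the relative frequencies 0.6, 0.3, 0.1
```

No projection or renormalization step is needed after an update, which is what makes small on-line steps straightforward in this parameterization.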

On-line EM Algorithm for the Normalized Gaussian Network

Neural Computation ◽

10.1162/089976600300015853 ◽

2000 ◽

Vol 12 (2) ◽

pp. 407-432 ◽

Cited By ~ 182

Author(s):

Masa-aki Sato ◽

Shin Ishii

Keyword(s):

EM Algorithm ◽

Dynamic Environments ◽

Local Linear Regression ◽

Discount Factor ◽

Robot Dynamics ◽

Mixtures Of Experts ◽

On Line ◽

Dynamics Problems ◽

Changes Over Time ◽

Gaussian Network

A normalized Gaussian network (NGnet) (Moody & Darken, 1989) is a network of local linear regression units. The model softly partitions the input space by normalized Gaussian functions, and each local unit linearly approximates the output within the partition. In this article, we propose a new on-line EM algorithm for the NGnet, which is derived from the batch EM algorithm (Xu, Jordan, & Hinton, 1995) by introducing a discount factor. We show that the on-line EM algorithm is equivalent to the batch EM algorithm if a specific scheduling of the discount factor is employed. In addition, we show that the on-line EM algorithm can be considered as a stochastic approximation method to find the maximum likelihood estimator. A new regularization method is proposed in order to deal with a singular input distribution. In order to manage dynamic environments, where the input-output distribution of data changes over time, unit manipulation mechanisms such as unit production, unit deletion, and unit division are also introduced based on probabilistic interpretation. Experimental results show that our approach is suitable for function approximation problems in dynamic environments. We also apply our on-line EM algorithm to robot dynamics problems and compare our algorithm with the mixtures-of-experts family.
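
The effect of a discount factor can be illustrated with a discounted running average, shown below as a simplified sketch rather than the NGnet algorithm itself. The step-size recursion is an assumption chosen so that a discount of 1 reproduces the batch mean, while a discount below 1 down-weights older samples geometrically. The function name and the data are hypothetical.

```python
def discounted_mean(values, discounts):
    """Running weighted mean with a per-step discount factor in (0, 1].

    Step-size recursion (assumed form): eta_1 = 1 and
    eta_t = 1 / (1 + discount_t / eta_{t-1}).
    """
    mean, eta = 0.0, None
    for x, lam in zip(values, discounts):
        eta = 1.0 if eta is None else 1.0 / (1.0 + lam / eta)
        mean += eta * (x - mean)  # move the estimate toward the new sample
    return mean

data = [2.0, 4.0, 9.0, 1.0]
batch = sum(data) / len(data)

# Discount 1 at every step: eta_t = 1/t, so the result matches the batch mean.
no_forgetting = discounted_mean(data, [1.0] * len(data))

# Discount 0.5: a sample that is k steps old gets relative weight 0.5 ** k.
forgetting = discounted_mean(data, [0.5] * len(data))
```

In an on-line EM setting, averages of this kind would replace the batch sums of sufficient statistics, which is what allows the estimate to follow a distribution that changes over time.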

Unsupervised Learning in RSS-Based DFLT Using an EM Algorithm

Sensors ◽

10.3390/s21165549 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5549

Author(s):

Ossi Kaltiokallio ◽

Roland Hostettler ◽

Hüseyin Yiğitler ◽

Mikko Valkama

Keyword(s):

EM Algorithm ◽

Expectation Maximization ◽

Tracking System ◽

Model Parameters ◽

Calibration Data ◽

Tracking Accuracy ◽

Time Period ◽

The EM Algorithm ◽

Device Free ◽

Localization And Tracking

Received signal strength (RSS) changes of static wireless nodes can be used for device-free localization and tracking (DFLT). Most RSS-based DFLT systems require access to calibration data, either RSS measurements from a time period when the area was not occupied by people, or measurements while a person stands in known locations. Such calibration periods can be very expensive in terms of time and effort, making system deployment and maintenance challenging. This paper develops an Expectation-Maximization (EM) algorithm based on Gaussian smoothing for estimating the unknown RSS model parameters, liberating the system from supervised training and calibration periods. To fully use the EM algorithm’s potential, a novel localization-and-tracking system is presented to estimate a target’s arbitrary trajectory. To demonstrate the effectiveness of the proposed approach, it is shown that: (i) the system requires no calibration period; (ii) the EM algorithm improves the accuracy of existing DFLT methods; (iii) it is computationally very efficient; and (iv) the system outperforms a state-of-the-art adaptive DFLT system in terms of tracking accuracy.

Models and Algorithms for Tracking Target with Coordinated Turn Motion

Mathematical Problems in Engineering ◽

10.1155/2014/649276 ◽

2014 ◽

Vol 2014 ◽

pp. 1-10 ◽

Cited By ~ 8

Author(s):

Xianghui Yuan ◽

Feng Lian ◽

Chongzhao Han

Keyword(s):

EM Algorithm ◽

Expectation Maximization ◽

Multiple Models ◽

Interacting Multiple Model ◽

Multiple Model ◽

Motion Model ◽

Single Model ◽

Kinematic Constraint ◽

The EM Algorithm ◽

Turn Rate

Tracking a target with coordinated turn (CT) motion depends heavily on the chosen models and algorithms. First, the widely used models are compared in this paper: the coordinated turn (CT) model with known turn rate, the augmented coordinated turn (ACT) model with Cartesian velocity, the ACT model with polar velocity, the CT model using a kinematic constraint, and the maneuver-centered circular motion model. Then, in the single-model tracking framework, the tracking algorithms for the last four models are compared and suggestions on the choice of models for different practical target tracking problems are given. Finally, in the multiple models (MM) framework, an algorithm based on expectation maximization (EM) is derived, in both batch and recursive form. Compared with the widely used interacting multiple model (IMM) algorithm, the EM algorithm shows its effectiveness.
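
For the first model in the list, a CT model with known turn rate, the noise-free state propagation has a standard closed form. The sketch below assumes a state vector ordered as [x, vx, y, vy] and a nonzero turn rate. The function name, state ordering, and numeric values are illustrative assumptions and may differ from the conventions used in the paper.

```python
import math

def ct_step(state, omega, dt):
    """Propagate [x, vx, y, vy] through one step of a constant-turn-rate model.

    omega is the known, nonzero turn rate in rad/s; dt is the sample time in s.
    Process noise is omitted in this sketch.
    """
    x, vx, y, vy = state
    s, c = math.sin(omega * dt), math.cos(omega * dt)
    return [
        x + (s / omega) * vx - ((1.0 - c) / omega) * vy,
        c * vx - s * vy,           # velocity vector rotates by omega * dt
        y + ((1.0 - c) / omega) * vx + (s / omega) * vy,
        s * vx + c * vy,
    ]

state = [0.0, 10.0, 0.0, 0.0]      # start at the origin, moving along +x at 10 m/s
omega = 2.0 * math.pi / 60.0       # one full turn per 60 s (illustrative value)
for _ in range(60):
    state = ct_step(state, omega, dt=1.0)
# After one full turn the target is back at its starting state, up to rounding.
```

The speed is preserved at every step because the velocity update is a pure rotation. The augmented (ACT) variants instead treat the turn rate as an additional state to be estimated.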

Improved Initialization of the EM Algorithm for Mixture Model Parameter Estimation

Mathematics ◽

10.3390/math8030373 ◽

2020 ◽

Vol 8 (3) ◽

pp. 373

Author(s):

Branislav Panić ◽

Jernej Klemenc ◽

Marko Nagode

Keyword(s):

EM Algorithm ◽

Density Estimation ◽

Mixture Model ◽

Expectation Maximization ◽

State Of The Art ◽

R Package ◽

Likelihood Estimator ◽

Local Optima ◽

The EM Algorithm ◽

Initialization Algorithm

A commonly used tool for estimating the parameters of a mixture model is the Expectation–Maximization (EM) algorithm, which is an iterative procedure that can serve as a maximum-likelihood estimator. The EM algorithm has well-documented drawbacks, such as the need for good initial values and the possibility of being trapped in local optima. Nevertheless, because of its appealing properties, EM plays an important role in estimating the parameters of mixture models. To overcome these initialization problems with EM, in this paper, we propose the Rough-Enhanced-Bayes mixture estimation (REBMIX) algorithm as a more effective initialization algorithm. Three different strategies are derived for dealing with the unknown number of components in the mixture model. These strategies are thoroughly tested on artificial datasets, density–estimation datasets and image–segmentation problems and compared with state-of-the-art initialization methods for the EM. Our proposal shows promising results in terms of clustering and density-estimation performance as well as in terms of computational efficiency. All the improvements are implemented in the rebmix R package.
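
The iterative procedure referred to above can be sketched for a one-dimensional Gaussian mixture as follows. This is a plain EM loop with a hand-picked starting point, not the REBMIX initialization proposed in the paper. The data, starting values, and variance floor are assumptions added for illustration.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: posterior responsibility of each component for each point.
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate parameters from responsibility-weighted statistics.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mj = sum(r[j] * x for r, x in zip(resp, data)) / nj
        vj = sum(r[j] * (x - mj) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mj)
        new_v.append(max(vj, 1e-6))  # variance floor (an added safeguard)
    return new_w, new_m, new_v

# Illustrative data with two well-separated groups (assumed values).
data = [-2.1, -1.9, -2.4, -1.6, -2.0, 2.9, 3.2, 3.0, 2.7, 3.3]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])  # weights, means, variances
history = [log_likelihood(data, *params)]
for _ in range(30):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
```

The recorded log-likelihood never decreases from one iteration to the next, but the point it converges to depends on the starting values, which is why the choice of initialization matters.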

Doppler Velocity Estimation of Overlapping Linear-Period-Modulated Ultrasonic Waves Based on an Expectation-Maximization Algorithm

Advances in Acoustics and Vibration ◽

10.1155/2014/921876 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Natee Thong-un ◽

Minoru K. Kurosawa

Keyword(s):

EM Algorithm ◽

Expectation Maximization ◽

Pulse Compression ◽

Moving Objects ◽

Expectation Maximization Algorithm ◽

Ultrasonic Waves ◽

Velocity Estimation ◽

Doppler Velocity ◽

The EM Algorithm ◽

Linear Period

Overlapping signals are a significant problem in multiple-object localization. Doppler velocity is sensitive to the echo shape and can also be related to the physical properties of moving objects, especially for a pulse compression ultrasonic signal. The expectation-maximization (EM) algorithm is able to separate signals, so applying it to overlapping pulse compression signals is of interest. This paper describes a proposed method, based on the EM algorithm, of Doppler velocity estimation for overlapping linear-period-modulated (LPM) ultrasonic signals. Simulations are used to validate the proposed method.

On-line learning algorithm for recurrent neural networks using variational methods

Computer Standards & Interfaces ◽

10.1016/s0920-5489(99)90979-0 ◽

1999 ◽

Vol 20 (6-7) ◽

pp. 457

Author(s):

Won-Geun Oh ◽

Byung-Suhl Suh

Keyword(s):

Neural Networks ◽

Variational Methods ◽

Recurrent Neural Networks ◽

Learning Algorithm ◽

On Line ◽

On Line Learning

Theory and Practice of Expectation Maximization (EM) Algorithm

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch300 ◽

2011 ◽

pp. 1966-1973

Author(s):

Chandan K. Reddy ◽

Bala Rajaratnam

Keyword(s):

EM Algorithm ◽

Expectation Maximization ◽

Latent Variables ◽

Likelihood Function ◽

Optimal Solution ◽

Local Maximum ◽

Search Space ◽

The EM Algorithm ◽

Log Likelihood ◽

Estimation Problems

In the field of statistical data mining, the Expectation Maximization (EM) algorithm is one of the most popular methods used for solving parameter estimation problems in the maximum likelihood (ML) framework. Compared to traditional methods such as steepest descent, conjugate gradient, or Newton-Raphson, which are often too complicated to use in solving these problems, EM has become a popular method because it takes advantage of some problem-specific properties (Xu et al., 1996). The EM algorithm converges to a local maximum of the log-likelihood function under very general conditions (Dempster et al., 1977; Redner et al., 1984). Efficient maximization of the likelihood by augmenting it with latent variables, together with guarantees of convergence, are some of the important hallmarks of the EM algorithm. EM-based methods have been applied successfully to solve a wide range of problems that arise in the fields of pattern recognition, clustering, information retrieval, computer vision, bioinformatics (Reddy et al., 2006; Carson et al., 2002; Nigam et al., 2000), etc. Given an initial set of parameters, the EM algorithm can be implemented to compute parameter estimates that locally maximize the likelihood function of the data. In spite of its strong theoretical foundations, its wide applicability and important usage in solving some real-world problems, the standard EM algorithm suffers from certain fundamental drawbacks when used in practical settings. Some of the main difficulties of using the EM algorithm on a general log-likelihood surface are as follows (Reddy et al., 2008):

• The EM algorithm for mixture modeling converges to a local maximum of the log-likelihood function very quickly.
• There are many other promising local optimal solutions in the close vicinity of the solutions obtained from the methods that provide good initial guesses of the solution.
• Model selection criteria usually assume that the global optimal solution of the log-likelihood function can be obtained. However, achieving this is computationally intractable.
• Some regions in the search space do not contain any promising solutions. The promising and non-promising regions co-exist, and it becomes challenging to avoid wasting computational resources searching in non-promising regions.

Of all the concerns mentioned above, the fact that most of the local maxima are not distributed uniformly makes it important to develop algorithms that not only help in avoiding inefficient search over the low-likelihood regions but also emphasize the importance of exploring promising subspaces more thoroughly (Zhang et al., 2004). This subspace search will also be useful for making the solution less sensitive to the initial set of parameters. In this chapter, we will discuss the theoretical aspects of the EM algorithm and demonstrate its use in obtaining the optimal estimates of the parameters for mixture models. We will also discuss some of the practical concerns of using the EM algorithm and present a few results on the performance of various algorithms that try to address these problems.
