Detecting influential observations in a model-based cluster analysis

2016 ◽  
Vol 27 (2) ◽  
pp. 521-540 ◽  
Author(s):  
Liesbeth Bruckers ◽  
Geert Molenberghs ◽  
Geert Verbeke ◽  
Helena Geys

Finite mixture models have been used to model population heterogeneity and to relax distributional assumptions. These models are also convenient tools for clustering and classification of complex data such as repeated-measurements data. The performance of model-based clustering algorithms is sensitive to influential and outlying observations. Methods for identifying outliers in a finite mixture model have been described in the literature. Approaches to identify influential observations are less common. In this paper, we apply local-influence diagnostics to a finite mixture model with a known number of components. The methodology is illustrated on real-life data.

2007 ◽  
Vol 26 (5) ◽  
pp. 696-711 ◽  
Author(s):  
J. Tohka ◽  
E. Krestyannikov ◽  
I.D. Dinov ◽  
A.M. Graham ◽  
D.W. Shattuck ◽  
...  

2006 ◽  
Vol 9 (3) ◽  
pp. 412-423 ◽  
Author(s):  
Nathan A. Gillespie ◽  
Michael C. Neale

Approaches such as DeFries-Fulker extremes regression (LaBuda et al., 1986) are commonly used in genetically informative studies to assess whether familial resemblance varies as a function of the scores of pairs of twins. While useful for detecting such effects, formal modeling of differences in variance components as a function of pairs' trait scores is rarely attempted. We therefore present a finite mixture model which specifies that the population consists of latent groups which may differ in (i) their means, and (ii) the relative impact of genetic and environmental factors on within-group variation and covariation. This model may be considered as a special case of a factor mixture model, which combines the features of a latent class model with those of a latent trait model. Various models for the class membership of twin pairs may be employed, including additive genetic, common environment, specific environment or major locus (QTL) factors. Simulation results based on variance components derived from Turkheimer and colleagues (2003) illustrate the impact of factors such as the difference in group means and variance components on the feasibility of correctly estimating the parameters of the mixture model. Model-fitting analyses estimated group heritability as .49, which is significantly greater than heritability for the rest of the population in early childhood. These results suggest that factor mixture modeling is sufficiently robust for detecting heterogeneous populations even when group mean differences are modest.


2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Yi Zhou ◽  
Hongqing Zhu

Finite mixture models (FMMs) are increasingly used for unsupervised image segmentation. In this paper, a new finite mixture model based on a combination of generalized Gamma and Gaussian distributions using a trimmed likelihood estimator (GGMM-TLE) is proposed. GGMM-TLE combines the effectiveness of the Gaussian distribution with the asymmetric capability of the generalized Gamma distribution to provide superior flexibility for describing different shapes of observation data. Another advantage is that we consider the spatial information among neighbouring pixels by introducing a Markov random field (MRF); thus, the proposed mixture model remains sufficiently robust with respect to different types and levels of noise. Moreover, this paper presents a new component-based confidence level ordering trimmed likelihood estimator, with a simple form, allowing GGMM-TLE to estimate the parameters after discarding the outliers. Thus, the proposed algorithm can effectively eliminate the disturbance of outliers. Furthermore, the paper proves the identifiability of the proposed mixture model in theory to guarantee that the parameter estimation procedures are well defined. Finally, an expectation maximization (EM) algorithm is included to estimate the parameters of GGMM-TLE by maximizing the log-likelihood function. Experiments on multiple public datasets demonstrate that GGMM-TLE achieves superior performance compared with several existing methods in image segmentation tasks.
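
To make the idea of estimating mixture parameters after discarding outliers concrete, here is a minimal sketch of a generic trimmed-likelihood EM loop. It is not the GGMM-TLE method from the abstract: it assumes a one-dimensional, purely Gaussian mixture, uses no generalized Gamma components and no MRF prior, and trims by overall mixture density, not by the paper's component-based confidence level ordering. The function names (`normal_pdf`, `trimmed_em`), the quantile-based initialisation, and the default `trim_fraction` are all hypothetical choices made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def trimmed_em(data, k=2, trim_fraction=0.05, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, ignoring the least likely points.

    Illustrative assumption: at every iteration the points with the lowest
    mixture density under the current parameters are treated as outliers
    and excluded from the E- and M-steps.
    """
    n = len(data)
    n_keep = n - int(trim_fraction * n)
    srt = sorted(data)

    # Hypothetical initialisation: means at evenly spaced quantiles and a
    # robust common variance from the interquartile range.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    iqr = srt[(3 * n) // 4] - srt[n // 4]
    variances = [max((iqr / 1.349) ** 2, 1e-6)] * k
    weights = [1.0 / k] * k
    kept = list(range(n))

    for _ in range(n_iter):
        # Weighted component densities for every observation.
        joint = [
            [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            for x in data
        ]
        density = [sum(row) for row in joint]

        # Trimming step: keep the n_keep observations with highest density.
        kept = sorted(range(n), key=lambda i: density[i], reverse=True)[:n_keep]

        # E-step on the retained observations only.
        resp = [[c / max(density[i], 1e-300) for c in joint[i]] for i in kept]

        # M-step: weighted updates of mixing weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n_keep
            means[j] = sum(r[j] * data[i] for r, i in zip(resp, kept)) / nj
            variances[j] = max(
                sum(r[j] * (data[i] - means[j]) ** 2 for r, i in zip(resp, kept)) / nj,
                1e-6,
            )

    return weights, means, variances, sorted(kept)
```

Calling `trimmed_em(data)` on a list of floats returns the mixing weights, component means, component variances, and the indices of the observations that were retained in the final iteration; indices missing from that list are the ones the sketch treated as outliers.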

