Estimation and computations for Gaussian mixtures with uniform noise under separation constraints

Author(s):  
Pietro Coretto

In this paper we study a finite Gaussian mixture model with an additional uniform component whose role is to capture points in the tails of the data distribution. An adaptive constraint enforces a certain level of separation between the Gaussian mixture components and the uniform component, which represents noise and outliers in the tails. This makes the proposed tool particularly useful for robust estimation and outlier identification. A constrained ML estimator is introduced, for which existence and consistency are shown. One attractive feature of the methodology is that the noise level is estimated from the data. We also develop an EM-type algorithm with proven convergence. Based on numerical evidence, we show how the methods developed in this paper are useful for several fundamental data analysis tasks: outlier identification, robust location-scale estimation, clustering, and density estimation.
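
The combination of Gaussian components with a fixed uniform density lends itself to a compact EM sketch. The following is a minimal 1-D illustration of the mixture-plus-uniform-noise idea only; it omits the paper's separation constraint and adaptive noise level, and the names (`em_gmm_uniform`, etc.) are illustrative.

```python
import numpy as np
from scipy.stats import norm

def em_gmm_uniform(x, k=2, n_iter=200, seed=0):
    """EM for a 1-D Gaussian mixture plus a uniform noise component whose
    density is fixed on the observed data range. Plain sketch: omits the
    paper's separation constraint and adaptive noise level. x: 1-D floats."""
    rng = np.random.default_rng(seed)
    u_dens = 1.0 / (x.max() - x.min())        # fixed uniform noise density
    mu = rng.choice(x, size=k)                # random initial means
    sd = np.full(k, x.std())
    w = np.full(k + 1, 1.0 / (k + 1))         # last weight: noise proportion
    for _ in range(n_iter):
        # E-step: responsibilities over k Gaussians + the uniform component
        dens = np.column_stack(
            [norm.pdf(x, mu[j], sd[j]) for j in range(k)]
            + [np.full(len(x), u_dens)])
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weights, means, scales (the uniform density stays fixed)
        w = r.mean(axis=0)
        for j in range(k):
            nj = r[:, j].sum()
            mu[j] = (r[:, j] * x).sum() / nj
            sd[j] = np.sqrt((r[:, j] * (x - mu[j]) ** 2).sum() / nj)
    return w, mu, sd
```

Points whose largest responsibility lands on the uniform component are natural outlier candidates; this is the mechanism by which such a noise component supports outlier identification.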

2003, Vol. 15(2), pp. 469-485
Author(s):  
J. J. Verbeek ◽  
N. Vlassis ◽  
B. Kröse

This article concerns the greedy learning of Gaussian mixtures. In the greedy approach, mixture components are inserted into the mixture one after the other. We propose a heuristic for searching for the optimal component to insert. In a randomized manner, a set of candidate new components is generated. For each of these candidates, we find the locally optimal new component and insert it into the existing mixture. The resulting algorithm avoids the sensitivity to initialization of state-of-the-art methods such as expectation maximization, and has running time linear in the number of data points and quadratic in the (final) number of mixture components. Due to its greedy nature, the algorithm is particularly useful when the optimal number of mixture components is unknown. Experimental results comparing the proposed algorithm to other methods on density estimation and texture segmentation are provided.
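
The greedy insertion scheme is easy to prototype. Below is a simplified sketch of the general idea using scikit-learn's `GaussianMixture`; it refits the full mixture for each candidate rather than performing the paper's partial local search, and the candidate-generation heuristic is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def greedy_gmm(x, max_k=10, n_cand=20, seed=0):
    """Grow a Gaussian mixture one component at a time, trying several
    randomized candidate insertions and keeping the best-scoring fit.
    Simplified sketch: refits the full mixture per candidate instead of
    the paper's partial local search. Expects x with shape (n, d)."""
    rng = np.random.default_rng(seed)
    best = GaussianMixture(n_components=1).fit(x)
    for k in range(2, max_k + 1):
        best_ll, best_gm = -np.inf, None
        for _ in range(n_cand):
            # candidate mean placed at a randomly chosen data point
            means = np.vstack([best.means_, x[rng.integers(len(x))]])
            w = np.append(best.weights_ * (1 - 1 / k), 1 / k)
            gm = GaussianMixture(n_components=k, weights_init=w,
                                 means_init=means).fit(x)
            ll = gm.score(x)                    # mean log-likelihood
            if ll > best_ll:
                best_ll, best_gm = ll, gm
        best = best_gm
    return best
```

For 1-D data, pass `x.reshape(-1, 1)`, since scikit-learn expects a 2-D array.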


2006, Vol. 18(2), pp. 430-445
Author(s):  
Marc M. Van Hulle

We introduce a new unbiased metric for assessing the quality of density estimation based on Gaussian mixtures, called the differential log likelihood. As an application, we determine the optimal smoothness and the optimal number of kernels in Gaussian mixtures. Furthermore, we suggest a learning strategy for Gaussian mixture density estimation and compare its performance with log-likelihood maximization on a wide range of real-world data sets.
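
The differential log likelihood itself is not specified in this abstract, so the sketch below uses held-out log-likelihood, a standard stand-in, to pick the number of kernels; it illustrates the model-selection task, not the proposed metric.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def pick_n_kernels(x, ks=range(1, 11), seed=0):
    """Pick the number of Gaussian kernels by held-out log-likelihood.
    NOTE: a standard stand-in for model selection, not the article's
    differential log likelihood. Expects x with shape (n, d)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    train, test = x[idx[: len(x) // 2]], x[idx[len(x) // 2:]]
    scores = {k: GaussianMixture(n_components=k, random_state=0)
                 .fit(train).score(test) for k in ks}
    return max(scores, key=scores.get), scores
```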


Author(s):  
Patrik Puchert ◽  
Pedro Hermosilla ◽  
Tobias Ritschel ◽  
Timo Ropinski

Density estimation plays a crucial role in many data analysis tasks, as it infers a continuous probability density function (PDF) from discrete samples. It is therefore used in tasks as diverse as analyzing population data, spatial locations in 2D sensor readings, or reconstructing scenes from 3D scans. In this paper, we introduce a learned, data-driven deep density estimation (DDE) method to infer PDFs in an accurate and efficient manner, while being independent of domain dimensionality or sample size. Furthermore, we do not require access to the original PDF during estimation, neither in parametric form, nor as priors, nor in the form of many samples. This is enabled by training an unstructured convolutional neural network on an infinite stream of synthetic PDFs, as unbounded amounts of synthetic training data generalize better across a wide range of natural PDFs than any finite set of natural training data. We hope that our publicly available DDE method will be beneficial in many areas of data analysis where continuous models are to be estimated from discrete observations.
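
The training signal described here, an endless stream of synthetic PDFs, can be mimicked with random Gaussian mixtures. The generator below is a hypothetical sketch of such a stream (sample set, query points, ground-truth densities); the actual DDE training data and network are not reproduced.

```python
import numpy as np

def synthetic_pdf_stream(dim=2, seed=0):
    """Infinite stream of synthetic ground-truth PDFs (random isotropic
    Gaussian mixtures), each yielded as a sample set, query points, and
    the true densities at those queries. Hypothetical data layout."""
    rng = np.random.default_rng(seed)
    while True:
        k = rng.integers(1, 6)                       # random number of modes
        mu = rng.normal(size=(k, dim))
        sd = rng.uniform(0.1, 1.0, size=k)
        w = rng.dirichlet(np.ones(k))
        comp = rng.choice(k, size=512, p=w)          # draw a sample set
        x = mu[comp] + sd[comp, None] * rng.normal(size=(512, dim))
        q = rng.normal(size=(128, dim))              # query points
        # true density at queries: mixture of isotropic Gaussians
        d2 = ((q[:, None, :] - mu[None]) ** 2).sum(-1)
        dens = (w * np.exp(-0.5 * d2 / sd**2)
                / (2 * np.pi * sd**2) ** (dim / 2)).sum(-1)
        yield x, q, dens
```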


1981, Vol. 10(2), pp. 165-185
Author(s):  
Lawrence C. Hamilton

Exploratory data analysis (EDA) is used to study errors in self-reports of test scores and grades from a survey sample of college students. Both response and non-response are found to be systematically biased, with unfortunate effects in combination. Errors are not normally distributed and would be better modeled as contaminated distributions made up of two or more simple distributions. Errors are correlated with each other and with other variables, leading to spuriously inflated as well as deflated inter-variable correlations. These findings may be typical of survey data in general; hence, more realistic error models and robust estimation methods are desirable.
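
A contaminated error distribution of the kind suggested here is simple to simulate, and doing so shows why robust estimators matter. The parameters below are illustrative, not estimates from the article's survey data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Contaminated error model: 90% narrow "honest" errors, 10% from a wide
# tail (illustrative parameters, not estimates from the article's data).
clean = rng.random(n) < 0.9
err = np.where(clean, rng.normal(0, 1, n), rng.normal(0, 10, n))
print("mean / sd   :", err.mean(), err.std())        # inflated by the tail
mad = np.median(np.abs(err - np.median(err))) * 1.4826
print("median / MAD:", np.median(err), mad)          # barely affected
```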


Entropy, 2020, Vol. 22(2), p. 213
Author(s):  
Yiğit Uğur ◽  
George Arvanitakis ◽  
Abdellatif Zaidi

In this paper, we develop an unsupervised generative clustering framework that combines the variational information bottleneck and the Gaussian mixture model. Specifically, we use the variational information bottleneck method and model the latent space as a mixture of Gaussians. We derive a bound on the cost function of our model that generalizes the evidence lower bound (ELBO), and we provide a variational-inference-type algorithm for computing it. In the algorithm, the coders' mappings are parametrized using neural networks, and the bound is approximated by Markov sampling and optimized with stochastic gradient descent. Numerical results on real datasets are provided to support the efficiency of our method.
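
The bound's key departure from a standard VAE is that the latent prior is a Gaussian mixture, so its KL term has no closed form and is approximated by sampling. The sketch below shows a Monte-Carlo estimate of that KL in PyTorch; tensor shapes and names are assumptions, not the paper's notation.

```python
import math
import torch

def mc_kl_to_gmm(mu, logvar, pi, m, logs2, n_mc=8):
    """Monte-Carlo estimate of KL( N(mu, diag(exp(logvar))) || GMM prior ),
    the term that replaces the standard-normal KL in a VAE-style bound.
    Shapes: mu, logvar (B, D); pi (C,); m, logs2 (C, D). Names and shapes
    are assumptions, not the paper's notation."""
    std = (0.5 * logvar).exp()
    z = mu + std * torch.randn(n_mc, *mu.shape)    # reparametrized samples
    c = math.log(2 * math.pi)
    # log q(z|x): encoder's diagonal Gaussian at its own samples
    log_q = (-0.5 * (((z - mu) / std) ** 2 + logvar + c)).sum(-1)
    # log p(z): mixture of C diagonal Gaussians with weights pi
    zc = z.unsqueeze(-2)                           # (n_mc, B, 1, D)
    log_comp = (-0.5 * ((zc - m) ** 2 / logs2.exp() + logs2 + c)).sum(-1)
    log_p = torch.logsumexp(log_comp + pi.log(), dim=-1)
    return (log_q - log_p).mean(0)                 # per-example KL estimate
```

In a full objective this estimate would be added to a reconstruction term and minimized with stochastic gradient descent, in the spirit of the Markov-sampling approximation the abstract describes.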


Author(s):  
P. J. Green ◽  
A. H. Seheult ◽  
B. W. Silverman
