Kernel Density Estimation: Theory and Application in Discriminant Analysis

2016 ◽  
Vol 33 (3) ◽  
pp. 267-279 ◽  
Author(s):  
Thomas Ledl

Nowadays, one can find a huge set of methods to estimate the density function of a random variable nonparametrically. Since the first version of the most elementary nonparametric density estimator (the histogram), researchers have produced a vast number of ideas, especially on the issue of choosing the bandwidth parameter in a kernel density estimator model. Beyond purely descriptive use, the model seems quite suitable for application in discriminant analysis, where (multivariate) class densities are the basis for the assignment of a vector to a given class. This article gives insight into the most popular bandwidth parameter selectors as well as into the performance of the kernel density estimator as a classification method compared to classical linear and quadratic discriminant analysis. Both direct estimation in a multivariate space and an application of the concept to marginal normalizations of the single variables are taken into consideration. The report also points out the gap between theory and application.
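The classification idea described in this abstract (assign an observation to the class with the largest estimated class density, weighted by the class prior) can be sketched in a few lines. This is a minimal univariate illustration, not the article's implementation: the Gaussian kernel, the single shared bandwidth `h`, the toy data, and the function names `gaussian_kde` and `kde_classify` are all assumptions made for the example.

```python
import math

def gaussian_kde(sample, h):
    """Return x -> kernel density estimate (Gaussian kernel, bandwidth h)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density

def kde_classify(x, class_samples, h):
    """Assign x to the class maximizing (empirical prior) * (estimated class density)."""
    total = sum(len(s) for s in class_samples.values())
    scores = {
        label: (len(s) / total) * gaussian_kde(s, h)(x)
        for label, s in class_samples.items()
    }
    return max(scores, key=scores.get)

# Toy training data, made up for illustration only.
classes = {
    "a": [-2.1, -1.7, -2.4, -1.9, -2.0],
    "b": [1.8, 2.2, 2.0, 2.5, 1.6],
}
label_left = kde_classify(-1.5, classes, h=0.5)
label_right = kde_classify(2.1, classes, h=0.5)
print(label_left, label_right)
```

In practice each class would typically get its own bandwidth, and the multivariate case needs a product kernel or a bandwidth matrix; both are left out here to keep the sketch short.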

2011 ◽  
Vol 5 (2) ◽  
pp. 181-193 ◽  
Author(s):  
Qing Liu ◽  
David Pitt ◽  
Xibin Zhang ◽  
Xueyuan Wu

Abstract In this paper, we present a Markov chain Monte Carlo (MCMC) simulation algorithm for estimating parameters in the kernel density estimation of bivariate insurance claim data via transformations. Our data set consists of two types of auto insurance claim costs and exhibits a high level of skewness in the marginal empirical distributions. Therefore, the kernel density estimator based on original data does not perform well. However, the density of the original data can be estimated through estimating the density of the transformed data using kernels. It is well known that the performance of a kernel density estimator is mainly determined by the bandwidth, and only in a minor way by the kernel. In the current literature, there have been some developments in the area of estimating densities based on transformed data, where bandwidth selection usually depends on pre-determined transformation parameters. Moreover, in the bivariate situation, the transformation parameters were estimated for each dimension individually. We use a Bayesian sampling algorithm and present a Metropolis-Hastings sampling procedure to sample the bandwidth and transformation parameters from their posterior density. Our contribution is to estimate the bandwidths and transformation parameters simultaneously within a Metropolis-Hastings sampling procedure. Moreover, we demonstrate that the correlation between the two dimensions is better captured through the bivariate density estimator based on transformed data.
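The step "estimate the density of the original data through the density of the transformed data" rests on the change-of-variables formula: if y = T(x), then the density of x is the density of y at T(x) times |T'(x)|. The sketch below illustrates this in one dimension with a fixed shifted-log transform. The transform, the `shift` value, the bandwidth, and the toy claim costs are assumptions for illustration; the paper itself samples the transformation parameters and bandwidths with Metropolis-Hastings, which is not reproduced here.

```python
import math

def gaussian_kde(sample, h):
    """Return x -> kernel density estimate (Gaussian kernel, bandwidth h)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density

def transformed_kde(sample, h, shift=1.0):
    """KDE fitted to y = log(x + shift), mapped back to the x scale."""
    f_y = gaussian_kde([math.log(x + shift) for x in sample], h)

    def density(x):
        if x + shift <= 0.0:
            return 0.0  # outside the domain of the transform
        # Change of variables: f_X(x) = f_Y(T(x)) * |T'(x)|, with T'(x) = 1 / (x + shift).
        return f_y(math.log(x + shift)) / (x + shift)

    return density

# Right-skewed toy "claim costs", made up for illustration only.
claims = [0.2, 0.5, 1.0, 3.0, 12.0, 40.0]
density = transformed_kde(claims, h=0.4)
print(density(1.0), density(20.0))
```

Because the kernel smoothing happens on the log scale, the effective bandwidth on the original scale grows with x, which is what makes the approach attractive for skewed data.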


2018 ◽  
Vol 6 (332) ◽  
pp. 73-86
Author(s):  
Aleksandra Katarzyna Baszczyńska

Ad hoc methods for choosing the smoothing parameter in kernel density estimation, although often used in practice due to their simplicity and hence their computational efficiency, are characterized by fairly large errors. The value of the smoothing parameter chosen by Silverman's method is close to the optimal value only when the density function in the population is normal. Therefore, this method is mainly used at the initial stage of determining a kernel estimator and can serve only as a starting point for further exploration of the smoothing parameter value. This paper presents ad hoc methods for determining the smoothing parameter. Moreover, an interval of smoothing parameter values is proposed for the estimation of the kernel density function. Based on the results of simulation studies, the properties of smoothing parameter selection methods are discussed.
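As a concrete example of an ad hoc selector, the following sketch computes one commonly cited form of Silverman's rule of thumb for a Gaussian kernel, h = 0.9 · min(s, IQR/1.34) · n^(−1/5). Which variant the paper studies is not stated in the abstract, so this particular formula, the function name `silverman_bandwidth`, and the simulated sample are assumptions.

```python
import random
import statistics

def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # sample quartiles
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)

# Simulated standard normal sample: the setting where the rule is near optimal.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
h = silverman_bandwidth(sample)
print(h)
```

The constant is calibrated to a normal population, which is why the result is best treated as a starting value when the data are skewed or multimodal.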


2020 ◽  
Vol 13 (9) ◽  
pp. 205
Author(s):  
Timothy Fortune ◽  
Hailin Sang

In this paper, we estimate the Shannon entropy S(f)=−E[log(f(x))] of a one-sided linear process with probability density function f(x). We employ the integral estimator Sn(f), which utilizes the standard kernel density estimator fn(x) of f(x). We show that Sn(f) converges to S(f) almost surely and in L² under reasonable conditions.
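An integral estimator of this kind plugs the kernel density estimate into the entropy functional, giving minus the integral of fn(x) log fn(x). The sketch below approximates that integral numerically. It is an illustration under simplifying assumptions: the data are independent draws rather than a linear process, the kernel is Gaussian, and the bandwidth, grid size, and integration limits are chosen ad hoc.

```python
import math
import random

def gaussian_kde(sample, h):
    """Return x -> kernel density estimate (Gaussian kernel, bandwidth h)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density

def entropy_integral_estimate(sample, h, grid_size=400):
    """Midpoint-rule approximation of minus the integral of fn * log(fn)."""
    f_n = gaussian_kde(sample, h)
    lo = min(sample) - 4.0 * h  # ad hoc integration limits (assumption)
    hi = max(sample) + 4.0 * h
    dx = (hi - lo) / grid_size
    total = 0.0
    for k in range(grid_size):
        fx = f_n(lo + (k + 0.5) * dx)
        if fx > 0.0:  # treat 0 * log(0) as 0
            total -= fx * math.log(fx) * dx
    return total

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]
estimate = entropy_integral_estimate(sample, h=0.3)
reference = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1)
print(estimate, reference)
```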


2008 ◽  
Vol 23 (4) ◽  
pp. 575-595 ◽  
Author(s):  
Syd Peel ◽  
Laurence J. Wilson

Abstract Kernel density estimation is employed to fit smooth probabilistic models to precipitation forecasts of the Canadian ensemble prediction system. An intuitive nonparametric technique, kernel density estimation has become a powerful tool widely used in the approximation of probability density functions. The density estimators were constructed using the gamma kernels prescribed by S.-X. Chen, confined as they are to the nonnegative real axis, which constitutes the support of the random variable representing precipitation accumulation. Performance of kernel density estimators for several different smoothing bandwidths is compared with the discrete probabilistic model obtained as the fraction of member forecasts predicting the events, which for this study consisted of threshold exceedances. A propitious choice of the smoothing bandwidth yields smooth forecasts comparable, or sometimes superior, to the discrete probabilistic forecast, depending on the character of the raw ensemble forecasts. At the same time more realistic models of the probability density are achieved, particularly in the tail of the distribution, yielding forecasts that can be optimally calibrated for extreme events.
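The comparison described here, a smooth exceedance probability from a gamma-kernel density versus the fraction of ensemble members above a threshold, can be sketched as follows. The estimator used is the basic gamma-kernel form in which the density at x is the average of Gamma(shape = x/b + 1, scale = b) densities evaluated at the data. Whether the paper uses this form or a modified one is not stated in the abstract, and the member values, threshold, bandwidth `b`, and truncation point are made up for illustration.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density at t for shape >= 1, computed on the log scale."""
    if t < 0.0:
        return 0.0
    if t == 0.0:
        return 1.0 / scale if shape == 1.0 else 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_pdf)

def gamma_kernel_density(x, sample, b):
    """Gamma-kernel estimate at x >= 0 with smoothing bandwidth b."""
    shape = x / b + 1.0
    return sum(gamma_pdf(xi, shape, b) for xi in sample) / len(sample)

def smooth_exceedance(threshold, sample, b, steps=2000):
    """Trapezoid-rule integral of the density estimate above the threshold."""
    upper = 5.0 * max(sample)  # ad hoc truncation point (assumption)
    dx = (upper - threshold) / steps
    ys = [gamma_kernel_density(threshold + k * dx, sample, b) for k in range(steps + 1)]
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

# Hypothetical ensemble of precipitation accumulations in mm.
members = [0.5, 1.2, 2.0, 2.3, 3.1, 4.8, 5.5, 7.0, 9.4, 12.0]
threshold = 5.0
discrete_prob = sum(m > threshold for m in members) / len(members)
smooth_prob = smooth_exceedance(threshold, members, b=0.5)
print(discrete_prob, smooth_prob)
```

The kernel shape changes with x, so no probability mass is placed below zero. Note that this basic estimator does not integrate to exactly one in finite samples, so a practical implementation may want to renormalize.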


2021 ◽  
Vol 27 (1) ◽  
pp. 57-69
Author(s):  
Yasmina Ziane ◽  
Nabil Zougab ◽  
Smail Adjabi

Abstract In this paper, we consider procedures for deriving variable bandwidths in univariate kernel density estimation for nonnegative heavy-tailed (HT) data. These procedures use the Birnbaum–Saunders power-exponential (BS-PE) kernel estimator and a Bayesian approach to the adaptive bandwidths. We adapt an algorithm that subdivides the HT data set into two regions, a high-density region (HDR) and a low-density region (LDR), and we assign a bandwidth parameter to each region. The bandwidths are derived using a Markov chain Monte Carlo (MCMC) sampling algorithm. A series of simulation studies and real-data applications is carried out to evaluate the performance of the proposed procedure.


Author(s):  
Nicholas J. Cox

Density probability plots show two guesses at the density function of a continuous variable, given a data sample. The first guess is the density function of a specified distribution (e.g., normal, exponential, gamma, etc.) with appropriate parameter values plugged in. The second guess is the same density function evaluated at quantiles corresponding to plotting positions associated with the sample's order statistics. If the specified distribution fits well, the two guesses will be close. Such plots, suggested by Jones and Daly in 1995, are explained and discussed with examples from simulated and real data. Comparisons are made with histograms, kernel density estimation, and quantile–quantile plots.
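The two guesses can be computed directly once a distribution and a plotting-position rule are fixed. The sketch below uses a fitted normal distribution and plotting positions (i − 0.5)/n, and it evaluates the first guess at the observed order statistics. All three choices are assumptions for illustration, as are the function name `density_probability_pairs` and the simulated sample.

```python
import random
import statistics

def density_probability_pairs(sample):
    """Return (first guess, second guess) density pairs, one per order statistic."""
    n = len(sample)
    fitted = statistics.NormalDist(statistics.fmean(sample), statistics.stdev(sample))
    pairs = []
    for i, x in enumerate(sorted(sample), start=1):
        p = (i - 0.5) / n        # plotting position of the i-th order statistic
        q = fitted.inv_cdf(p)    # fitted quantile at that plotting position
        pairs.append((fitted.pdf(x), fitted.pdf(q)))
    return pairs

random.seed(2)
sample = [random.gauss(10.0, 2.0) for _ in range(200)]
pairs = density_probability_pairs(sample)
mean_gap = statistics.fmean([abs(a - b) for a, b in pairs])
print(mean_gap)
```

Plotting both columns against the sorted data gives the plot itself; a small average gap between the two columns suggests that the specified distribution fits well.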


2016 ◽  
Vol 37 (1) ◽  
Author(s):  
Eugenia Stoimenova

This paper is concerned with the nonparametric estimation of a density function when the data are incomplete due to interval censoring. The Nadaraya-Watson kernel density estimator is modified to allow description of such interval data. An interactive R application is developed to explore different estimates.

