Estimating the mean of a small sample under the two-parameter lognormal distribution

2018 · Vol 1 · pp. 100 · Author(s): Peter Hingley

Lognormally distributed variables are found in biological, economic, and other systems. Here the sampling distributions of the maximum likelihood estimates (MLE) of the parameters are developed for lognormally distributed data when estimation is carried out either under the correct lognormal model or under a mis-specified normal model. This is intended as an aid to experimental design when a small sample is drawn under the assumption that the population follows a normal distribution while in fact it follows a lognormal distribution. Distributions are derived analytically as far as possible, using a technique for estimator densities, and are confirmed by simulations. For an independently and identically distributed lognormal sample estimated under a normal model, the distribution of the MLE of the mean differs from that of the MLE of the lognormal mean. This distribution is not known in closed form but can be approximated well enough by another lognormal. An analytic method for the distribution of the mis-specified normal variance uses computational convolution for a sample of size 2. The expected value of the mis-specified normal variance is also derived, as a way to quantify the effect of the model misspecification on inferences about the mean. The results are demonstrated on an example with a population distribution abstracted from a survey.
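A minimal simulation sketch of the comparison the abstract describes, with illustrative parameters and sample size that are assumptions rather than values from the paper: the sample average is the MLE of the mean under the mis-specified normal model, while exp(mu-hat + sigma-hat^2/2) is the MLE of the mean under the correct lognormal model.

```python
import numpy as np

# Assumed lognormal parameters and sample size (illustrative only)
mu, sigma, n, reps = 0.0, 1.0, 10, 100_000
rng = np.random.default_rng(1)

true_mean = np.exp(mu + sigma**2 / 2)      # lognormal population mean
x = rng.lognormal(mu, sigma, size=(reps, n))

# Mis-specified normal model: the MLE of the mean is the sample average
normal_mle = x.mean(axis=1)

# Correct lognormal model: MLE of E[X] = exp(mu_hat + sigma_hat^2 / 2),
# with mu_hat, sigma_hat^2 the MLEs computed from the log data
logx = np.log(x)
lognormal_mle = np.exp(logx.mean(axis=1) + logx.var(axis=1) / 2)

print("true mean:", true_mean)
print("normal-model MLE:    mean", normal_mle.mean(), "sd", normal_mle.std())
print("lognormal-model MLE: mean", lognormal_mle.mean(), "sd", lognormal_mle.std())
```

Comparing the two empirical sampling distributions for small n illustrates the difference in shape and spread that the paper derives analytically.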

2019 · Vol 4 (7) · pp. 130-142 · Author(s): Qetevan Pipia

Due to economic, social, political, and other differences, different strata of society follow different distribution laws, among them the Pareto, normal, and lognormal distributions. Notably, the upper, richer stratum of a society is most often described by a Pareto distribution. For the poor and middle classes, early attempts modelled their incomes with a normal distribution, but it later turned out that a lognormal distribution gives more accurate results. The article attempts to model the distribution of the upper strata of the population of Georgia, in terms of per capita GDP consumption (according to the World Bank), with a Pareto distribution. For the other strata, because the required data are missing from GeoStat, a lognormal model is fitted to the population's declared incomes obtained from the Revenue Service of the Ministry of Finance of Georgia, on the assumption that these data correlate with the distribution of the population by GDP consumption.
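A hedged sketch of the two fits the article combines, using synthetic incomes since the Georgian data are not reproduced here; the tail cutoff at the 95th percentile is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic incomes standing in for the declared-income data (illustrative only)
incomes = rng.lognormal(mean=8.0, sigma=1.0, size=10_000)

# Lognormal MLE for the body of the distribution (poor and middle strata)
mu_hat = np.log(incomes).mean()
sigma_hat = np.log(incomes).std()

# Pareto tail for the upper stratum: MLE of the exponent alpha above a
# threshold x_min (the Hill estimator)
x_min = np.quantile(incomes, 0.95)           # assumed tail cutoff
tail = incomes[incomes >= x_min]
alpha_hat = len(tail) / np.sum(np.log(tail / x_min))

print(f"lognormal body: mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
print(f"Pareto tail above {x_min:.0f}: alpha = {alpha_hat:.2f}")
```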


2016 · Vol 27 (4) · pp. 1001-1023 · Author(s): Guo-Liang Tian, Chi Zhang, Xuejun Jiang

The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case–control study with missing data under the missing-at-random assumption, by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case–control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of the parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case–control sampling distribution with two existing sampling distributions reveal large differences. Simulations are conducted to investigate the influence of the three sampling distributions on statistical inferences. One finding is that the conclusion of the Wald test for independence under the two existing sampling distributions can be completely different from (even contradictory to) that of the Wald test for equality of the success probabilities in the control and case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.
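For orientation, a minimal sketch of the standard Wald test for equality of success probabilities in the case and control groups, which is the hypothesis the abstract contrasts across sampling distributions; the 2×2 counts are hypothetical, and this is the textbook test rather than the paper's case–control sampling distribution:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 2x2 counts: rows = case/control, columns = success/failure
n11, n10 = 40, 60    # cases:    successes, failures (assumed numbers)
n01, n00 = 25, 75    # controls: successes, failures

p1, n1 = n11 / (n11 + n10), n11 + n10
p0, n0 = n01 / (n01 + n00), n01 + n00

# Wald statistic for H0: p1 = p0, with an unpooled variance estimate
se = np.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
z = (p1 - p0) / se
p_value = 2 * norm.sf(abs(z))
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```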


2009 · Vol 59 (5) · Author(s): Viktor Witkovský, Gejza Wimmer

Abstract We consider the problem of making statistical inference about the mean of a normal distribution based on a random sample of quantized (digitized) observations. This problem arises, for example, in a measurement process with errors drawn from a normal distribution and with a measurement device or process of known resolution, such as an analog-to-digital converter or another digital instrument. In this paper we investigate the effect of quantization on subsequent statistical inference about the true mean. If the standard deviation of the measurement error is large with respect to the resolution of the indicating measurement device, the effect of quantization (digitization) diminishes and standard statistical inference remains valid. Hence, in this paper we consider situations where the standard deviation of the measurement error is relatively small. By Monte Carlo simulations we compare the small-sample properties of interval estimators of the mean based on the standard approach (i.e. ignoring the fact that the measurements have been quantized) with some recently suggested methods, including interval estimators based on the maximum likelihood approach and the fiducial approach. The paper extends the original study by Hannig et al. (2007).
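A sketch of the maximum likelihood idea for quantized data, assuming a uniform quantizer of step q: each reading y only reveals that the true value fell in the bin [y − q/2, y + q/2], so the likelihood uses bin probabilities rather than the normal density. The parameter values below are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

q = 0.1                                       # assumed instrument resolution
rng = np.random.default_rng(2)
x = rng.normal(10.0, 0.03, size=20)           # sigma small relative to q
y = np.round(x / q) * q                       # quantized (rounded) readings

def neg_loglik(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    # Probability that the true value falls in the bin reported as y
    p = norm.cdf((y + q / 2 - mu) / sigma) - norm.cdf((y - q / 2 - mu) / sigma)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))

res = minimize(neg_loglik, x0=[y.mean(), np.log(max(y.std(), q / 4))],
               method="Nelder-Mead")
print("quantized-likelihood MLE of the mean:", res.x[0])
print("naive sample mean of the readings:   ", y.mean())
```

When sigma is large relative to q, the two estimates nearly coincide, which mirrors the paper's observation that the quantization effect then diminishes.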


2021 · Vol 2084 (1) · pp. 012006 · Author(s): Nurul Hafizah Azizan, Zamalia Mahmud, Adzhar Rambli

Abstract The focus of this article is to evaluate the performance of maximum likelihood estimation (MLE) in estimating the person parameters of the Rasch rating scale model (RRSM). For that purpose, 1000 iterations of a Markov Chain Monte Carlo (MCMC) simulation were performed across different sample sizes and numbers of items. The performance of MLE in estimating the person parameters at each sample size was compared through accuracy and bias measures. Root mean square error (RMSE) and mean absolute error (MAE) were used to examine the accuracy of the estimates, while bias was assessed through the mean difference between the estimates and the true values of the person parameters. The simulated survey data sets in this study were generated according to the RRSM under the assumption that normality was satisfied. Results from the simulation analysis showed that smaller sample sizes tend to produce higher RMSE and MAE than larger ones. In addition, the maximum likelihood estimates of the person parameters for smaller sample sizes also recorded a higher mean difference between the person estimates and their true values. These findings imply that using the MLE approach with small sample sizes yields less accurate and more biased person estimates across all numbers of items.
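A small sketch of the three evaluation measures the article uses (RMSE, MAE, and the mean difference as a bias measure), applied to simulated person parameters; the simulated values are illustrative stand-ins, not the article's RRSM output:

```python
import numpy as np

def accuracy_measures(estimates, true_values):
    """RMSE, MAE, and mean bias of person-parameter estimates."""
    err = np.asarray(estimates) - np.asarray(true_values)
    rmse = np.sqrt(np.mean(err**2))
    mae = np.mean(np.abs(err))
    bias = np.mean(err)          # mean difference, as used in the article
    return rmse, mae, bias

# Illustrative use with simulated values (not the article's data)
rng = np.random.default_rng(3)
theta_true = rng.normal(0, 1, size=100)
theta_hat = theta_true + rng.normal(0, 0.4, size=100)  # noisier, small-n case
print(accuracy_measures(theta_hat, theta_true))
```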


Entropy · 2021 · Vol 23 (2) · pp. 186 · Author(s): Xinyi Zeng, Wenhao Gui

In this paper, the parameter estimation problem for a truncated normal distribution is discussed based on generalized progressive hybrid censored data. The maximum likelihood estimates of the unknown quantities are first derived through the Newton–Raphson algorithm and the expectation-maximization algorithm. Based on the asymptotic normality of the maximum likelihood estimators, we develop asymptotic confidence intervals. The percentile bootstrap method is also employed for the case of small sample sizes. Further, Bayes estimates are evaluated under various loss functions, such as the squared error, general entropy, and LINEX loss functions. The Tierney–Kadane approximation, as well as the importance sampling approach, is applied to obtain the Bayesian estimates under proper prior distributions, and the associated Bayesian credible intervals are constructed. Extensive numerical simulations are implemented to compare the performance of the different estimation methods. Finally, a real example is analyzed to illustrate the inference approaches.
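A simplified sketch of the likelihood setup for the truncated normal model, using a complete (uncensored) sample and a generic numerical optimizer in place of the paper's Newton–Raphson/EM treatment of censored data; the truncation point and parameter values are assumptions for illustration:

```python
import numpy as np
from scipy.stats import truncnorm
from scipy.optimize import minimize

a_trunc = 0.0                     # assumed left-truncation point
rng = np.random.default_rng(4)

mu_true, sigma_true = 1.0, 2.0
alpha = (a_trunc - mu_true) / sigma_true
x = truncnorm.rvs(alpha, np.inf, loc=mu_true, scale=sigma_true,
                  size=200, random_state=rng)

def neg_loglik(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)     # log-parametrization keeps sigma > 0
    a = (a_trunc - mu) / sigma    # standardized truncation bound
    return -np.sum(truncnorm.logpdf(x, a, np.inf, loc=mu, scale=sigma))

res = minimize(neg_loglik, x0=[x.mean(), np.log(x.std())], method="BFGS")
print("MLE: mu =", res.x[0], " sigma =", np.exp(res.x[1]))
```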


2018 · Vol 934 (4) · pp. 59-62 · Author(s): V.I. Salnikov

The question of calculating the limiting values of residuals in geodetic constructions is considered for the case when the limiting value for measurement errors is taken equal to 3m, i.e. Δpred = 3m, where m is the mean square error of a measurement; larger errors are rejected. At present, the limiting value for the residual is calculated by the formula 3m√n, where n is the number of measurements. The article draws attention to two contradictions between theory and practice arising from the use of this formula. First, the formula is derived from the classical Gaussian normal law, yet it is applied to a truncated normal distribution. Second, as shown in [1], when Δpred = 2m the sums of errors naturally reach the value Δpred, after which the accumulation of errors in the sum starts anew. This article establishes the same behaviour for Δpred = 3m. A table comparing the currently valid tolerances with the recommended, more stringent ones is given, together with a graph of the applied and recommended tolerances for Δpred = 3m.
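For concreteness, a tiny numeric illustration of the current tolerance rule 3m√n, with an assumed mean square error m (the value is illustrative, not from the article):

```python
import math

m = 0.005     # assumed mean square error of a single measurement, in metres
for n in (1, 4, 9, 16):
    print(f"n = {n:2d}: residual tolerance 3*m*sqrt(n) = {3 * m * math.sqrt(n):.4f} m")
```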


2021 · pp. 001316442110203 · Author(s): Lucia Guastadisegni, Silvia Cagnone, Irini Moustaki, Vassilis Vasdekis

This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test for detecting measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approaches, the generalized Lagrange multiplier test, and the generalized jackknife score test. The two model misspecifications are local dependence among items and a nonnormal distribution of the latent variable. The power of the tests is computed in two ways: empirically, through Monte Carlo simulation, and asymptotically, using the asymptotic distribution of each test under the alternative hypothesis. The performance of the tests is evaluated by means of a simulation study. The results highlight that, under mild model misspecification, all tests perform well, while under strong model misspecification their performance deteriorates, especially the false positive rates under local dependence and the power for small sample sizes under misspecification of the latent variable distribution. In general, the Lagrange multiplier test computed with the Hessian approach and the generalized Lagrange multiplier test perform better in terms of false positive rates, while the Lagrange multiplier test computed with the cross-product approach has the highest power for small sample sizes. The asymptotic power turns out to be a good alternative to the classic empirical power because it is less time consuming. The Lagrange tests studied here are also applied to a real data set.
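A minimal sketch of the asymptotic-power idea the abstract refers to, not the article's specific LM tests: under the alternative, a score-type statistic is approximately noncentral chi-square, so power is a tail probability of that distribution. The degrees of freedom and noncentrality values below are assumed for illustration.

```python
from scipy.stats import chi2, ncx2

df = 1                             # assumed degrees of freedom of the test
alpha = 0.05
crit = chi2.ppf(1 - alpha, df)     # critical value under the null

for lam in (0.5, 1.0, 2.0, 5.0):   # assumed noncentrality parameters
    power = ncx2.sf(crit, df, lam) # P(statistic > critical | H1)
    print(f"lambda = {lam}: asymptotic power = {power:.3f}")
```

This is why the asymptotic route is cheap: one tail-probability evaluation replaces thousands of Monte Carlo replications.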

