Ratio and regression estimation

2019 ◽  
pp. 104-139
Author(s):  
David G. Hankin ◽  
Michael S. Mohr ◽  
Ken B. Newman

Values of an auxiliary variable, x, are often inexpensive to obtain or readily available at little or no cost. If this variable is highly correlated with the target variable, y, then use of ratio or regression estimators may greatly reduce sampling variance. These estimators are not unbiased, but bias is generally small relative to the target of estimation and contributes a very small proportion of overall mean square error, the relevant measure of accuracy for biased estimators. Ratio estimation can also be incorporated in the context of stratified designs, again possibly offering a reduction in overall sampling variance. Model-based prediction offers an alternative to the design-based ratio and regression estimators, and we present an overview of this approach. In model-based prediction, the y values associated with population units are viewed as realizations of random variables which are assumed to be related to auxiliary variables according to specified models. The realized values of the target variable are known for the sample, but for the non-sampled units in the population they must be predicted using an assumed model dependency on the auxiliary variable. Insights from model-based thinking may assist the design-based sampling theorist in selection of an appropriate estimator. Similarly, we show that insights from design-based estimation may improve estimation of uncertainty in model-based mark-recapture estimation.
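
As a rough illustration of the variance reduction described above, the following sketch enumerates every simple random sample of size 2 from a small population and compares the mean-per-unit estimator with the ratio estimator of the population mean, (ȳ/x̄)·X̄. The population values, the sample size and all function names are hypothetical choices made for this example; they are not taken from the chapter.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy population (an assumption for illustration):
# y is roughly proportional to x, the setting where ratio estimation helps.
x_pop = [2.0, 4.0, 6.0, 8.0, 10.0]
y_pop = [4.1, 7.9, 12.2, 15.8, 20.0]
n = 2
x_mean, y_mean = mean(x_pop), mean(y_pop)


def ratio_estimator(y_s, x_s, x_pop_mean):
    """Ratio estimator of the population mean of y, using the known mean of x."""
    return mean(y_s) / mean(x_s) * x_pop_mean


def mse(estimates, target):
    """Mean square error over an equally likely set of estimates."""
    return mean((e - target) ** 2 for e in estimates)


# Enumerate the full sample space of without-replacement samples of size n.
mpu_estimates, ratio_estimates = [], []
for idx in combinations(range(len(y_pop)), n):
    y_s = [y_pop[i] for i in idx]
    x_s = [x_pop[i] for i in idx]
    mpu_estimates.append(mean(y_s))
    ratio_estimates.append(ratio_estimator(y_s, x_s, x_mean))

bias_ratio = mean(ratio_estimates) - y_mean
mse_mpu = mse(mpu_estimates, y_mean)
mse_ratio = mse(ratio_estimates, y_mean)

print(f"ratio estimator bias: {bias_ratio:.4f}")
print(f"MSE mean-per-unit:    {mse_mpu:.4f}")
print(f"MSE ratio estimator:  {mse_ratio:.4f}")
```

For this particular population the ratio estimator has a small nonzero bias but a much smaller mean square error than the mean-per-unit estimator; with a weakly correlated auxiliary variable the comparison could go the other way.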

2019 ◽  
pp. 23-47
Author(s):  
David G. Hankin ◽  
Michael S. Mohr ◽  
Ken B. Newman

This chapter presents a formal quantitative treatment of material covered conceptually in Chapter 2, all with respect to equal probability selection with replacement (SWR) and without replacement (simple random sampling, SRS) of samples of size n from a finite population of size N. Small sample space examples are used to illustrate unbiasedness of mean-per-unit estimators of the mean, total and proportion of the target variable, y, for SWR and SRS. Explicit formulas for sampling variance indicate how estimator uncertainty depends on finite population variance, sample size and sampling fraction. Measures of the relative performance of alternative sampling strategies (relative precision, relative efficiency, net relative efficiency) are introduced and applied to mean-per-unit estimators used for the SWR and SRS selection methods. Normality of the sampling distribution of the SRS mean-per-unit estimator depends on sample size but also on the shape of the distribution of the target variable, y, values over the finite population units. Normality of the sampling distribution is required to justify valid 95% confidence intervals constructed around sample estimates using unbiased estimates of sampling variance. Methods to calculate sample size to achieve accuracy objectives are presented. Additional topics include Bernoulli sampling (a without replacement selection scheme for which sample size is a random variable), the Rao–Blackwell theorem (which allows improvement of estimators that are based on selection methods which may result in repeated selection of the same units), oversampling and nonresponse.
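
A small sample space example of the kind mentioned above can be sketched as follows. The code enumerates all SRS samples and all ordered SWR samples of size 2 from a four-unit population, then checks the enumerated sampling variance of the sample mean against the standard formulas (1 − n/N)·S²/n for SRS and σ²/n for SWR. The population values and variable names are assumptions made for this sketch, not the chapter's own example.

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance

# Hypothetical four-unit population (values chosen only for illustration).
y_pop = [1.0, 3.0, 4.0, 8.0]
N, n = len(y_pop), 2
y_mean = mean(y_pop)

# Sample spaces: unordered samples without replacement (SRS) and
# ordered samples with replacement (SWR); each is equally likely.
srs_means = [mean(s) for s in combinations(y_pop, n)]
swr_means = [mean(s) for s in product(y_pop, repeat=n)]

S2 = variance(y_pop)       # finite population variance, divisor N - 1
sigma2 = pvariance(y_pop)  # finite population variance, divisor N

# Textbook sampling variances of the sample mean.
var_srs_formula = (1 - n / N) * S2 / n
var_swr_formula = sigma2 / n

# The same quantities obtained by brute-force enumeration.
var_srs_enum = pvariance(srs_means)
var_swr_enum = pvariance(swr_means)

# One common definition of relative efficiency: ratio of sampling variances.
rel_efficiency = var_swr_formula / var_srs_formula

print(mean(srs_means), mean(swr_means), y_mean)
print(var_srs_enum, var_srs_formula)
print(var_swr_enum, var_swr_formula)
print(rel_efficiency)
```

Both enumerated means equal the population mean, which is what unbiasedness of the mean-per-unit estimator means in the design-based sense, and the SRS variance is smaller than the SWR variance because of the finite population correction.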


Forests ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 800 ◽  
Author(s):  
Kangas ◽  
Räty ◽  
Korhonen ◽  
Vauhkonen ◽  
Packalen

Forest information is needed at global, national and local scales. This review aimed to provide insights into the potential of national forest inventories (NFIs), as well as the challenges they face, in catering to those needs. Within NFIs, the authors address the methodological challenges introduced by the multitude of scales at which forest data are needed, and the challenges in acknowledging errors due to measurements and models in addition to sampling errors. Between NFIs, the challenges related to the different harmonization tasks were reviewed. While a design-based approach is often considered more attractive than a model-based approach because it is guaranteed to provide unbiased results, the model-based approach is needed for downscaling the information to smaller scales and for acknowledging measurement and model errors. However, while model-based inference is possible in small areas, the unknown random effects introduce biased estimators. The NFIs need to cater to national information requirements and maintain existing time series, while at the same time providing comparable information across countries. In upscaling NFI information to continental and global information needs, representative samples across the area are of utmost importance. Without representative data, model-based approaches provide forest information with unknown and indeterminable biases. Both design-based and model-based approaches need to be applied to cater to all information needs. This must be accomplished in a comprehensive way. In particular, a need for standardized quality requirements has been identified, acknowledging the possibility of bias and its implications, for all data used in policy making.


Author(s):  
Waqar Hafeez ◽  
Javid Shabbir ◽  
Muhammad Taqi Shah ◽  
Shakeel Ahmed

Researchers always appreciate estimators of finite population quantities, especially the mean, that achieve maximum efficiency in reaching valid statistical inference. Apart from ratio, product and regression estimators, exponential estimators are widely considered by survey statisticians. Motivated by the idea of exponential-type estimators, in this article we propose some new estimators utilizing the known median of the study variable together with the mean of the auxiliary variable. Theoretical properties of the suggested estimators are studied up to the first order of approximation. In addition, empirical and simulation studies comparing the proposed median-based class of estimators with the sample mean, ratio and linear regression estimators are discussed. The results show that the proposed estimators are more efficient than the existing estimators.


2021 ◽  
Vol 7 (1) ◽  
pp. 1035-1057
Author(s):  
Muhammad Nauman Akram ◽  
Muhammad Amin ◽  
Ahmed Elhassanein ◽  
Muhammad Aman Ullah ◽  
...  

The beta regression model (BRM) has become a popular tool for assessing the relationships among chemical characteristics. In the BRM, when the explanatory variables are highly correlated, the maximum likelihood estimator (MLE) does not provide reliable results. So, in this study, we propose a new modified beta ridge-type (MBRT) estimator for the BRM to reduce the effect of multicollinearity and improve the estimation. Initially, we show analytically that the new estimator outperforms the MLE as well as two other well-known biased estimators, i.e., the beta ridge regression estimator (BRRE) and the beta Liu estimator (BLE), using the matrix mean squared error (MMSE) and mean squared error (MSE) criteria. The performance of the MBRT estimator is assessed using a simulation study and an empirical application. Findings demonstrate that our proposed MBRT estimator outperforms the MLE, BRRE and BLE in fitting the BRM with correlated explanatory variables.


2019 ◽  
pp. 140-172
Author(s):  
David G. Hankin ◽  
Michael S. Mohr ◽  
Ken B. Newman

Equal probability selection is a special case of the general theory of probability sampling in which population units may be selected with unequal probabilities. Unequal selection probabilities are often based on auxiliary variable values which are measures of the sizes of population units, thus leading to the acronym PPS ("Probability Proportional to Size"). The Horvitz–Thompson (1952) theorem provides a unifying framework for design-based sampling theory. A sampling design specifies the sample space (set of all possible samples) and associated first and second order inclusion probabilities (probabilities that unit i, or units i and j, respectively, are included in a sample of size n selected from N according to some selection method). A valid probability sampling scheme must have all first order inclusion probabilities > 0 (i.e., every population unit must have a chance of being in the sample). Unbiased variance estimation is possible only for those schemes that guarantee that all second order inclusion probabilities exceed zero, thus providing theoretical justification for the absence of unbiased estimators of sampling variance in systematic sampling and other schemes for which some second order inclusion probabilities are zero. Numerous generalized Horvitz–Thompson (HT) estimators can be formed and all are consistent estimators because they are functions of consistent HT estimators. Unequal probability systematic sampling and Poisson sampling (the unequal probability counterpart to Bernoulli sampling for which sample size is a random variable) are also considered. Several R programs for selecting unequal probability samples and for calculating first and second order inclusion probabilities are posted at http://global.oup.com/uk/companion/hankin.
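
The Horvitz–Thompson estimator of a total, the sum of y_i/π_i over sampled units, can be illustrated with Poisson sampling, where each unit is included independently with probability π_i. The sketch below is written in Python rather than R and is not one of the posted programs; the population values, the expected sample size and all names are assumptions made for the example. It enumerates every possible sample to check unbiasedness and compares the enumerated variance with the Poisson sampling variance formula, the sum of (1 − π_i)·y_i²/π_i.

```python
from itertools import product
from math import prod

# Hypothetical population: x is a size measure, y is the target variable.
x_pop = [1.0, 2.0, 3.0, 4.0]
y_pop = [3.0, 5.0, 10.0, 11.0]
expected_n = 2  # expected (not fixed) sample size under Poisson sampling

# First order inclusion probabilities proportional to size; all lie in (0, 1].
pi = [expected_n * x / sum(x_pop) for x in x_pop]


def ht_total(sample_idx):
    """Horvitz-Thompson estimator of the population total of y."""
    return sum(y_pop[i] / pi[i] for i in sample_idx)


# Enumerate all 2^N possible samples with their selection probabilities.
expected_ht = 0.0
second_moment = 0.0
for indicators in product([0, 1], repeat=len(y_pop)):
    p_sample = prod(pi[i] if z else 1 - pi[i] for i, z in enumerate(indicators))
    sample_idx = [i for i, z in enumerate(indicators) if z]
    estimate = ht_total(sample_idx)  # the empty sample yields 0
    expected_ht += p_sample * estimate
    second_moment += p_sample * estimate ** 2

var_enum = second_moment - expected_ht ** 2
# Under Poisson sampling, inclusions are independent, so pi_ij = pi_i * pi_j.
var_formula = sum((1 - p) / p * y ** 2 for p, y in zip(pi, y_pop))

print(expected_ht, sum(y_pop))
print(var_enum, var_formula)
```

Because inclusions are independent under Poisson sampling, every second order inclusion probability is positive, so this design satisfies the condition for unbiased variance estimation stated above.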


2020 ◽  
Author(s):  
Teng Zhang ◽  
Zhongjing Wang ◽  
Zixiong Zhang

Runoff forecasting with high precision is important for the efficient utilization of water resources and regional sustainable development, especially in arid areas. The monthly runoff of Changmabao (CMB) station has an upward trend and an abrupt change point in 1998. The impact factor analysis shows that it is highly correlated with the current precipitation and temperature in the wet season, and with the previous runoff and previous global land temperature in the dry season. Three models, including the time-series decomposition model, the model based on teleconnection coupled with the support vector machine, and the model based on teleconnection coupled with the artificial neural network, are used to predict the runoff of CMB station. An indicator β is constructed from the correlation coefficient (R) and mean relative deviation (rBias) to evaluate model performance more conveniently and intuitively. The results suggest that the model based on teleconnection coupled with the support vector machine performs best. This forecasting method could be applied to the management and dispatch of water resources in arid areas.


2019 ◽  
Vol 11 (1) ◽  
pp. 15-22
Author(s):  
S. Kumar ◽  
B. V. S. Sisodia

In the present paper, a model-based calibration estimator of the population total has been developed for the case in which the study variable y and the auxiliary variable x are inversely related. The performance of the proposed model-based calibration estimator relative to the model-based estimator, the usual regression estimator and the calibration-based regression estimator has been examined by conducting a limited simulation study. The results of the simulation study show that the model-based calibration estimator outperformed the other estimators. However, the calibration-based regression estimator was found to be close to the model-based calibration estimator.


2014 ◽  
Vol 44 (8) ◽  
pp. 892-902 ◽  
Author(s):  
Piermaria Corona ◽  
Gherardo Chirici ◽  
Sara Franceschi ◽  
Daniela Maffei ◽  
Marzia Marcheselli ◽  
...  

Nonresponse is often a problem in forest inventories. It may arise when sample plots are inaccessible because of hazardous terrain. To account for this problem, the use of nonresponse calibration weighting is considered in a complete design-based framework, i.e., both nonresponse and survey variables are viewed as fixed characteristics of the plots. Information derived from remotely sensed data is exploited to compensate for the missing plots. Calibration is performed adopting canopy height from airborne laser scanning as an auxiliary variable. Conditions for approximate unbiasedness of the calibration estimator in two-phase inventories are derived, and some estimators of the sampling variance are proposed. Results from one-phase inventories are achieved as a particular case. Dummy variables are introduced in the presence of different forest types. Monte Carlo results support the validity of the procedure. An application to a forest survey carried out in Central Italy is performed.


Author(s):  
Hani M. Samawi ◽  
Ahmed Y.A. Al-Samarraie ◽  
Obaid M. Al-Saidy

Regression is used to estimate the population mean of the response variable, Y, in the two cases where the population mean of the concomitant (auxiliary) variable, X, is known and where it is unknown. In the latter case, a double sampling method is used to estimate the population mean of the concomitant variable. We investigate the performance of the two methods using extreme ranked set sampling (ERSS), as discussed by Samawi et al. (1996). Theoretical and Monte Carlo evaluation results as well as an illustration using actual data are presented. The results show that if the underlying joint distribution of Y and X is symmetric, then using ERSS to obtain regression estimates is more efficient than using ranked set sampling (RSS) or simple random sampling (SRS).
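
The two cases described above can be sketched with the usual linear regression estimator of the mean, ȳ + b·(μ_X − x̄), where b is the sample least-squares slope and μ_X is either the known population mean of X or, under double sampling, the mean of X in a larger first-phase sample. The sketch below uses arbitrary fixed index sets rather than ERSS or RSS, and the population, the index sets and the function name are all hypothetical.

```python
from statistics import mean


def regression_estimate_mean(y_s, x_s, x_mean):
    """Linear regression estimator of the mean of Y.

    x_mean is the known population mean of X, or an estimate of it
    from a larger first-phase sample (double sampling).
    """
    x_bar, y_bar = mean(x_s), mean(y_s)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_s, y_s))
    s_xx = sum((x - x_bar) ** 2 for x in x_s)
    slope = s_xy / s_xx  # least-squares slope from the sample
    return y_bar + slope * (x_mean - x_bar)


# Hypothetical population in which Y is exactly linear in X.
x_pop = [1.0, 2.0, 4.0, 7.0, 9.0, 13.0]
y_pop = [2.0 + 3.0 * x for x in x_pop]

# Arbitrary illustrative index sets (not an ERSS or RSS selection).
second_phase_idx = [0, 1, 3]
first_phase_idx = [0, 1, 3, 4, 5]

x_s = [x_pop[i] for i in second_phase_idx]
y_s = [y_pop[i] for i in second_phase_idx]

# Case 1: population mean of X is known.
est_known = regression_estimate_mean(y_s, x_s, mean(x_pop))

# Case 2: mean of X is estimated from the first-phase sample.
x_first_phase_mean = mean(x_pop[i] for i in first_phase_idx)
est_double = regression_estimate_mean(y_s, x_s, x_first_phase_mean)

print(est_known, mean(y_pop))
print(est_double)
```

With an exactly linear relationship the known-mean estimator recovers the population mean of Y from any sample with at least two distinct x values, while the double sampling version inherits the error in the first-phase estimate of the mean of X.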


2014 ◽  
Vol 44 (1) ◽  
pp. 33-46
Author(s):  
Jehad Al-Jararha ◽  
Ala' Bataineh

The estimation of the population total $t_y$, using one or more auxiliary variables, and of the population ratio $\theta_{xy}=t_y/t_x$, where $t_x$ is the population total for the auxiliary variable $X$, for a finite population are heavily discussed in the literature. In this paper, the idea of estimating the finite population ratio $\theta_{xy}$ is extended to exploit the availability of an auxiliary variable $Z$ in the study, where this auxiliary variable is not used in the definition of the population ratio. This idea may be supported by the fact that the variable $Z$ is more highly correlated with the variable of interest $Y$ than the variable $X$ is. The availability of such an auxiliary variable can be used to improve the precision of the estimation of the population ratio. To our knowledge, this idea has not been discussed in the literature. The bias, variance and mean squared error are given for our approach. In a simulation from a real data set, the empirical relative bias and the empirical relative mean squared error are computed for our approach and for different estimators proposed in the literature for estimating the population ratio $\theta_{xy}$. The analytical and simulation results show that, with suitable choices, our approach gives negligible bias and smaller mean squared error.

