Unequal probability sampling

2019 ◽  
pp. 140-172
Author(s):  
David G. Hankin ◽  
Michael S. Mohr ◽  
Ken B. Newman

Equal probability selection is a special case of the general theory of probability sampling in which population units may be selected with unequal probabilities. Unequal selection probabilities are often based on auxiliary variable values which are measures of the sizes of population units, hence the acronym PPS (“Probability Proportional to Size”). The Horvitz–Thompson (1952) theorem provides a unifying framework for design-based sampling theory. A sampling design specifies the sample space (the set of all possible samples) and the associated first and second order inclusion probabilities (the probabilities that unit i, or units i and j, respectively, are included in a sample of size n selected from N according to some selection method). A valid probability sampling scheme must have all first order inclusion probabilities > 0 (i.e., every population unit must have a chance of being in the sample). Unbiased variance estimation is possible only for those schemes that guarantee that all second order inclusion probabilities exceed zero. This provides the theoretical justification for the absence of unbiased estimators of sampling variance in systematic sampling and other schemes for which some second order inclusion probabilities are zero. Numerous generalized Horvitz–Thompson (HT) estimators can be formed, and all are consistent estimators because they are functions of consistent HT estimators. Unequal probability systematic sampling and Poisson sampling (the unequal probability counterpart to Bernoulli sampling, for which sample size is a random variable) are also considered. Several R programs for selecting unequal probability samples and for calculating first and second order inclusion probabilities are posted at http://global.oup.com/uk/companion/hankin.
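As a rough illustration of the Horvitz–Thompson estimator of a population total under Poisson sampling, the following Python sketch may help. It is not one of the R programs mentioned in the abstract. The function names, the toy population, and the choice of first order inclusion probabilities proportional to size (pi_i = n * x_i / sum of x) are assumptions made for this example only. Because the population is tiny, the expectation of the estimator can be computed exactly by enumerating all 2^N possible Poisson samples.

```python
from itertools import product


def pps_inclusion_probs(sizes, n):
    """First order inclusion probabilities pi_i = n * x_i / sum(x).

    Assumes no unit is so large that its probability would exceed 1.
    """
    total = sum(sizes)
    probs = [n * x / total for x in sizes]
    if any(p > 1 for p in probs):
        raise ValueError("a unit is too large: inclusion probability > 1")
    return probs


def ht_total(y, probs, sample):
    """Horvitz-Thompson estimate of the total: sum of y_i / pi_i over the sample."""
    return sum(y[i] / probs[i] for i in sample)


def poisson_expected_ht(y, probs):
    """Exact expectation of the HT estimator over all 2^N Poisson samples.

    Under Poisson sampling each unit is included independently with
    probability pi_i, so a sample's probability is a simple product.
    """
    expected = 0.0
    for flags in product([0, 1], repeat=len(y)):
        p_sample = 1.0
        for included, p in zip(flags, probs):
            p_sample *= p if included else (1.0 - p)
        sample = [i for i, included in enumerate(flags) if included]
        expected += p_sample * ht_total(y, probs, sample)
    return expected


# Hypothetical toy population: auxiliary sizes x and target values y.
sizes = [10, 20, 30, 40]
y = [12.0, 19.0, 33.0, 41.0]
probs = pps_inclusion_probs(sizes, n=2)  # expected sample size is 2
expected_ht = poisson_expected_ht(y, probs)
true_total = sum(y)
```

In this sketch `expected_ht` agrees with `true_total` up to floating point error, which is the design-unbiasedness property of the HT estimator. Under Poisson sampling the second order inclusion probabilities are the products pi_i * pi_j, so they are all positive whenever the first order probabilities are.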

2012 ◽  
Vol 6 (0) ◽  
pp. 1477-1489 ◽  
Author(s):  
Anton Grafström ◽  
Lionel Qualité ◽  
Yves Tillé ◽  
Alina Matei

2019 ◽  
pp. 48-67
Author(s):  
David G. Hankin ◽  
Michael S. Mohr ◽  
Ken B. Newman

In many contexts it is difficult or impossible to select a simple random sample (SRS). For example, the number of units in the finite population, N, may not be known in advance, or it may not be feasible to assign labels to all units in the population and to select an SRS from these labels (e.g., crabs within boxes on a fishing vessel). Instead, one may select a random start, r, on the integers 1 through k and then select that unit and every kth unit thereafter for inclusion in the sample. This selection method, called linear systematic sampling, results in an extremely restricted randomization: there are only k possible linear systematic samples, compared to the typically large number, N!/[(N-n)! n!], of possible samples of size n that can be selected from N by SRS. If units are in random order, then linear systematic sampling with mean-per-unit estimation will have sampling variance comparable to SRS with mean-per-unit estimation. But if there is a trend of increase or decrease in unit-specific y value with unit label or location, then sampling variance of a mean-per-unit estimator for a linear systematic design may be substantially less than for an SRS design. Circular and fractional interval systematic sampling designs are also presented. The disadvantage of these systematic sampling designs is that the highly restricted randomizations generally rule out unbiased estimation of sampling variance from a single systematic sample. Several approaches for variance estimation are considered.
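The comparison between linear systematic sampling and SRS in the presence of a trend can be checked on a small example. The Python sketch below is illustrative only: the function names and the toy population (y equal to the unit label, N = 12, k = 4) are assumptions, and it covers only the simple case where N is an exact multiple of k, so that every systematic sample has the same size n = N/k. It enumerates the k possible systematic samples to get the exact sampling variance of the mean-per-unit estimator and compares it with the standard SRS variance (1 - n/N) * S^2 / n.

```python
def linear_systematic_samples(N, k):
    """Return the k possible samples as lists of 0-based unit indices.

    Assumes N is an exact multiple of k, so each sample has N // k units.
    """
    if N % k != 0:
        raise ValueError("this sketch assumes N is a multiple of k")
    return [list(range(r, N, k)) for r in range(k)]


def systematic_variance_of_mean(y, k):
    """Exact sampling variance of the sample mean over the k equally likely starts."""
    samples = linear_systematic_samples(len(y), k)
    means = [sum(y[i] for i in s) / len(s) for s in samples]
    grand_mean = sum(means) / k  # equals the population mean when N = n * k
    return sum((m - grand_mean) ** 2 for m in means) / k


def srs_variance_of_mean(y, n):
    """Sampling variance of the sample mean under SRS without replacement."""
    N = len(y)
    mu = sum(y) / N
    s2 = sum((v - mu) ** 2 for v in y) / (N - 1)  # finite population variance S^2
    return (1 - n / N) * s2 / n


# Hypothetical population with a strict linear trend in y with unit label.
y_trend = [float(label) for label in range(1, 13)]  # N = 12
k = 4
n = len(y_trend) // k  # n = 3

sys_var = systematic_variance_of_mean(y_trend, k)  # about 1.25
srs_var = srs_variance_of_mean(y_trend, n)         # about 3.25
```

For this trended population the systematic design has the smaller sampling variance. Note also that any two units falling in different systematic samples can never appear together, so their second order inclusion probability is zero, which is why a single systematic sample does not support unbiased variance estimation.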

