Sampling in Difficult Settings: A Simulation Study Comparing Several Sampling Methods

2020 ◽  
Author(s):  
Harry Shannon ◽  
Patrick D. Emond ◽  
Benjamin M. Bolker ◽  
Román Viveros-Aguilera

Abstract Background: Taking a representative sample to determine prevalence of variables like disease is difficult when little is known about the target population. Several methods have been proposed, including a recent revision of the World Health Organization’s Expanded Programme on Immunization (EPI) surveys. The original method uses probability proportional to size to sample towns and a nearest neighbour approach to sampling households within towns. The new version samples from relatively small areas and conducts a probability sample of households within those areas. Other techniques sample within towns from circles around randomly identified points (‘Circle’) or from randomly sampled squares in a superimposed grid (‘Square’). We compared these sampling methods in multiple virtual populations using computer simulation. Methods: We constructed 50 virtual populations with varying characteristics. Populations comprised about a million people across 300 towns. We created three more populations with different prevalences of disease but with uniform characteristics across each population. We created a binary exposure variable and allocated disease statuses to individuals assuming different relative risks (RRs) of exposure. We simulated thirteen methods of sampling: simple random sampling; the original EPI method and variants; the Square and Circle methods; and the new EPI method. For each population, each sampling method, and each of three sample sizes per cluster (7, 15, and 30), we simulated 1,000 samples. For most sampling methods, the clusters were towns. We conducted simulations using the same 30 clusters and using a freshly chosen set of clusters. For each simulation we estimated prevalence and RRs and computed the Root Mean Squared Error for the 1,000 samples. Results: The Circle and Square methods produced almost identical results, so we report only the Square method results.
The Root Mean Squared Error for the Square method was almost universally best relative to simple random sampling for estimating prevalence, and generally best when estimating relative risks. The revised EPI approach performed worse, but generally better than the original EPI. Conclusions: The Square method is recommended as statistically optimal, unless practical considerations favour another approach.
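
The first stage of the original EPI design, sampling towns with probability proportional to size, can be illustrated with a minimal sketch. It uses systematic selection along the cumulative population total, which is one common way to do this; the abstract does not say which variant the authors simulated. The function name `pps_systematic` and the town populations are hypothetical, and only the figures of 300 towns and 30 clusters come from the abstract.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_clusters, rng):
    """Pick n_clusters town indices with probability proportional to size.

    Uses systematic selection along the cumulative population total. A town
    larger than the sampling interval can be picked more than once.
    """
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n_clusters
    start = rng.random() * interval  # random start in [0, interval)
    chosen, idx = [], 0
    for k in range(n_clusters):
        point = start + k * interval
        # Advance to the town whose cumulative range contains this point.
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        chosen.append(idx)
    return chosen


# Hypothetical town sizes: 300 towns as in the abstract, made-up populations.
rng = random.Random(7)
town_sizes = [rng.randint(500, 10000) for _ in range(300)]
selected = pps_systematic(town_sizes, 30, rng)
print(selected)
```

Larger towns cover a longer stretch of the cumulative total, so they are more likely to contain one of the evenly spaced selection points.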

2021 ◽  
Author(s):  
Harry S. Shannon ◽  
Patrick D. Emond ◽  
Benjamin M. Bolker ◽  
Román Viveros-Aguilera

Abstract Background: Taking a representative sample to determine prevalence of variables like disease is difficult when little is known about the target population. Several methods have been proposed, including a recent revision of the World Health Organization’s Expanded Programme on Immunization (EPI) surveys. The original EPI method samples towns as Primary Sampling Units (PSUs) with probability proportional to size and uses a nearest neighbour approach to sample households within PSUs. The new version samples from smaller PSUs and conducts a probability sample of households within those PSUs. Other techniques use satellite images and Global Positioning Systems to sample within towns from circles around randomly identified points (‘Circle’ method) or from randomly sampled squares in a superimposed grid (‘Square’ method). We compared these sampling methods in multiple virtual populations using computer simulation. Methods: We constructed 50 virtual populations with varying characteristics. Populations comprised about a million people across 300 towns. We created three populations with different prevalences of disease but with uniform characteristics across each population. We created a binary exposure variable and allocated disease statuses to individuals assuming different relative risks (RRs) of exposure. We simulated thirteen methods of sampling: simple random sampling; the original EPI method and variants; the Square and Circle methods; and the new EPI method. For each population, each sampling method, and each of three sample sizes per PSU (7, 15, and 30), we simulated 1,000 samples. For most sampling methods, the PSUs were towns. We conducted simulations using the same 30 PSUs and using a freshly chosen set of PSUs.
For each simulation we estimated prevalence and RRs and combined the bias and variance of the 1,000 samples to compute the Root Mean Squared Error (RMSE). Results: The Circle and Square methods produced almost identical results, so we report only the Square method results. Apart from simple random sampling, the RMSE for the Square method was almost universally best for estimating prevalence, and generally best when estimating relative risks. The revised EPI approach was worse, but generally better than the original EPI. Conclusions: The Square method is recommended as statistically optimal, unless practical considerations favour another approach.
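
The step of combining bias and variance into an RMSE rests on the identity RMSE² = bias² + variance. The sketch below checks it numerically on simulated prevalence estimates. It is illustrative only: the helper `rmse_components`, the true prevalence of 0.20, and the use of unclustered simple random samples of 210 people (30 PSUs × 7) are assumptions, not the authors' code or settings.

```python
import math
import random


def rmse_components(estimates, true_value):
    """Return (bias, variance, rmse) of repeated estimates of one true value."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    # Variance across the simulated samples (divisor n, so the identity is exact).
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, variance, rmse


# Hypothetical setup: true prevalence 0.20, 1,000 simple random samples of
# 210 people each (30 PSUs x 7 per PSU, drawn here without any clustering).
rng = random.Random(42)
TRUE_PREVALENCE = 0.20
estimates = [
    sum(rng.random() < TRUE_PREVALENCE for _ in range(210)) / 210
    for _ in range(1000)
]

bias, variance, rmse = rmse_components(estimates, TRUE_PREVALENCE)
# RMSE^2 equals bias^2 + variance, up to floating-point rounding.
assert math.isclose(rmse ** 2, bias ** 2 + variance, rel_tol=1e-9)
print(f"bias={bias:.4f} variance={variance:.6f} rmse={rmse:.4f}")
```

Because the RMSE penalises both systematic error and sampling noise, it allows a biased but precise design to be compared on one scale with an unbiased but noisy one.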


2022 ◽  
pp. 62-85
Author(s):  
Carlos N. Bouza-Herrera ◽  
Jose M. Sautto ◽  
Khalid Ul Islam Rather

This chapter introduces the basic elements of stratified simple random sampling (SSRS) and ranked set sampling (RSS). It first extends the results of Singh et al. to sampling from a stratified population, with SRS used to select the samples independently from the strata, and derives the mean squared error (MSE). It then extends the results of Singh et al. to the RSS design, using them to develop estimators for a stratified population in which RSS is used to draw the samples independently from the strata. The bias and MSE of the developed estimators are derived. The biases and MSEs obtained under the SRS and RSS designs are compared. Under mild conditions, the comparisons show that each RSS model is better than its SRS alternative.


2014 ◽  
Vol 1 ◽  
pp. 15-21
Author(s):  
H.S. Jhajj ◽  
Kusam Lata

Using auxiliary information, a family of difference-cum-exponential type estimators for estimating the population variance of the variable under study has been proposed under a double sampling design. Expressions for the bias, the mean squared error, and its minimum value have been obtained. Comparisons have been made with the regression-type estimator, using simple random sampling at both occasions in the double sampling design. It has also been shown that the proposed family yields estimators that are more efficient than the linear regression type estimator. The results have also been illustrated numerically as well as graphically.


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0246947
Author(s):  
Sohail Ahmad ◽  
Muhammad Arslan ◽  
Aamna Khan ◽  
Javid Shabbir

In this paper, we propose a generalized class of exponential type estimators for estimating the finite population mean using two auxiliary attributes under simple random sampling and stratified random sampling. The bias and mean squared error (MSE) of the proposed class of estimators are derived up to the first order of approximation. Both an empirical study and theoretical comparisons are discussed. Four populations are used to support the theoretical findings. It is observed that the proposed class of estimators performs better than all other estimators considered under simple and stratified random sampling.


2017 ◽  
Vol 1 ◽  
pp. 1-14
Author(s):  
Subramani Jambulingam ◽  
Ajith S. Master

Introduction: In sampling theory, different procedures are used to obtain an efficient estimator of the population mean. When no auxiliary variable is available, the commonly used method is simple random sampling without replacement. Other methods use auxiliary information on the study characteristics. If the auxiliary variable is correlated with the study variable, a number of estimators are widely available in the literature. Objective: This study develops a new ratio-cum-product estimator of the population mean of the study variable, using the known median of the auxiliary variable, under simple random sampling. Materials and Methods: The bias and mean squared error of the proposed estimator are derived and compared with those of existing estimators, both analytically and numerically. Results: The proposed estimator has smaller bias and mean squared error than the existing estimators. In the numerical study, on some known natural populations, the bias of the proposed estimator is approximately zero, the mean squared error ranged from 6.83 to 66429.21, and the percentage relative efficiencies ranged from 103.65 to 2858.75. Conclusion: The proposed estimator under optimum conditions is almost unbiased and performs better than all other existing estimators. Nepalese Journal of Statistics, 2017, Vol. 1, 1-14


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Muhammad Irfan ◽  
Maria Javed ◽  
Sandile C. Shongwe ◽  
Muhammad Zohaib ◽  
Sajjad Haider Bhatti

In this paper, a generalized class of estimators for the population median is proposed under simple random sampling without replacement (SRSWOR), using robust measures of the auxiliary variable. Three robust measures of an auxiliary variable are used: the decile mean, the Hodges–Lehmann estimator, and the trimean. Mathematical properties of the proposed estimators, such as the bias, mean squared error (MSE), and minimum MSE, are derived up to the first order of approximation. We considered various real-life datasets and a simulation study to assess the performance of the proposed estimators against their competitors. Robustness is also examined using a real dataset. Based on these results, researchers are encouraged to use the proposed estimators for the population median under SRSWOR.


2021 ◽  
Author(s):  
Nefel Tellioglu ◽  
Rebecca H. Chisholm ◽  
Jodie McVernon ◽  
Nicholas Geard ◽  
Patricia T. Campbell

Background Estimating scabies prevalence in communities is crucial for identifying the communities with high scabies prevalence and guiding interventions. There is no standardisation of sampling strategies to estimate scabies prevalence in communities, and a wide range of sample sizes and methods have been used. The World Health Organization recommends household sampling or, as an alternative, school sampling to estimate community-level prevalence. Due to varying prevalence across populations, there is a need to understand how sampling strategies for estimating scabies prevalence interact with scabies epidemiology to affect the accuracy of prevalence estimates. Methods We used a simulation-based approach to compare the efficacy of different sampling methods and sizes. First, we generated synthetic populations with the characteristics of Australian Indigenous communities and then assigned a scabies status to individuals to achieve a specified prevalence using different assumptions about scabies epidemiology (random, age-specific, household-specific, or age-and-household-specific transmission). Second, we calculated an observed prevalence for different sampling methods (household-based, school-based, or random sampling) and sizes. Results The distribution of prevalence in population groups can vary substantially when the underlying scabies assignment method changes. For example, age-specific scabies assignment increases the prevalence among children as well as the prevalence in larger households. Household-specific assignment approaches introduce higher variance in prevalence among households. Across all of the scabies assignment methods combined, the simple random sampling method produces the narrowest 95% confidence interval for all sampling percentages. The household sampling method introduces higher variance compared to simple random sampling when the assignment of scabies includes a household-specific component.
The school sampling method overestimates community prevalence when the assignment of scabies includes an age-specific component. Discussion Our results indicate that there are interactions between transmission assumptions and surveillance strategies, emphasizing the need for understanding scabies transmission dynamics. We suggest using the simple random sampling method for estimating scabies prevalence. Our approach can be adapted to various populations and diseases.
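
A toy simulation makes the school-sampling result concrete: when prevalence is higher among children, a sample drawn only from children overestimates community prevalence. This is a rough sketch, not the authors' model. It treats school sampling as sampling children only and ignores household structure, and every number in it (population size, child fraction, age-specific prevalences, sample size) is hypothetical.

```python
import random


def make_population(n_people, child_fraction, prev_child, prev_adult, rng):
    """Build a list of (is_child, has_scabies) pairs with age-specific prevalence."""
    people = []
    for _ in range(n_people):
        is_child = rng.random() < child_fraction
        p = prev_child if is_child else prev_adult
        people.append((is_child, rng.random() < p))
    return people


def prevalence(sample):
    """Proportion of sampled individuals with scabies."""
    return sum(case for _, case in sample) / len(sample)


# Hypothetical community: 5,000 people, 35% children,
# prevalence 30% among children and 10% among adults.
rng = random.Random(1)
pop = make_population(5000, 0.35, 0.30, 0.10, rng)
true_prev = prevalence(pop)

children = [person for person in pop if person[0]]
srs_est = prevalence(rng.sample(pop, 500))          # simple random sample
school_est = prevalence(rng.sample(children, 500))  # children only

print(f"true={true_prev:.3f} srs={srs_est:.3f} school={school_est:.3f}")
```

Under these assumptions the school-based estimate tracks the prevalence among children rather than the community as a whole, while the simple random sample targets the community value.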


2012 ◽  
Vol 61 (2) ◽  
pp. 277-290 ◽  
Author(s):  
Ádám Csorba ◽  
Vince Láng ◽  
László Fenyvesi ◽  
Erika Michéli

There is a growing demand for the development and application of technologies and methods that enable rapid, cost-effective, and environmentally friendly soil data collection and evaluation. Reflectance spectroscopy meets these demands; it is based on reflectance measurements in the visible (VIS) and near-infrared (NIR) range (350–2500 nm) of the electromagnetic spectrum. Because the reflectance spectrum recorded from soils is very rich in information, and many soil constituents have a characteristic spectral “fingerprint” in the studied range, a large number of key soil parameters can be determined simultaneously from a single curve. In this paper we present the first steps of a methodological development, built on the foundations of reflectance spectroscopy, aimed at determining soil composition. We built and tested predictive models, based on multivariate mathematical-statistical methods (partial least squares regression, PLSR), for estimating the organic carbon and CaCO3 content of soils. In testing the models, we found that the procedure yielded high R² values for both soil parameters [R² (organic carbon) = 0.815; R² (CaCO3) = 0.907]. The root mean squared error (RMSE), which indicates the accuracy of the estimate, can be considered moderate for both parameters [RMSE (organic carbon) = 0.467; RMSE (CaCO3) = 3.508], and could be improved considerably by standardizing the reflectance measurement protocols. Based on our investigations, we conclude that combining reflectance spectroscopy with multivariate chemometric procedures provides a rapid and cost-effective method for data collection and evaluation.


2021 ◽  
pp. 1-21
Author(s):  
Elsa Arrua-Duarte ◽  
Marta Migoya-Borja ◽  
Igor Barahona ◽  
Lena C. Quilty ◽  
Sakina J. Rizvi ◽  
...  

Abstract Objective: The Dimensional Anhedonia Rating Scale (DARS) is a recently validated questionnaire for assessing anhedonia. In this work we aim to study the equivalence between the traditional paper-and-pencil format and the digital format of the DARS. Methods: Sixty-nine patients completed the DARS in both paper-based and digital versions. We assessed differences between formats (Wilcoxon test), validity of the scales (Kappa and Intraclass Correlation Coefficients), and reliability (Cronbach’s alpha and Guttman’s coefficient). We calculated the Comparative Fit Index and the Root Mean Squared Error associated with the proposed one-factor structure. Results: Total scores were higher for the paper-based format. Significant differences between the two formats were found for three items. The weighted Kappa coefficient was approximately 0.40 for most of the items. Internal consistency was greater than 0.94, and the Intraclass Correlation Coefficient was 0.95 for the digital version and 0.94 for the paper-and-pencil version (F = 16.7, p < 0.001). The Comparative Fit Index was 0.97 for the digital DARS and 0.97 for the paper-and-pencil DARS, and the Root Mean Squared Error was 0.11 for the digital DARS and 0.10 for the paper-and-pencil DARS. Conclusion: The digital DARS is consistent in many respects with the paper-and-pencil questionnaire, but equivalence with this format cannot be assumed without caution.


2012 ◽  
Vol 10 (1) ◽  
pp. 40-43 ◽  
Author(s):  
S Aryal ◽  
A Badhu ◽  
S Pandey ◽  
A Bhandari ◽  
P Khatiwoda ◽  
...  

Background Patients with tuberculosis attending the DOTS clinic of Dharan municipality experience shame and unfair treatment from the people living around them within their own society. Objective To assess the stigma experienced by tuberculosis patients and to find out the association between that stigma and selected variables (socio-demographic characteristics, clinical profile, and illness experience). Methods A descriptive cross-sectional study was done among sixty tuberculosis patients. Stratified random sampling was used to select the main center and sub center of tuberculosis treatment, and population proportionate simple random sampling using the lottery method was done. Data were collected using a predesigned, pretested proforma from the Explanatory Model Interview Catalogue developed by the World Health Organization. Results The study revealed that 63.3% of the subjects were stigmatized. There was an association between stigma and variables such as occupation, monthly family income, and past history of tuberculosis. There was also an association of stigma with treatment phase, category of the patient, and past outcome of illness. Conclusion Due to lack of knowledge and awareness about tuberculosis, many patients were stigmatized. Efforts should be made to educate the public about tuberculosis to reduce the stigma experienced by tuberculosis patients and improve patient compliance. KATHMANDU UNIVERSITY MEDICAL JOURNAL  VOL.10 | NO. 1 | ISSUE 37 | JAN - MAR 2012 | 48-52 DOI: http://dx.doi.org/10.3126/kumj.v10i1.6914

