Model-based small area estimation methods and precise district-level HIV prevalence estimates in Uganda

PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0253375
Author(s):  
Joseph Ouma ◽  
Caroline Jeffery ◽  
Colletar Anna Awor ◽  
Allan Muruta ◽  
Joshua Musinguzi ◽  
...  

Background: Model-based small area estimation methods can help generate parameter estimates at the district level, where planned population survey sample sizes are not large enough to support direct estimates of HIV prevalence with adequate precision. We computed district-level HIV prevalence estimates and their 95% confidence intervals for districts in Uganda. Methods: Our analysis used direct survey and model-based estimation methods, including Fay-Herriot (area-level) and Battese-Harter-Fuller (unit-level) small area models. We used regression analysis to assess consistency in estimating HIV prevalence, and a ratio analysis of the mean square error and the coefficient of variation of the estimates to evaluate precision. The models were applied to Uganda Population-Based HIV Impact Assessment 2016/2017 data, with auxiliary information from the 2016 Lot Quality Assurance Sampling survey for the unit-level model and antenatal care data from district health information system datasets for the area-level model. Results: Estimates from the model-based and the direct survey methods were similar. However, direct survey estimates were unstable compared with the model-based estimates. Area-level model estimates were more stable than unit-level model estimates. The association between unit-level model and direct survey estimates was β1 = 0.66, r² = 0.862, and the association between area-level model and direct survey estimates was β1 = 0.44, r² = 0.698. The error associated with the estimates decreased by 37.5% and 33.1% for the unit-level and area-level models, respectively, compared to the direct survey estimates. Conclusions: Although the unit-level model estimates were less precise than the area-level model estimates, they were highly correlated with the direct survey estimates and had a smaller standard error than the area-level model estimates. Unit-level models provide more accurate and reliable data to support local decision-making when unit-level auxiliary information is available.
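As a rough illustration of how an area-level (Fay-Herriot) estimate borrows strength, the sketch below shrinks each direct estimate toward a pooled mean and compares coefficients of variation. It is not the authors' analysis: it assumes an intercept-only model with no auxiliary covariates, treats the model variance `sigma_u2` as known, approximates the model-based MSE by its leading term (shrinkage factor times sampling variance), and uses made-up numbers rather than survey data. The function names are hypothetical.

```python
import math


def fay_herriot_blup(direct, sampling_var, sigma_u2):
    """Intercept-only Fay-Herriot predictor with the model variance assumed known.

    direct       -- direct survey estimates, one per area
    sampling_var -- sampling variances of the direct estimates
    sigma_u2     -- variance of the area random effects (an assumption here)
    """
    # Weighted least squares estimate of the common mean (the synthetic part).
    weights = [1.0 / (sigma_u2 + psi) for psi in sampling_var]
    beta_hat = sum(w * y for w, y in zip(weights, direct)) / sum(weights)

    # Shrinkage factor: near 1 when the direct estimate is precise.
    gammas = [sigma_u2 / (sigma_u2 + psi) for psi in sampling_var]
    blup = [g * y + (1.0 - g) * beta_hat for g, y in zip(gammas, direct)]
    return blup, gammas


def cv_percent(estimate, mse):
    """Coefficient of variation in percent: root MSE relative to the estimate."""
    return 100.0 * math.sqrt(mse) / estimate


# Made-up illustrative inputs (not taken from the study).
direct = [0.04, 0.08, 0.06]
sampling_var = [0.0004, 0.0004, 0.0001]
sigma_u2 = 0.0004

blup, gammas = fay_herriot_blup(direct, sampling_var, sigma_u2)
for y, psi, est, g in zip(direct, sampling_var, blup, gammas):
    cv_direct = cv_percent(y, psi)
    # Leading MSE term only; ignores uncertainty in the estimated mean and variance.
    cv_model = cv_percent(est, g * psi)
    print(f"direct={y:.3f} model={est:.3f} CV direct={cv_direct:.1f}% CV model={cv_model:.1f}%")
```

Areas with noisier direct estimates get a smaller shrinkage factor and move further toward the pooled mean, which is why their coefficient of variation drops the most in this toy setting.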

1996 ◽  
Vol 26 (5) ◽  
pp. 758-766 ◽  
Author(s):  
Annika Kangas

In small areas, the number of sample plots is usually small, and the classical estimators have a large variance. Information from nearby areas can be utilized to improve the subarea estimates using either nonparametric or parametric models. In this study, a number of model-based estimators for small-area estimation are presented. To illustrate the presented methods, a numerical example in a real inventory situation is given. The auxiliary information used in this study is pure coordinate information, but the methods are also applicable to other kinds of auxiliary information. The objective of this study is to compare the features of the presented small-area estimation methods and to discuss the applicability of these methods in different situations.


2020 ◽  
Vol 36 (4) ◽  
pp. 955-961
Author(s):  
Rizky Zulkarnain ◽  
Dwi Jayanti ◽  
Tri Listianingrum

The increasing need for more disaggregated data motivates National Statistical Offices (NSOs) to develop efficient methods for producing official statistics without compromising on quality. In Indonesia, regional autonomy requires that Sustainable Development Goals (SDGs) indicators be available down to the district level. However, several surveys, such as the Indonesian Demographic and Health Survey, produce estimates down to the provincial level only. This generates gaps in support for district-level policies. Small area estimation (SAE) techniques are often considered as alternatives for overcoming this issue. SAE enables more reliable estimation for small areas by utilizing auxiliary information from other sources. However, the standard SAE approach has limitations in estimating non-sampled areas. This paper introduces an approach to estimating the non-sampled area random effect by utilizing cluster information. This model is demonstrated via the estimation of contraception prevalence rates at the district level in North Sumatera province. The results showed that small area estimates considering cluster information (SAE-cluster) are more precise than the direct estimates. The SAE-cluster approach revises the direct estimates upward or downward. This approach has important implications for improving the quality of disaggregated SDGs indicators without increasing cost. The paper was prepared under the kind mentorship of Professor James J. Cochran, Associate Dean for Research, Prof. of Statistics and Operations Research, University of Alabama.


PLoS ONE ◽  
2017 ◽  
Vol 12 (12) ◽  
pp. e0189401 ◽  
Author(s):  
Francisco Mauro ◽  
Vicente J. Monleon ◽  
Hailemariam Temesgen ◽  
Kevin R. Ford

OALib ◽  
2021 ◽  
Vol 08 (04) ◽  
pp. 1-7
Author(s):  
James Karangwa ◽  
Anshu Bharadwaj

2018 ◽  
Vol 34 (1) ◽  
pp. 181-209 ◽  
Author(s):  
Thomas Suesse ◽  
Ray Chambers

Model-based and model-assisted methods of survey estimation aim to improve the precision of estimators of the population total or mean relative to methods based on the nonparametric Horvitz-Thompson estimator. These methods often use a linear regression model defined in terms of auxiliary variables whose values are assumed known for all population units. Information on networks represents another form of auxiliary information that might increase the precision of these estimators, particularly if it is reasonable to assume that networked population units have similar values of the survey variable. Linear models that use networks as a source of auxiliary information include autocorrelation, disturbance, and contextual models. In this article we focus on social networks, and investigate how much of the population structure of the network needs to be known for estimation methods based on these models to be useful. In particular, we use simulation to compare the performance of the best linear unbiased predictor under a model that ignores the network with model-based estimators that incorporate network information. Our results show that incorporating network information via a contextual model seems to be the most appropriate approach. We also show that one does not need to know the full population network, but that knowledge of the partial network linking the sampled population units to the non-sampled population units is necessary. Finally, we also provide an estimator for the mean-squared error to make an informed decision about using the contextual information, as well as the results showing that this adaptive strategy leads to higher precision.


2017 ◽  
Vol 43 (2) ◽  
pp. 182-224
Author(s):  
Wendy Chan

Policymakers have grown increasingly interested in how experimental results may generalize to a larger population. However, recently developed propensity score–based methods are limited by small sample sizes, where the experimental study is generalized to a population that is at least 20 times larger. This is particularly problematic for methods such as subclassification by propensity score, where limited sample sizes lead to sparse strata. This article explores the potential of small area estimation methods to improve the precision of estimators in sparse strata using population data as a source of auxiliary information to borrow strength. Results from simulation studies identify the conditions under which small area estimators outperform conventional estimators and the limitations of this application to causal generalization studies.
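To make the sparse-strata problem concrete, here is a minimal sketch of subclassification by propensity score for generalizing a treatment effect to a target population. It is an illustration under stated assumptions, not the article's procedure: propensity scores are taken as already estimated, strata are cut at population quantiles, and the function name `subclassify` and all inputs are hypothetical. The sketch refuses to return an estimate when a stratum lacks treated or control units, which is the situation small area methods are meant to help with.

```python
from statistics import mean


def subclassify(sample, population_scores, n_strata=5):
    """Subclassification estimate of a population average treatment effect.

    sample            -- list of (propensity_score, treated, outcome) tuples
    population_scores -- propensity scores for every unit in the target population
    Returns (estimate, sparse_strata); estimate is None if any stratum is sparse.
    """
    pop = sorted(population_scores)
    # Cut points at population quantiles, so strata hold roughly equal population shares.
    cuts = [pop[(len(pop) * k) // n_strata] for k in range(1, n_strata)]

    def stratum(score):
        # Stratum index: the number of cut points at or below the score.
        return sum(score >= c for c in cuts)

    pop_counts = [0] * n_strata
    for s in pop:
        pop_counts[stratum(s)] += 1

    treated = [[] for _ in range(n_strata)]
    control = [[] for _ in range(n_strata)]
    for score, is_treated, outcome in sample:
        (treated if is_treated else control)[stratum(score)].append(outcome)

    # A stratum is sparse when it has no treated units or no control units.
    sparse = [j for j in range(n_strata) if not treated[j] or not control[j]]
    if sparse:
        return None, sparse

    # Weight each stratum's mean difference by its share of the population.
    estimate = sum(
        (pop_counts[j] / len(pop)) * (mean(treated[j]) - mean(control[j]))
        for j in range(n_strata)
    )
    return estimate, sparse


# Made-up illustrative data: ten population units, four experimental units.
population_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
sample = [(0.2, True, 3.0), (0.3, False, 1.0), (0.7, True, 5.0), (0.8, False, 2.0)]
print(subclassify(sample, population_scores, n_strata=2))
```

With a small experiment and many strata, the list of sparse strata fills up quickly. Returning `None` instead of silently dropping those strata is a deliberate choice in this sketch, since dropping them would change the population the estimate refers to.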


2011 ◽  
Vol 41 (6) ◽  
pp. 1189-1201 ◽  
Author(s):  
Michael E. Goerndt ◽  
Vicente J. Monleon ◽  
Hailemariam Temesgen

One of the challenges often faced in forestry is the estimation of forest attributes for smaller areas of interest within a larger population. Small-area estimation (SAE) is a set of techniques well suited to estimation of forest attributes for small areas in which the existing sample size is small and auxiliary information is available. Selected SAE methods were compared for estimating a variety of forest attributes for small areas using ground data and light detection and ranging (LiDAR) derived auxiliary information. The small areas of interest consisted of delineated stands within a larger forested population. Four different estimation methods were compared for predicting forest density (number of trees/ha), quadratic mean diameter (cm), basal area (m²/ha), top height (m), and cubic stem volume (m³/ha). The precision and bias of the estimation methods (synthetic prediction (SP), multiple linear regression based composite prediction (CP), empirical best linear unbiased prediction (EBLUP) via Fay–Herriot models, and most similar neighbor (MSN) imputation) are documented. For the indirect estimators, MSN was superior to SP in terms of both precision and bias for all attributes. For the composite estimators, EBLUP was generally superior to direct estimation (DE) and CP, with the exception of forest density.


2019 ◽  
Author(s):  
David Buil-Gil ◽  
Reka Solymosi ◽  
Angelo Moretti

Open and crowdsourced data are becoming prominent in social sciences research. Crowdsourcing projects harness information from large crowds of citizens who voluntarily participate in one collaborative project, and allow new insights into people’s attitudes and perceptions. However, these data are usually affected by a series of biases that limit their representativeness (e.g. self-selection bias, unequal participation, underrepresentation of certain areas and times). In this chapter we present a two-step method that aims to produce reliable small area estimates from crowdsourced data when no auxiliary information is available at the individual level. A non-parametric bootstrap, used to compute pseudo-sampling weights and bootstrap weighted estimates, is followed by an area-level model-based small area estimation approach, which borrows strength from related areas based on a set of covariates, to improve the small area estimates. To assess the method, a simulation study and an application to safety perceptions in Greater London are conducted. The simulation study shows that the area-level model-based small area estimator under the non-parametric bootstrap improves (in terms of bias and variability) the small area estimates in the majority of areas. The application produces estimates of safety perceptions at a small geographical level in Greater London from Place Pulse 2.0 data. In the application, estimates are validated externally by comparing them to reliable survey estimates. Further simulation experiments and applications are needed to examine whether this method also improves the small area estimates when the sample biases are larger, smaller or show different distributions. A measure of reliability also needs to be developed to estimate the error of the small area estimates under the non-parametric bootstrap.


2019 ◽  
Vol 93 (3) ◽  
pp. 444-457
Author(s):  
P Corey Green ◽  
Harold E Burkhart ◽  
John W Coulston ◽  
Philip J Radtke

Loblolly pine (Pinus taeda L.) is one of the most widely planted tree species globally. As the reliability of estimating forest characteristics such as volume, biomass and carbon becomes more important, the necessary resources available for assessment are often insufficient to meet desired confidence levels. Small area estimation (SAE) methods were investigated for their potential to improve the precision of volume estimates in loblolly pine plantations aged 9–43. Area-level SAE models that included lidar height percentiles and stand thinning status as auxiliary information were developed to test whether precision gains could be achieved. Models that utilized both forms of auxiliary data provided larger gains in precision compared to using lidar alone. Unit-level SAE models were found to offer additional gains compared with area-level models in some cases; however, area-level models that incorporated both lidar and thinning status performed nearly as well or better. Despite their potential gains in precision, unit-level models are more difficult to apply in practice due to the need for highly accurate, spatially defined sample units and the inability to incorporate certain area-level covariates. The results of this study are of interest to those looking to reduce the uncertainty of stand parameter estimates. With improved estimate precision, managers, stakeholders and policy makers can have more confidence in resource assessments for informed decisions.

