Parametric Bootstrap Confidence Intervals for the Multivariate Fay–Herriot Model

Author(s):  
Takumi Saegusa ◽  
Shonosuke Sugasawa ◽  
Partha Lahiri

Abstract Various multivariate extensions to the well-known Fay–Herriot model have been proposed in the small area estimation literature. Such multivariate models are quite effective in combining information through correlations among small area survey estimates of related variables, historical survey estimates of the same variable, or both. Though the literature on small area estimation is already very rich, construction of second-order efficient confidence intervals from multivariate models has received little attention. In this article, we develop a parametric bootstrap method for constructing a second-order efficient confidence interval for a general linear combination of small area means using the multivariate Fay–Herriot normal model. The proposed parametric bootstrap method replaces difficult and tedious analytical derivations with efficient algorithms and high-speed computing. Moreover, the proposed method is more versatile than the analytical method because it can be easily applied to any method of model parameter estimation and any specific structure of the variance–covariance matrix of the multivariate Fay–Herriot model, avoiding all the cumbersome and time-consuming calculations required in the analytical method. We apply our proposed methodology to construct confidence intervals for the median income of four-person families for the fifty states and the District of Columbia in the United States. Our data analysis demonstrates that the proposed parametric bootstrap method, applied to both multivariate and univariate Fay–Herriot models, generally provides much shorter confidence intervals than the corresponding traditional direct method. Moreover, the confidence intervals obtained from the multivariate model are generally shorter than the corresponding intervals from the univariate model, indicating the potential advantage of exploiting correlations of median income of four-person families with median incomes of three- and five-person families.
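
To make the idea of a parametric bootstrap interval concrete, the following is a minimal sketch for the simpler univariate Fay–Herriot model, not the authors' multivariate implementation. It assumes known sampling variances D, a Prasad-Rao style moment estimator of the model variance A (truncated below at a small positive floor), and a pivot of the form (theta - theta_hat) / sqrt(g1), where g1 = A D / (A + D). The function names `fit_fh` and `pb_interval`, the floor value, and the choice of estimator are all illustrative assumptions.

```python
import numpy as np


def fit_fh(y, X, D, floor=1e-4):
    """Fit a univariate Fay-Herriot model (illustrative, assumed estimator).

    y: direct estimates, X: (m, p) covariates, D: known sampling variances.
    The model variance A uses a Prasad-Rao style moment estimator,
    truncated at `floor` (an assumed value) to keep it positive.
    """
    m, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    resid = y - X @ (xtx_inv @ X.T @ y)
    A = max(floor, (resid @ resid - np.sum(D * (1.0 - leverage))) / (m - p))
    # Weighted least squares for beta given the estimated A
    w = 1.0 / (A + D)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    shrink = D / (A + D)
    theta_hat = (1.0 - shrink) * y + shrink * (X @ beta)  # empirical Bayes predictor
    g1 = A * D / (A + D)  # leading term of the prediction variance
    return beta, A, theta_hat, g1


def pb_interval(y, X, D, n_boot=1000, level=0.95, seed=0):
    """Parametric bootstrap interval for each small area mean (sketch)."""
    rng = np.random.default_rng(seed)
    beta, A, theta_hat, g1 = fit_fh(y, X, D)
    m = len(y)
    pivots = np.empty((n_boot, m))
    for b in range(n_boot):
        # Simulate true means and direct estimates from the fitted model
        theta_star = X @ beta + rng.normal(0.0, np.sqrt(A), m)
        y_star = theta_star + rng.normal(0.0, np.sqrt(D), m)
        # Refit on the bootstrap sample and form the bootstrap pivot
        _, _, theta_hat_star, g1_star = fit_fh(y_star, X, D)
        pivots[b] = (theta_star - theta_hat_star) / np.sqrt(g1_star)
    alpha = 1.0 - level
    q_lo, q_hi = np.quantile(pivots, [alpha / 2.0, 1.0 - alpha / 2.0], axis=0)
    return theta_hat + q_lo * np.sqrt(g1), theta_hat + q_hi * np.sqrt(g1)
```

Because the refitting step only calls `fit_fh`, swapping in a different variance estimator changes one function and leaves the interval construction untouched, which is the kind of versatility the abstract attributes to the bootstrap approach.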

Author(s):  
Anggun Permatasari ◽  
Khairil Anwar Notodiputro ◽  
Erfiani

Small area estimation (SAE) is an important alternative method for obtaining information about a small area when the sample size is small. In this paper, we propose a parametric bootstrap method to estimate the mean squared error (MSE) of a proportion based on area unit levels. The research focuses on applying the parametric bootstrap method to estimate the MSE in SAE for zero-inflated binomial models (SAE ZIB). The results show that the bootstrap method produced a smaller MSE than direct estimation, implying that SAE ZIB performs better than direct estimation.


2021 ◽  
Vol 37 (4) ◽  
pp. 955-979
Author(s):  
Stefano Marchetti ◽  
Nikos Tzavidis

Abstract Small area estimation is receiving considerable attention due to the high demand for small area statistics. Small area estimators of means and totals have been widely studied in the literature. In recent years, small area estimators of quantiles and poverty indicators have also been studied. In contrast, small area estimators of inequality indicators, which are often used in socio-economic studies, have received less attention. In this article, we propose a robust method based on the M-quantile regression model for small area estimation of the Theil index and the Gini coefficient, two popular inequality measures. A non-parametric bootstrap is adopted to estimate the mean squared error. A robust approach is used because inequality is often measured using income or consumption data, which are frequently non-normal and affected by outliers. The proposed methodology is applied to income data to estimate the Theil index and the Gini coefficient for small domains in Tuscany (provinces by age groups), using survey and Census micro-data as auxiliary variables. In addition, a design-based simulation is carried out to study the behaviour of the proposed robust estimators. The performance of the bootstrap mean squared error estimator is also investigated in the simulation study.
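
For readers unfamiliar with the two inequality measures named above, here is a minimal sketch of their plain, unweighted sample versions. It is not the robust M-quantile small area estimator the abstract proposes; the function names `gini` and `theil` are illustrative, and the Theil index is assumed to be the Theil T form, which requires strictly positive incomes.

```python
import numpy as np


def gini(incomes):
    """Unweighted sample Gini coefficient (0 means perfect equality)."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Rank-based formula: G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n)


def theil(incomes):
    """Unweighted Theil T index; assumes all incomes are strictly positive."""
    x = np.asarray(incomes, dtype=float)
    ratio = x / x.mean()
    # T = mean of (x_i / mean) * log(x_i / mean)
    return float(np.mean(ratio * np.log(ratio)))
```

Both functions return 0 when every income is equal and grow as income becomes more concentrated. With survey data, sampling weights would normally enter both formulas, which this sketch omits.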


Author(s):  
Jan Pablo Burgard ◽  
Domingo Morales ◽  
Anna-Lena Wölwer

Abstract Socioeconomic indicators play a crucial role in monitoring political actions over time and across regions. Income-based indicators such as the median income of sub-populations can provide information on the impact of measures, e.g., on poverty reduction. Regional information is usually published on an aggregated level. Due to small sample sizes, these regional aggregates are often associated with large standard errors or are missing if the region is unsampled or the estimate is simply not published. For example, if the median income of Hispanic or Latino Americans from the American Community Survey is of interest, some county-year combinations are not available. Therefore, a comparison of different counties or time-points is not always possible. We propose a new predictor based on small area estimation techniques for aggregated data and bivariate modeling. This predictor provides empirical best predictions for the partially unavailable county-year combinations. We provide an analytical approximation to the mean squared error. The theoretical findings are backed up by a large-scale simulation study. Finally, we return to the problem of estimating the county-year estimates for the median income of Hispanic or Latino Americans and externally validate the estimates.


2019 ◽  
Author(s):  
David Buil-Gil ◽  
Reka Solymosi ◽  
Angelo Moretti

Open and crowdsourced data are becoming prominent in social sciences research. Crowdsourcing projects harness information from large crowds of citizens who voluntarily participate in one collaborative project, and allow new insights into people’s attitudes and perceptions. However, these data are usually affected by a series of biases that limit their representativeness (e.g. self-selection bias, unequal participation, underrepresentation of certain areas and times). In this chapter we present a two-step method aimed at producing reliable small area estimates from crowdsourced data when no auxiliary information is available at the individual level. A non-parametric bootstrap, used to compute pseudosampling weights and bootstrap weighted estimates, is followed by an area-level model-based small area estimation approach, which borrows strength from related areas based on a set of covariates, to improve the small area estimates. To assess the method, a simulation study and an application to safety perceptions in Greater London are conducted. The simulation study shows that the area-level model-based small area estimator under the non-parametric bootstrap improves the small area estimates, in terms of bias and variability, in the majority of areas. The application produces estimates of safety perceptions at a small geographical level in Greater London from Place Pulse 2.0 data. In the application, estimates are validated externally by comparing them to reliable survey estimates. Further simulation experiments and applications are needed to examine whether this method also improves the small area estimates when the sample biases are larger, smaller or show different distributions. A measure of reliability also needs to be developed to estimate the error of the small area estimates under the non-parametric bootstrap.


2018 ◽  
Author(s):  
Minh Cong Nguyen ◽  
Paul Corral ◽  
Joao Pedro Azevedo ◽  
Qinghua Zhao

Author(s):  
Benmei Liu ◽  
Isaac Dompreh ◽  
Anne M Hartman

Abstract Background: The workplace and home are sources of exposure to secondhand smoke (SHS), a serious health hazard for nonsmoking adults and children. Smoke-free workplace policies and home rules protect nonsmoking individuals from SHS and help individuals who smoke to quit smoking. However, estimated population coverages of smoke-free workplace policies and home rules are not typically available at small geographic levels such as counties. Model-based small area estimation techniques are needed to produce such estimates. Methods: Self-reported smoke-free workplace policies and home rules data came from the 2014-2015 Tobacco Use Supplement to the Current Population Survey. County-level design-based estimates of the two measures were computed and linked to county-level relevant covariates obtained from external sources. Hierarchical Bayesian models were then built and implemented through Markov Chain Monte Carlo methods. Results: Model-based estimates of smoke-free workplace policies and home rules were produced for 3,134 (out of 3,143) U.S. counties. In 2014-2015, nearly 80% of U.S. adult workers were covered by smoke-free workplace policies, and more than 85% of U.S. adults were covered by smoke-free home rules. We found large variations within and between states in the coverage of smoke-free workplace policies and home rules. Conclusions: The small-area modeling approach efficiently reduced the variability that was attributable to small sample size in the direct estimates for counties with data and predicted estimates for counties without data by borrowing strength from covariates and other counties with similar profiles. The county-level modeled estimates can serve as a useful resource for tobacco control research and intervention. Implications: Detailed county- and state-level estimates of smoke-free workplace policies and home rules can help identify coverage disparities and differential impact of smoke-free legislation and related social norms. Moreover, this estimation framework can be useful for modeling different tobacco control variables and applied elsewhere, e.g., to other behavioral, policy, or health related topics.


1994 ◽  
Vol 9 (1) ◽  
pp. 90-93 ◽  
Author(s):  
M. Ghosh ◽  
J. N. K. Rao
