Aggregate data to correct nonresponse and coverage bias in surveys

Author(s):  
Pablo Cabrera-Álvarez

In recent decades, the growing incidence of nonresponse and coverage bias in surveys has called into question the ability to generalize survey results to the population. A widespread way to correct these biases is to use weights that balance the final sample of respondents. Constructing weights requires auxiliary information, such as population totals, that is available for both respondents and nonrespondents. In this paper, statistical simulations are used to test the capacity of aggregate information to correct nonresponse bias. Adjustments based on individual data are compared with adjustments based on aggregate data. The results show that aggregate data can be useful if three requirements are met: 1) the estimated variable is grouped, 2) the estimated variable and the auxiliary variable are correlated, and 3) the probability of completing the survey is related to the auxiliary variable.
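As a rough illustration of weighting with aggregate auxiliary information, the sketch below applies post-stratification, a common adjustment that reweights respondent group means by known population shares. It is not necessarily the procedure used in the paper. The function names, the two groups, and all data values are hypothetical.

```python
from collections import defaultdict

def unweighted_mean(respondents):
    """Plain mean of y over all respondents, ignoring group membership."""
    return sum(y for _, y in respondents) / len(respondents)

def poststratified_mean(respondents, population_shares):
    """Weight each group's respondent mean by its known population share.

    respondents: list of (group, y) pairs.
    population_shares: {group: share of the population}, assumed known
    from an external aggregate source such as a census.
    Assumes every group has at least one respondent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, y in respondents:
        sums[group] += y
        counts[group] += 1
    return sum(share * sums[g] / counts[g]
               for g, share in population_shares.items())

# Hypothetical data: group "a" responds more often than group "b",
# and the outcome y differs between the two groups.
respondents = ([("a", 1.0)] * 6 + [("a", 0.0)] * 2
               + [("b", 1.0)] * 1 + [("b", 0.0)] * 1)
population_shares = {"a": 0.5, "b": 0.5}

naive = unweighted_mean(respondents)                            # 0.7
adjusted = poststratified_mean(respondents, population_shares)  # 0.625
```

In this toy case the adjustment changes the estimate only because the outcome differs between groups and the groups respond at different rates, which mirrors requirements 2 and 3 in the abstract.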

2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Saddam Hussain ◽  
Mi Zichuan ◽  
Sardar Hussain ◽  
Anum Iftikhar ◽  
Muhammad Asif ◽  
...  

In this paper, we propose two new families of estimators for the population distribution function in the case of nonresponse under simple random sampling, using supplementary information on the auxiliary variable and the exponential function. Estimation is carried out in two nonresponse scenarios: nonresponse on the study variable only, and nonresponse on both the study and auxiliary variables. In the first family, the mean of the auxiliary variable is used as auxiliary information, while in the second family its ranks are used. Expressions for the biases and mean squared errors of the proposed and existing estimators are obtained up to the first order of approximation. The performance of the proposed and existing estimators is compared theoretically. These comparisons show that, under the obtained conditions, the proposed families of estimators perform better than the existing estimators available in the literature. These theoretical findings are supported numerically by an empirical study reporting the relative efficiencies of the proposed families of estimators.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0243584
Author(s):  
Sardar Hussain ◽  
Sohaib Ahmad ◽  
Sohail Akhtar ◽  
Amara Javed ◽  
Uzma Yasmeen

In this paper, we propose two new families of estimators for estimating the finite population distribution function in the presence of non-response under simple random sampling. The proposed estimators require information on the sample distribution functions of the study and auxiliary variables, and additional information on either the sample mean or the ranks of the auxiliary variable. We consider two situations of non-response: (i) non-response on both the study and auxiliary variables, and (ii) non-response on the study variable only. The performance of the proposed estimators is compared with that of the existing estimators available in the literature, both theoretically and numerically. The proposed estimators are found to be more precise than the adapted distribution function estimators in terms of percentage relative efficiency.
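For orientation, the following sketch shows a simple baseline for estimating a distribution function under non-response. It applies the Hansen-Hurwitz subsampling idea to the indicator variable I(y <= t). It assumes that a follow-up subsample of the non-respondents was interviewed, which the abstract does not state. It is not one of the proposed families, and all names and data are hypothetical.

```python
def ecdf_at(values, t):
    """Share of values that are less than or equal to t."""
    return sum(1 for v in values if v <= t) / len(values)

def hansen_hurwitz_cdf(respondents, followup, n_nonrespondents, t):
    """Estimate F(t) by combining respondents and followed-up non-respondents.

    respondents: y values from units that answered at the first attempt.
    followup: y values from a subsample of the initial non-respondents
              (assumed to be obtained through an intensive follow-up).
    n_nonrespondents: number of initial non-respondents in the sample.
    """
    n1 = len(respondents)
    n = n1 + n_nonrespondents
    return ((n1 / n) * ecdf_at(respondents, t)
            + (n_nonrespondents / n) * ecdf_at(followup, t))

# Hypothetical sample of 10 units: 4 respond, 6 do not, 3 are followed up.
respondents = [1, 2, 3, 4]
followup = [5, 6, 7]
estimate = hansen_hurwitz_cdf(respondents, followup, n_nonrespondents=6, t=4)
# Respondents alone would give F(4) = 1.0; the combined estimate is 0.4.
```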


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
George Vamvakas ◽  
Courtenay Norbury ◽  
Andrew Pickles

Background: The use of auxiliary variables with maximum likelihood parameter estimation for surveys that miss data by design is not a widespread approach, despite its documented improved efficiency over traditional approaches that deploy sampling weights. Although efficiency gains from the use of Normally distributed auxiliary variables in a model have been recorded in the literature, little is known about the effects of non-Normal auxiliary variables in the parameter estimation. Methods: We simulate growth data to mimic SCALES, a two-stage survey of language development with a screening phase (stage one) for which data are observed for the whole sample and an intensive assessments phase (stage two), for which data are observed for a sub-sample, selected using stratified random sampling. In the simulation, we allow a fully observed Poisson distributed stratification criterion to be correlated with the partially observed model responses and develop five generalised structural equation growth models that host the auxiliary information from this criterion. We compare these models with each other and with a weighted growth model in terms of bias, efficiency, and coverage. We finally apply our best performing model to SCALES data and show how to obtain growth parameters and population norms. Results: Parameter estimation from a model that incorporates a non-Normal auxiliary variable is unbiased and more efficient than its weighted counterpart. The auxiliary variable method is capable of producing efficient population percentile norms and velocities. Conclusions: The deployment of a fully observed variable that dominates the selection of the sample and correlates strongly with the incomplete variable of interest appears beneficial for the estimation process.


Symmetry ◽  
2019 ◽  
Vol 12 (1) ◽  
pp. 16 ◽  
Author(s):  
Farah Naz ◽  
Tahir Nawaz ◽  
Tianxiao Pang ◽  
Muhammad Abid

The use of auxiliary information in survey sampling to enhance the efficiency of estimators of population parameters is common practice. Generally, ratio and regression estimators are developed using known information on conventional parameters of the auxiliary variable, such as the variance, coefficient of variation, coefficient of skewness, coefficient of kurtosis, or the correlation between the study and auxiliary variables. The efficiency of these estimators is doubtful in the presence of outliers in the data or when the population is nonsymmetrical. This study presents improved variance estimators under simple random sampling without replacement, assuming that information on some nonconventional dispersion measures of the auxiliary variable is readily available. These measures include the inter-decile range, sample inter-quartile range, probability-weighted moment estimator, Gini mean difference estimator, Downton’s estimator, median absolute deviation from the median, and so forth. Algebraic expressions for the bias and mean square error of the proposed estimators are obtained, and efficiency conditions are derived for comparison with the existing estimators. Percentage relative efficiencies computed on real datasets are used to compare the proposed estimators numerically with the existing ones, indicating the superiority of the suggested estimators.
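To make some of the nonconventional dispersion measures named above concrete, the sketch below computes three of them for a small hypothetical auxiliary variable containing one outlier. It uses only the Python standard library. It illustrates the measures themselves, not the proposed variance estimators, whose exact form is not given in the abstract.

```python
import statistics
from itertools import combinations

def interquartile_range(x):
    """Q3 - Q1, using the default quartile method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return q3 - q1

def median_absolute_deviation(x):
    """Median of the absolute deviations from the median."""
    med = statistics.median(x)
    return statistics.median([abs(v - med) for v in x])

def gini_mean_difference(x):
    """Average absolute difference over all distinct pairs of observations."""
    pairs = list(combinations(x, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Hypothetical auxiliary variable with a single large outlier.
x = [1, 2, 3, 4, 100]

iqr = interquartile_range(x)
mad = median_absolute_deviation(x)   # 1
gmd = gini_mean_difference(x)        # 40.0
sd = statistics.stdev(x)             # conventional measure, for contrast
```

On this toy data the median absolute deviation stays at 1 while the standard deviation is pulled above 40 by the outlier, which is the kind of robustness the abstract appeals to.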


Author(s):  
R. Beyrouti ◽  
J. G. Best ◽  
A. Chandratheva ◽  
R. J. Perry ◽  
D. J. Werring

Background and purpose: There are very few studies of the characteristics and causes of ICH in COVID-19, yet such data are essential to guide clinicians in clinical management, including challenging anticoagulation decisions. We aimed to describe the characteristics of spontaneous symptomatic intracerebral haemorrhage (ICH) associated with COVID-19. Methods: We systematically searched PubMed, Embase and the Cochrane Central Database for data from patients with SARS-CoV-2 detected prior to or within 7 days after symptomatic ICH. We did a pooled analysis of individual patient data, then combined data from this pooled analysis with aggregate-level data. Results: We included data from 139 patients (98 with individual data and 41 with aggregate-level data). In our pooled individual data analysis, the median age (IQR) was 60 (53–67) years and 64% (95% CI 54–73.7%) were male; 79% (95% CI 70.0–86.9%) had critically severe COVID-19. The pooled prevalence of lobar ICH was 67% (95% CI 56.3–76.0%), and of multifocal ICH was 36% (95% CI 26.4–47.0%). 71% (95% CI 61.0–80.4%) of patients were treated with anticoagulation (58% (95% CI 48–67.8%) therapeutic). The median NIHSS was 28 (IQR 15–28); mortality was 54% (95% CI 43.7–64.2%). Our combined analysis of individual and aggregate data showed similar findings. The pooled incidence of ICH across 12 cohort studies of inpatients with COVID-19 (n = 63,390) was 0.38% (95% CI 0.22–0.58%). Conclusions: Our data suggest that ICH associated with COVID-19 has different characteristics compared to ICH not associated with COVID-19, including frequent lobar location and multifocality, a high rate of anticoagulation, and high mortality. These observations suggest different underlying mechanisms of ICH in COVID-19 with potential implications for clinical treatment and trials.


2006 ◽  
Vol 2006 ◽  
pp. 1-18 ◽  
Author(s):  
D. G. Steel ◽  
M. Tranmer ◽  
D. Holt

Ecological analysis involves analysing aggregate data for groups of individuals to make inferences about relationships at the individual level. Often the results of such analyses give badly biased estimates. This paper will consider the sources of bias in linear regression analysis using aggregate data. The role of variation of the individual level relationships between groups and the consequent within-group correlations and how these are related to auxiliary variables that characterise the differences between groups is considered. A method of adjusting ecological regression for the effects of auxiliary variables is described and evaluated using data from the 1991 Australian Census.
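The following toy example shows how a regression on group aggregates can give a different answer from the individual-level relationship. The data are invented so that the slope within every group is negative while the slope across group means is positive. It does not implement the adjustment method described in the paper.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def mean(values):
    return sum(values) / len(values)

# Hypothetical individual-level data for three groups: (x values, y values).
groups = {
    "g1": ([1, 2, 3], [3, 2, 1]),
    "g2": ([4, 5, 6], [6, 5, 4]),
    "g3": ([7, 8, 9], [9, 8, 7]),
}

# Individual-level relationship within each group.
within_slopes = [ols_slope(x, y) for x, y in groups.values()]

# Ecological regression: only the group means are used.
mean_x = [mean(x) for x, _ in groups.values()]
mean_y = [mean(y) for _, y in groups.values()]
ecological_slope = ols_slope(mean_x, mean_y)
# within_slopes are all -1.0, while ecological_slope is 1.0
```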


1983 ◽  
Vol 32 (1-2) ◽  
pp. 47-56 ◽  
Author(s):  
S. K. Srivastava ◽  
H. S. Jhajj

For estimating the mean of a finite population, Srivastava and Jhajj (1981) defined a broad class of estimators which use information on the sample mean as well as the sample variance of an auxiliary variable. In this paper we extend this class of estimators to the case when such information on p (> 1) auxiliary variables is available. The estimators of the class involve unknown constants whose optimum values depend on unknown population parameters. When these population parameters are replaced by their consistent estimates, the resulting estimators are shown to have the same asymptotic mean squared error. An expression is obtained for the amount by which the mean squared error of such estimators is smaller than that of estimators which use only the population means of the auxiliary variables.


2019 ◽  
Author(s):  
Nathalia Saffioti Rezende ◽  
Paul Swinton ◽  
Luana Farias de Oliveira ◽  
Rafa Pires da Silva ◽  
Vinicius Eira da Silva ◽  
...  

Beta-alanine (BA) supplementation increases muscle carnosine content (MCarn), and is ergogenic in many situations. Currently, many questions on the nature of the MCarn response to supplementation are open, and the response to these has considerable potential to enhance the efficacy and applications of this supplementation strategy. Objective: To conduct a Bayesian analysis of available data on the MCarn response to BA supplementation. Methods: A systematic review with meta-analysis of individual and published aggregate data using a dose response (Emax) model was conducted. The protocol was designed according to PRISMA guidelines. A three-step screening strategy was undertaken to identify studies that measured the MCarn response to BA supplementation. In addition, individual data from 5 separate studies conducted in the authors’ laboratory were analysed. Data were extracted from all controlled and uncontrolled supplementation studies conducted on healthy humans. Meta-regression was used to consider the influence of potential moderators (including dose, sex, age, baseline MCarn and analysis method used) on the primary outcome. Results and Conclusion: The Emax model indicated that human skeletal muscle has large capacity for non-linear MCarn accumulation, and that commonly used BA supplementation protocols may not come close to saturating muscle carnosine content. Neither baseline values nor sex appear to influence subsequent response to supplementation. Analysis of individual data indicated that MCarn is relatively stable in the absence of intervention, and effectively all participants respond to BA supplementation (99.3% response [95% CrI: 96.2–100]).


2021 ◽  
Vol 7 (3) ◽  
pp. 4592-4613
Author(s):  
Sohaib Ahmad ◽  
Sardar Hussain ◽  
Muhammad Aamir ◽  
Faridoon Khan ◽  
...  

This paper addresses the estimation of the population mean in the presence of non-response under simple random sampling. A new family of estimators is proposed that uses auxiliary information on the sample mean and the ranks of the auxiliary variable. The biases and mean squared errors of the existing and proposed estimators are obtained up to the first order of approximation. The performance of the proposed and existing estimators is compared theoretically. These comparisons show that, under the given conditions, the proposed family of estimators is more efficient than the existing estimators in the literature.


Author(s):  
Uzma Yasmeen ◽  
Muhammad Noor-ul-Amin

The precision of estimators of the study variable can be improved by incorporating information from known auxiliary variables. Two techniques, ratio and regression estimation, are commonly used with auxiliary information to obtain more precise estimators. When the population is very heterogeneous, it may be impossible to obtain a sufficiently accurate and precise estimate by drawing a simple random sample from the whole population. Sampling conditions may also differ considerably between parts of the population, for example when the population under study consists of people living in apartments, their own homes, hospitals, and prisons, or of people living in plain and hill regions. In such situations, stratified sampling is one of the most commonly used approaches to obtain a representative sample from different sections of the population. The present study proposes generalized estimators of the finite population variance under a stratified sampling scheme, using information on one and two transformed auxiliary variables. Expressions for the bias and mean squared error (MSE) are obtained for the proposed exponential-type estimators. Conditions are derived under which the proposed estimators perform better than the usual estimator. An empirical study and a simulation study are conducted to demonstrate the superiority of the recommended estimator.

