Analyzing survey data with complex sampling designs.

Author(s):  
Patrick E. Shrout ◽  
Jaime L. Napier

2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 63-63 ◽  
Author(s):  
Sandra L Rodriguez-Zas

Abstract Companion animal researchers have been at the forefront of using survey methodologies to study dogs’ and cats’ dietary and health patterns in the general population. The reporting of survey results has increased in recent years, facilitated by the rise in internet access, the modest cost of conducting web surveys, and the capability to target surveys to pet owners through address lists collected by services and social media. Data from population surveys have the potential to garner unique and comprehensive information that complements the understanding offered by designed experiments. Recent developments in survey methodologies and the availability of user-friendly survey tools enable the collection of large-scale or even Big Data sets, not only in the number of survey responses but also in the number and type of variables measured. Irrespective of the sample size, the analysis of survey data requires consideration of complex sampling designs and analysis approaches that reflect the nature of these data. An overview of the characteristics of complex sampling designs typical of survey data with applications to companion animal nutrition is presented. The fundamentals of the analytical approaches that are suitable for survey data are demonstrated, and procedures available to accommodate clustering, stratification, underrepresentation, and nonresponse are reviewed. Examples of survey data visualization and analysis strategies are presented.
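
As a minimal sketch of the kind of design-based analysis this abstract describes, the R code below declares a stratified, clustered survey design with the survey package and computes weighted estimates. The file and variable names (pet_survey.csv, region, household_id, wt, daily_kcal, species) are hypothetical placeholders, not taken from the study.

# Hedged sketch: declare a complex sampling design and compute design-based estimates.
# Assumes a hypothetical data set with one row per surveyed pet owner.
library(survey)

pets <- read.csv("pet_survey.csv")   # hypothetical file with region (stratum), household_id (cluster), wt (weight)

des <- svydesign(
  ids     = ~household_id,   # primary sampling units (clusters)
  strata  = ~region,         # stratification variable
  weights = ~wt,             # sampling/nonresponse-adjusted weights
  data    = pets,
  nest    = TRUE             # cluster ids are nested within strata
)

# Design-based mean and standard error of a hypothetical outcome
svymean(~daily_kcal, des, na.rm = TRUE)

# Design-based comparison across a hypothetical grouping variable
svyby(~daily_kcal, ~species, des, svymean, na.rm = TRUE)

The same design object can then be passed to design-aware regression or tabulation functions, so the clustering, stratification, and weighting enter every subsequent estimate rather than being handled ad hoc.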


2021 ◽  
Author(s):  
Aja Louise Murray ◽  
Anastasia Ushakova ◽  
Helen Wright ◽  
Tom Booth ◽  
Peter Lynn

Complex sampling designs involving features such as stratification, cluster sampling, and unequal selection probabilities are often used in large-scale longitudinal surveys to improve cost-effectiveness and ensure adequate sampling of small or under-represented groups. However, complex sampling designs create challenges when there is a need to account for non-random attrition, a near inevitability in social science longitudinal studies. In this article we discuss these challenges and demonstrate the application of weighting approaches that simultaneously account for non-random attrition and complex design in a large UK population-representative survey. Using an autoregressive latent trajectory model with structured residuals (ALT-SR) to model the relations between relationship satisfaction and mental health in the Understanding Society study as an example, we provide guidance on implementing this approach in both R and Mplus. Two standard error estimation approaches are illustrated: pseudo-maximum-likelihood robust estimation and bootstrap resampling. A comparison of unadjusted and design-adjusted results also highlights that ignoring the complex survey design when fitting structural equation models can result in misleading conclusions.
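
As a hedged illustration of the design-adjusted SEM workflow described above (not the authors' actual code), the R sketch below fits a lavaan model with sampling weights and pseudo-maximum-likelihood robust standard errors, and then re-estimates it under a bootstrap replicate-weight design, assuming the lavaan.survey package is available. The model syntax and variable names (sat1-sat3, ghq1-ghq3, psu, strata, wgt) are placeholders standing in for the ALT-SR specification and the Understanding Society variables.

# Hedged sketch: design-adjusted SEM in R (placeholder model, not the ALT-SR itself).
library(lavaan)
library(survey)
library(lavaan.survey)

dat <- read.csv("usoc_extract.csv")   # hypothetical extract with psu, strata, wgt columns

model <- '
  # placeholder measurement/structural syntax; the ALT-SR syntax would go here
  satisfaction =~ sat1 + sat2 + sat3
  distress     =~ ghq1 + ghq2 + ghq3
  distress ~ satisfaction
'

# (1) Pseudo-maximum-likelihood with robust (sandwich) standard errors,
#     using a combined design/attrition weight
fit_pml <- sem(model, data = dat,
               sampling.weights = "wgt",
               estimator = "MLR")

# (2) Re-estimate under the full complex design with bootstrap replicate weights
des  <- svydesign(ids = ~psu, strata = ~strata, weights = ~wgt,
                  data = dat, nest = TRUE)
rdes <- as.svrepdesign(des, type = "bootstrap", replicates = 500)
fit_naive <- sem(model, data = dat)                 # unweighted fit to be adjusted
fit_boot  <- lavaan.survey(lavaan.fit = fit_naive, survey.design = rdes)

summary(fit_pml)
summary(fit_boot)

Comparing fit_pml or fit_boot with the naive, unadjusted fit is the kind of contrast the article uses to show how ignoring the design can distort conclusions.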


1997 ◽  
Vol 54 (3) ◽  
pp. 616-630 ◽  
Author(s):  
S J Smith

Trawl surveys using stratified random designs are widely used on the east coast of North America to monitor groundfish populations. Statistical quantities estimated from these surveys are derived via a randomization basis and do not require that a probability model be postulated for the data. However, the large sample properties of these estimates may not be appropriate for the small sample sizes and skewed data characteristic of bottom trawl surveys. In this paper, three bootstrap resampling strategies that incorporate complex sampling designs are used to explore the properties of estimates for small sample situations. A new form for the bias-corrected and accelerated confidence intervals is introduced for stratified random surveys. Simulation results indicate that the bias-corrected and accelerated confidence limits may overcorrect for the trawl survey data and that percentile limits were closer to the expected values. Nonparametric density estimates were used to investigate the effects of unusually large catches of fish on the bootstrap estimates and confidence intervals. Bootstrap variance estimates decreased as increasingly smoother distributions were assumed for the observations in the stratum with the large catch. Lower confidence limits generally increased with increasing smoothness but the upper bound depended upon assumptions about the shape of the distribution.
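
As an illustrative sketch of the general approach (not Smith's specific estimators), the R code below bootstraps a stratified-mean abundance index by resampling tows within strata and compares percentile and BCa confidence intervals using the boot package. The data frame and column names (trawl_tows.csv, stratum, catch, area) are hypothetical.

# Hedged sketch: stratified bootstrap of a design-based mean with percentile and BCa intervals.
library(boot)

tows <- read.csv("trawl_tows.csv")   # hypothetical: one row per tow, with stratum, catch, area (stratum size)

# Stratified mean: within-stratum mean catches weighted by stratum area shares
strat_mean <- function(data, idx) {
  d <- data[idx, ]
  m <- tapply(d$catch, d$stratum, mean)
  w <- tapply(d$area,  d$stratum, function(a) a[1])
  sum(m * w / sum(w))
}

# strata = tows$stratum makes boot() resample tows within each stratum,
# mirroring the stratified random design
set.seed(1)
b <- boot(tows, statistic = strat_mean, R = 2000, strata = tows$stratum)

boot.ci(b, type = c("perc", "bca"))

With skewed catch data and a few very large tows, the two interval types can diverge noticeably, which is the behaviour the simulations in the paper examine.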


2020 ◽  
Vol 2020 (1) ◽  
pp. 1-20
Author(s):  
Lili Yao ◽  
Shelby Haberman ◽  
Daniel F. McCaffrey ◽  
J. R. Lockwood

2020 ◽  
Author(s):  
Anna-Carolina Haensch ◽  
Bernd Weiß

An increasing number of researchers pool, harmonize, and analyze survey data from different survey providers for their research questions. They aim to study heterogeneity between groups over a long period or examine smaller subgroups; research questions that can be impossible to answer with a single survey. This combination or pooling of data is known as individual person data (IPD) meta-analysis in medicine and psychology; in sociology, it is understood as part of ex-post survey harmonization (Granda et al. 2010). However, in medicine or psychology, most original studies focus on treatment or intervention effects and apply experimental research designs to come to causal conclusions. In contrast, many sociological or economic studies are nonexperimental. In comparison to experimental data, survey-based data is subject to complex sampling and nonresponse. Ignoring the complex sampling design can lead to biased population inferences not only in population means and shares but also in regression coefficients, widely used in the social sciences (DuMouchel and Duncan 1983; Solon et al. 2013). To account for complex sampling schemes or non-ignorable unit nonresponse, survey-based data often comes with survey weights. But how should survey weights be used after pooling different surveys? We build upon the work done by DuMouchel and Duncan (1983) and Solon et al. (2013) for survey-weighted regression analysis with a single data set. Through Monte Carlo (MC) simulations, we show that endogenous sampling and heterogeneity-of-effects models require survey weighting to obtain approximately unbiased estimates after ex-post survey harmonization. We then turn to a set of methodological questions: Do survey-weighted one-stage and two-stage (meta-)analytical approaches perform differently? Is it possible to include random effects, especially if we have to assume study heterogeneity? A particularly challenging question is the inclusion of random effects in a one-stage analysis. Our simulations show that two-stage analysis will be biased if the variation in the weights is high, whereas one-stage analysis remains unbiased. We also show that the inclusion of random effects in a one-stage analysis is challenging but feasible; in most cases, the weights must be transformed. Apart from the MC simulations, we also illustrate the difference between the two-stage and one-stage approaches with real-world data on same-sex couples in Germany.
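
To make the one-stage versus two-stage contrast concrete, the hedged R sketch below runs a survey-weighted regression on the pooled data (one-stage) and, separately, per-survey weighted regressions whose coefficients are combined with a simple inverse-variance average (two-stage). The pooled data frame and its columns (survey_id, psu, wt, y, x) are hypothetical, and the inverse-variance combination stands in for a full meta-analytic model.

# Hedged sketch: one-stage vs. two-stage survey-weighted regression after pooling surveys.
library(survey)

pooled <- read.csv("pooled_surveys.csv")   # hypothetical: survey_id, psu, wt, y, x

# One-stage: a single weighted regression on the pooled file
des_pool <- svydesign(ids = ~psu, weights = ~wt, data = pooled)
fit_one  <- svyglm(y ~ x, design = des_pool)
summary(fit_one)

# Two-stage: estimate within each survey, then combine with inverse-variance weights
per_survey <- lapply(split(pooled, pooled$survey_id), function(d) {
  des <- svydesign(ids = ~psu, weights = ~wt, data = d)
  fit <- svyglm(y ~ x, design = des)
  c(beta = unname(coef(fit)["x"]), var = unname(vcov(fit)["x", "x"]))
})
est <- do.call(rbind, per_survey)     # one row per survey: beta, var
w   <- 1 / est[, "var"]
c(two_stage_beta = sum(w * est[, "beta"]) / sum(w),
  two_stage_se   = sqrt(1 / sum(w)))

Comparing the one-stage coefficient with the combined two-stage estimate under different amounts of weight variation is the kind of contrast the Monte Carlo simulations in the abstract investigate.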

