A Bayesian framework for modelling the preferential selection process in respondent-driven sampling

2021 ◽  
pp. 1471082X2110439
Author(s):  
Katherine R. McLaughlin

In sampling designs that utilize peer recruitment, the sampling process is partially unknown and must be modelled to make inference about the population and estimate standard outcomes like prevalence. We develop a Bayesian model for the recruitment process for respondent-driven sampling (RDS), a network sampling methodology used worldwide to sample hidden populations that are not reachable by conventional sampling techniques, including those at high risk for HIV/AIDS. Current models for the RDS sampling process typically assume that recruitment occurs randomly given the population social network, but this is likely untrue in practice. To model preferential selection on covariates, we develop a sequential two-sided rational choice framework, which allows generative probabilistic network models to be created for the RDS sampling process. In the rational choice framework, members of the population make recruitment and participation choices based on observable nodal and dyadic covariates to maximize their utility given constraints. Inference is made about recruitment preferences given the observed recruitment chain in a Bayesian framework by incorporating the latent utilities and sampling from the joint posterior distribution via Markov chain Monte Carlo. We present simulation results and apply the model to an RDS study of Francophone migrants in Rabat, Morocco.
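To make the two-sided choice idea concrete, here is a minimal toy sketch, not the model from the article. It assumes a single dyadic covariate (whether recruiter and contact share a group), Gaussian noise on the latent utilities, and a rule that a recruitment happens only when both sides have positive utility. The function `recruit` and the parameters `beta_recruit`, `beta_accept`, and `coupons` are hypothetical names introduced for illustration.

```python
import random


def recruit(recruiter, contacts, group, beta_recruit, beta_accept, coupons, rng):
    """Toy two-sided choice: return the contacts recruited by one recruiter.

    All modelling choices here are assumptions for illustration only:
    one homophily covariate, Gaussian noise, and a zero utility threshold.
    """
    mutual = []
    for contact in contacts:
        same_group = 1.0 if group[contact] == group[recruiter] else 0.0
        # Recruiter's latent utility for passing a coupon to this contact.
        u_recruit = beta_recruit * same_group + rng.gauss(0.0, 1.0)
        # Contact's latent utility for accepting a coupon from this recruiter.
        u_accept = beta_accept * same_group + rng.gauss(0.0, 1.0)
        # A match is feasible only if both sides prefer it to no match.
        if u_recruit > 0 and u_accept > 0:
            mutual.append((u_recruit, contact))
    # The recruiter spends a limited number of coupons on the best feasible matches.
    mutual.sort(key=lambda pair: pair[0], reverse=True)
    return [contact for _, contact in mutual[:coupons]]


if __name__ == "__main__":
    rng = random.Random(0)
    group = {"a": "g1", "b": "g1", "c": "g1", "d": "g2", "e": "g2"}
    print(recruit("a", ["b", "c", "d", "e"], group, 1.5, 1.0, 2, rng))
```

In a full Bayesian treatment the utilities would be latent variables sampled alongside the preference coefficients; this sketch only simulates forward from fixed coefficients.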

2014 ◽  
Vol 2 (2) ◽  
pp. 298-301 ◽  
Author(s):  
JACOB C. FISHER ◽  
M. GIOVANNA MERLI

Respondent-driven sampling (RDS) is an increasingly popular chain-referral sampling method. Although it has proved effective at generating samples of hard-to-reach populations (meaning populations for which sampling frames are not available because they are hidden or socially stigmatized, like sex workers or injecting drug users) quickly and cost-effectively, the ease of collecting the sample comes with a cost: bias or inefficiency in the estimates of population parameters (Gile & Handcock, 2010; Goel & Salganik, 2010). One way that RDS can produce inefficient estimates is if one or more of the recruitment chains gets stuck among members of a cohesive subpopulation, preventing the RDS sampling process from exploring other areas of the network. If that happens, members of the population subgroup recruit one another repeatedly, leading to an increase in sample size without increasing the diversity of the sample. This type of stickiness is particularly likely when hidden populations are stratified, and the stratified groups are organized into venues that provide opportunities to recruit other members of the same stratum. Female sex workers (FSW) in China, who are stratified into tiers of sex work that are correlated with marital status, age, and risk behaviors, are a prime example (Merli et al., 2014; Yamanis et al., 2013). Chinese FSW recruit clients from venues such as karaoke bars, massage parlors, or street corners. At larger venues, sex workers who participate in an RDS study might recruit other members of the same venue into the study at a higher rate than expected, leading to inefficient estimates. In short, the chain could get stuck in a venue.


2015 ◽  
Vol 31 (4) ◽  
pp. 723-736 ◽  
Author(s):  
Marinus Spreen ◽  
Stefan Bogaerts

Link-tracing designs are often used to estimate the size of hidden populations by utilizing the relational links between their members. A major problem in studies of hidden populations is the lack of a convenient sampling frame. The most frequently applied design in studies of hidden populations is respondent-driven sampling in which no sampling frame is used. However, in some studies multiple but incomplete sampling frames are available. In this article, we introduce the B-graph design that can be used in such situations. In this design, all available incomplete sampling frames are joined and turned into one sampling frame, from which a random sample is drawn and selected respondents are asked to mention their contacts. By considering the population as a bipartite graph of a two-mode network (those from the sampling frame and those who are not on the frame), the number of respondents who are directly linked to the sampling frame members can be estimated using Chao’s and Zelterman’s estimators for sparse data. The B-graph sampling design is illustrated using the data of a social network study from Utrecht, the Netherlands.
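As a rough illustration of the two estimators named above, the sketch below uses their standard textbook forms: Chao's lower bound, n + f1^2 / (2 f2), and Zelterman's estimator, n / (1 - exp(-2 f2 / f1)), where n is the number of distinct individuals observed and f1 and f2 are the numbers observed exactly once and exactly twice. How the B-graph design maps nominations onto these counts is not spelled out in the abstract, so the helper names and the example nomination list are assumptions, not the authors' procedure.

```python
import math
from collections import Counter


def frequency_counts(nominations):
    """Return (n_observed, f1, f2) from a flat list of nominated identifiers."""
    times_named = Counter(nominations)          # identifier -> times nominated
    freq_of_freq = Counter(times_named.values())  # k -> how many were named k times
    return len(times_named), freq_of_freq[1], freq_of_freq[2]


def chao_estimate(n_observed, f1, f2):
    """Chao's lower-bound estimate of population size (requires f2 > 0)."""
    if f2 == 0:
        raise ValueError("Chao's estimator in this form needs f2 > 0")
    return n_observed + f1 ** 2 / (2 * f2)


def zelterman_estimate(n_observed, f1, f2):
    """Zelterman's estimate based on a truncated Poisson rate 2 * f2 / f1."""
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman's estimator needs f1 > 0 and f2 > 0")
    rate = 2 * f2 / f1
    return n_observed / (1 - math.exp(-rate))


if __name__ == "__main__":
    # Hypothetical nominations: each entry is one mention of an off-frame contact.
    nominations = ["a", "b", "b", "c", "d", "d", "e", "f"]
    n_observed, f1, f2 = frequency_counts(nominations)
    print(chao_estimate(n_observed, f1, f2))
    print(zelterman_estimate(n_observed, f1, f2))
```

Both estimators rely only on the low-frequency counts, which is why they are suited to sparse nomination data where most contacts are mentioned once or twice.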


10.2196/12034 ◽  
2019 ◽  
Vol 5 (1) ◽  
pp. e12034 ◽  
Author(s):  
Katherine R McLaughlin ◽  
Lisa G Johnston ◽  
Laura J Gamble ◽  
Trdat Grigoryan ◽  
Arshak Papoyan ◽  
...  
