selection probabilities
Recently Published Documents

TOTAL DOCUMENTS: 80 (five years: 17)
H-INDEX: 10 (five years: 3)

Author(s):  
ZhiDi Deng ◽  
Senyo Agbeyaka ◽  
Esme Fuller-Thomson

Purpose The purpose of this study was to investigate Black–White differences associated with hearing loss among older adults living in the United States. Method Secondary data analysis was conducted using the 2017 American Community Survey (ACS), with a replication analysis of the 2016 ACS. The ACS is an annual nationally representative survey of Americans living in community settings and institutions. The sample of older Americans (age 65+ years) in 2017 included 467,789 non-Hispanic Whites (NHWs) and 45,105 non-Hispanic Blacks (NHBs). In the 2016 ACS, there were 459,692 NHW and 45,990 NHB respondents. Measures of hearing loss, age, race/ethnicity, education level, and household income were based on self-report. Data were weighted to adjust for nonresponse and differential selection probabilities. Results The prevalence of hearing loss was markedly higher among older NHWs (15.4% in both surveys) than among NHBs (9.0% in 2017 and 9.4% in 2016; both ethnic differences p < .001). In the 2017 ACS, the age- and sex-adjusted odds of hearing loss were 69% higher for NHWs than for NHBs, rising to 91% higher odds when household income and education level were also taken into account (OR = 1.91; 95% confidence interval [CI: 1.85, 1.97]). Findings from the 2016 ACS were very similar (e.g., fully adjusted OR among those 65+ = 1.81; 95% CI [1.76, 1.87]). Conclusions NHWs have a much higher prevalence and almost double the odds of hearing loss compared with NHBs. Unfortunately, the ACS does not allow us to explore potential causal mechanisms behind this association.
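To make the weighting step concrete, here is a minimal sketch of a survey-weighted logistic regression of the kind described, using statsmodels. The column names and toy data are illustrative stand-ins, not the ACS variables or the authors' exact model, and design-based standard errors would require a dedicated survey package.

```python
# Survey-weighted logistic regression: odds of hearing loss for NHW vs. NHB,
# adjusted for age and sex (illustrative sketch; hypothetical data and names).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "hearing_loss": rng.binomial(1, 0.12, n),
    "nhw":          rng.binomial(1, 0.9, n),    # 1 = non-Hispanic White
    "age":          rng.uniform(65, 95, n),
    "female":       rng.binomial(1, 0.55, n),
    "pweight":      rng.uniform(50, 150, n),    # person-level survey weight
})

X = sm.add_constant(df[["nhw", "age", "female"]])
# freq_weights applies the survey weights to the likelihood; fully
# design-based (robust) standard errors need a survey-analysis package.
res = sm.GLM(df["hearing_loss"], X,
             family=sm.families.Binomial(),
             freq_weights=df["pweight"]).fit()

or_nhw = np.exp(res.params["nhw"])
ci_low, ci_high = np.exp(res.conf_int().loc["nhw"])
print(f"Adjusted OR (NHW vs. NHB): {or_nhw:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```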


2021 ◽  
Vol 50 (Supplement_1) ◽  
Author(s):  
Melissa Middleton ◽  
Margarita Moreno-Betancur ◽  
John Carlin ◽  
Katherine J Lee

Abstract Background Multiple imputation (MI) is commonly used to address missing data in epidemiological studies, but valid use requires compatibility between the imputation and analysis models. Case-cohort studies use unequal sampling probabilities for cases and controls, which are often accounted for in analyses through inverse probability weighting (IPW). It is unclear how to apply MI to missing covariates while achieving compatibility in this setting. Methods A simulation study was conducted with missingness in two covariates, motivated by a case-cohort investigation within the Barwon Infant Study. The MI methods considered included interactions between the outcome (as a proxy for the weights) and the analysis variables, stratification by the weights, and ignoring the weights, all within the context of an IPW analysis. Factors such as the target estimand, proportion of incomplete observations, missing data mechanism, and subcohort selection probability were varied to assess the performance of the MI methods. Results Performance in terms of bias and efficiency was similar across the MI methods, with the expected improvements over IPW applied to the complete cases. Precision tended to decrease as the subcohort selection probability decreased. Similar results were observed irrespective of the proportion of incomplete cases. Conclusions Our results suggest that it makes little difference how the weights are incorporated in the MI model in the analysis of case-cohort studies, potentially because there are only two weight classes in this setting. Key messages If and how the weights are incorporated in the imputation model may have little impact in the analysis of case-cohort studies with incomplete covariates.
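A compressed sketch of one of the strategies described (including the outcome in the imputation model as a proxy for the weights, then an IPW analysis pooled with Rubin's rules) follows. The variable names, weights, and models are assumptions for illustration, not the study's simulation design.

```python
# Sketch: multiple imputation of a missing covariate in a case-cohort analysis,
# followed by inverse-probability-weighted (IPW) estimation and Rubin's rules.
# Illustrative only: variable names, weights, and models are assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "case": rng.binomial(1, 0.1, n),       # outcome (all cases sampled)
    "x1":   rng.normal(size=n),            # covariate with missing values
    "x2":   rng.normal(size=n),            # fully observed covariate
})
df.loc[rng.random(n) < 0.3, "x1"] = np.nan  # 30% incomplete
# Case-cohort weights: cases weight 1, subcohort controls weight 1/p_subcohort.
df["w"] = np.where(df["case"] == 1, 1.0, 1.0 / 0.15)

m = 20  # number of imputations
ests, variances = [], []
for i in range(m):
    # Include the outcome in the imputation model (proxy for the weights).
    imp = IterativeImputer(sample_posterior=True, random_state=i)
    cols = ["case", "x1", "x2"]
    completed = pd.DataFrame(imp.fit_transform(df[cols]), columns=cols)
    X = sm.add_constant(completed[["x1", "x2"]])
    fit = sm.GLM(completed["case"], X,
                 family=sm.families.Binomial(),
                 freq_weights=df["w"]).fit()
    ests.append(fit.params["x1"])
    variances.append(fit.bse["x1"] ** 2)

# Rubin's rules: pooled point estimate and total variance.
qbar = np.mean(ests)
total_var = np.mean(variances) + (1 + 1 / m) * np.var(ests, ddof=1)
print(f"pooled log-OR for x1: {qbar:.3f} (SE {np.sqrt(total_var):.3f})")
```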


Nature ◽  
2021 ◽  
Author(s):  
Bailey Flanigan ◽  
Paul Gölz ◽  
Anupam Gupta ◽  
Brett Hennig ◽  
Ariel D. Procaccia

Abstract Globally, there has been a recent surge in ‘citizens’ assemblies’ [1], which are a form of civic participation in which a panel of randomly selected constituents contributes to questions of policy. The random process for selecting this panel should satisfy two properties. First, it must produce a panel that is representative of the population. Second, in the spirit of democratic equality, individuals would ideally be selected to serve on this panel with equal probability [2,3]. However, in practice these desiderata are in tension owing to differential participation rates across subpopulations [4,5]. Here we apply ideas from fair division to develop selection algorithms that satisfy the two desiderata simultaneously to the greatest possible extent: our selection algorithms choose representative panels while selecting individuals with probabilities as close to equal as mathematically possible, for many metrics of ‘closeness to equality’. Our implementation of one such algorithm has already been used to select more than 40 citizens’ assemblies around the world. As we demonstrate using data from ten citizens’ assemblies, adopting our algorithm over a benchmark representing the previous state of the art leads to substantially fairer selection probabilities. By contributing a fairer, more principled and deployable algorithm, our work puts the practice of sortition on firmer foundations. Moreover, our work establishes citizens’ assemblies as a domain in which insights from the field of fair division can lead to high-impact applications.
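The maximin flavour of this idea can be illustrated on a toy scale: enumerate every quota-satisfying panel for a tiny pool and solve a linear program that maximizes the minimum individual selection probability. This is a sketch of the fair-division principle only, not the authors' published or deployed algorithm, and the pool, quota, and panel size below are invented.

```python
# Toy maximin panel lottery: pick a distribution over quota-satisfying panels
# that maximizes the minimum selection probability across individuals.
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

# Hypothetical pool: 6 people; quota of exactly 2 group-A members on a panel of 3.
groups = ["A", "A", "A", "A", "B", "B"]
k, quota_a = 3, 2
people = range(len(groups))

panels = [c for c in combinations(people, k)
          if sum(groups[i] == "A" for i in c) == quota_a]
J = len(panels)

# Variables: panel probabilities p_1..p_J, then t (the minimum probability).
c = np.zeros(J + 1); c[-1] = -1.0                 # linprog minimizes, so maximize t
A_ub = np.zeros((len(groups), J + 1))
for i in people:
    for j, panel in enumerate(panels):
        if i in panel:
            A_ub[i, j] = -1.0                     # t - sum_{panels covering i} p_j <= 0
    A_ub[i, -1] = 1.0
b_ub = np.zeros(len(groups))
A_eq = np.ones((1, J + 1)); A_eq[0, -1] = 0.0     # panel probabilities sum to 1
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, 1)] * (J + 1))

probs = res.x[:J]
sel = [sum(p for p, panel in zip(probs, panels) if i in panel) for i in people]
print("minimum selection probability:", round(res.x[-1], 3))
print("per-person selection probabilities:", np.round(sel, 3))
```

With a symmetric pool like this one, the LP equalizes everyone at 0.5; with realistic, skewed volunteer pools the interesting output is how close to equal the probabilities can be pushed while every panel still meets its quotas.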


2021 ◽  
Author(s):  
Aja Louise Murray ◽  
Anastasia Ushakova ◽  
Helen Wright ◽  
Tom Booth ◽  
Peter Lynn

Complex sampling designs involving features such as stratification, cluster sampling, and unequal selection probabilities are often used in large-scale longitudinal surveys to improve cost-effectiveness and ensure adequate sampling of small or under-represented groups. However, complex sampling designs create challenges when there is a need to account for non-random attrition, a near-inevitability in social science longitudinal studies. In this article we discuss these challenges and demonstrate the application of weighting approaches that simultaneously account for non-random attrition and complex design in a large UK population-representative survey. Using an auto-regressive latent trajectory model with structured residuals (ALT-SR) to model the relations between relationship satisfaction and mental health in the Understanding Society study as an example, we provide guidance on implementing this approach in both R and Mplus. Two standard-error estimation approaches are illustrated: pseudo-maximum-likelihood robust estimation and bootstrap resampling. A comparison of unadjusted and design-adjusted results also highlights that ignoring the complex survey design when fitting structural equation models can lead to misleading conclusions.
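The core weighting move can be sketched compactly: model the probability of remaining in the study from observed earlier-wave data, then multiply the baseline design weight by the inverse of that probability. The variable names below are hypothetical, and real panel weights (including those supplied with Understanding Society) involve many more adjustment steps.

```python
# Sketch: combine a baseline design weight with an inverse-probability-of-
# attrition weight before fitting a longitudinal model (hypothetical names).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({
    "design_w":  rng.uniform(0.5, 2.0, n),   # baseline design weight
    "satisf_w1": rng.normal(size=n),         # wave-1 relationship satisfaction
    "mh_w1":     rng.normal(size=n),         # wave-1 mental health score
})
# Attrition depends on wave-1 characteristics (non-random drop-out).
p_stay = 1 / (1 + np.exp(-(0.5 + 0.4 * df["mh_w1"])))
df["responded_w2"] = rng.binomial(1, p_stay)

# Model the probability of remaining in the study from observed wave-1 data.
X = sm.add_constant(df[["satisf_w1", "mh_w1"]])
att = sm.GLM(df["responded_w2"], X, family=sm.families.Binomial()).fit()
df["p_response"] = att.predict(X)

# Final analysis weight: design weight x inverse response probability,
# applied only to wave-2 respondents.
resp = df[df["responded_w2"] == 1].copy()
resp["analysis_w"] = resp["design_w"] / resp["p_response"]
print(resp["analysis_w"].describe())
```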


2020 ◽  
Vol 24 (21) ◽  
pp. 15937-15949
Author(s):  
Giorgio Gnecco ◽  
Federico Nutarelli ◽  
Daniela Selvi

Abstract This paper focuses on the unbalanced fixed effects panel data model, a linear regression model able to represent unobserved heterogeneity in the data by allowing distinct observational units to have different numbers of associated observations. We specifically address the case in which the model includes the additional possibility of controlling the conditional variance of the output given the input, as well as the selection probabilities of the different units per unit time, by varying the cost associated with the supervision of each training example. Assuming an upper bound on the expected total supervision cost and fixing the expected number of observed units per instant, we analyze and optimize the trade-off between sample size, precision of supervision (the reciprocal of the conditional variance of the output), and selection probabilities by formulating and solving a suitable optimization problem. The formulation is based on a large-sample upper bound on the generalization error associated with the estimates of the parameters of the unbalanced fixed effects panel data model, conditioned on the training input dataset. We prove that, under appropriate assumptions, in some cases “many but bad” examples provide a smaller large-sample upper bound on the conditional generalization error than “few but good” ones, whereas in other cases the opposite occurs. We conclude by discussing possible applications of the presented results and extensions of the proposed optimization framework to other panel data models.
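The qualitative trade-off admits a toy numerical illustration: under a fixed budget, compare noisy-but-plentiful examples with precise-but-scarce ones via the large-sample variance proxy σ²/n. The power-law supervision cost c(σ) = κ·σ^(−α) below is an invented stand-in, not the cost model analysed in the paper.

```python
# Toy "many but bad" vs. "few but good" comparison under a supervision budget C.
# Error proxy: sigma^2 / n, with n = C / c(sigma) examples affordable at noise
# level sigma. Cost model c(sigma) = kappa * sigma**(-alpha) is an assumption.
import numpy as np

C, kappa = 1000.0, 1.0

def error_proxy(sigma, alpha):
    """Variance proxy sigma^2 / n when the budget buys n = C / c(sigma) examples."""
    cost_per_example = kappa * sigma ** (-alpha)
    n = C / cost_per_example
    return sigma ** 2 / n          # = kappa * sigma**(2 - alpha) / C

sigmas = np.linspace(0.2, 3.0, 8)
for alpha in (1.0, 3.0):
    errs = [error_proxy(s, alpha) for s in sigmas]
    best = sigmas[int(np.argmin(errs))]
    kind = "many but bad (large sigma)" if best > 1.5 else "few but good (small sigma)"
    print(f"alpha={alpha}: best sigma ~ {best:.2f} -> {kind}")
```

Here the regime flips at α = 2: cheap-to-degrade supervision (α > 2) favours many noisy examples, while expensive-to-degrade supervision (α < 2) favours few precise ones, mirroring the paper's finding that either case can occur.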


Author(s):  
Tom Brughmans ◽  
Alessandra Pecci

Amphora reuse is an inconvenient truth: the topic has received little attention in Roman studies, even though it certainly happened, and potentially on a huge scale. What was the effect of amphora reuse in the past? What data and methods can archaeologists use to evaluate it? How does the phenomenon affect our ability to interpret Roman amphora distributions as proxy evidence for the distribution of foodstuffs? In this chapter we summarize the theories and evidence used in the study of amphora reuse to identify the significant challenges involved in tackling this inconvenient truth. We argue for the need to do this through computational simulation modelling combined with residue analysis. As a proof-of-concept, we illustrate our approach through a simple abstract model simulating selected theories of amphora reuse: the differential effects on amphora distribution patterns of different probabilities of reuse at prime-use locales, and of reuse selection probabilities at port sites.
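In the same spirit as the simple abstract model described, one can sketch a minimal Monte Carlo simulation in which each amphora is either deposited at its prime-use locale or reused and re-shipped through a port. The sites, probabilities, and single reuse step below are invented for illustration and are not the chapter's model.

```python
# Minimal abstract simulation of amphora reuse: each amphora is deposited at
# its prime-use locale, or reused and re-shipped via a port to another locale.
# Sites, probabilities, and the single reuse step are illustrative assumptions.
from collections import Counter
import random

random.seed(42)
locales = ["siteA", "siteB", "siteC"]
p_reuse = 0.4           # probability of reuse at the prime-use locale
p_port_selection = 0.6  # probability a reused amphora is selected for re-shipment

deposits = Counter()
for _ in range(10_000):
    place = random.choice(locales)                 # prime-use locale
    if random.random() < p_reuse and random.random() < p_port_selection:
        # Reused and re-shipped: final deposition at a different locale.
        place = random.choice([s for s in locales if s != place])
    deposits[place] += 1

# Even with symmetric sites the aggregate counts stay flat, yet roughly
# p_reuse * p_port_selection of amphorae no longer mark their prime-use spot,
# weakening distributions as proxy evidence for foodstuff movement.
print(deposits)
```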


2020 ◽  
Author(s):  
Heather Bradley ◽  
Mansour Fahimi ◽  
Travis Sanchez ◽  
Ben Lopman ◽  
Martin Frankel ◽  
...  

Many months into the SARS-CoV-2 pandemic, basic epidemiologic parameters describing the burden of disease are lacking. To reduce the selection bias in current burden-of-disease estimates derived from diagnostic testing data or from serologic testing in convenience samples, we are conducting a national probability-based SARS-CoV-2 serosurvey. Sampling from a national address-based frame and using mailed recruitment materials and test kits will allow us to estimate the national prevalence of SARS-CoV-2 infection and antibodies, overall and by demographic, behavioral, and clinical characteristics. Data will be weighted for unequal selection probabilities and non-response, and will be adjusted to population benchmarks. Because of the urgent need for these estimates, expedited interim weighting of serosurvey responses will be undertaken to produce early-release estimates, which will be published on the study website, COVIDVu.org. Here, we describe a process for computing interim survey weights and guidelines for the release of interim estimates.
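A compact sketch of this kind of weighting pipeline follows: base weights from address selection probabilities, then raking (iterative proportional fitting) to population benchmarks. The benchmark margins, categories, and column names are invented for illustration, not the study's actual adjustment cells.

```python
# Sketch of an interim survey-weighting pipeline: base weights from selection
# probabilities, then raking to population benchmarks (hypothetical margins).
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 1000
df = pd.DataFrame({
    "p_select": rng.uniform(0.001, 0.01, n),            # address selection probability
    "age_grp":  rng.choice(["18-44", "45-64", "65+"], n, p=[0.5, 0.3, 0.2]),
    "sex":      rng.choice(["F", "M"], n),
})
df["w"] = 1.0 / df["p_select"]                          # base design weight

# Population benchmark proportions to rake toward (invented values).
targets = {
    "age_grp": {"18-44": 0.45, "45-64": 0.32, "65+": 0.23},
    "sex":     {"F": 0.51, "M": 0.49},
}

for _ in range(25):                                     # iterate until margins match
    for var, margin in targets.items():
        current = df.groupby(var)["w"].sum() / df["w"].sum()
        factors = {cat: margin[cat] / current[cat] for cat in margin}
        df["w"] *= df[var].map(factors)

for var in targets:
    print(var, (df.groupby(var)["w"].sum() / df["w"].sum()).round(3).to_dict())
```

A nonresponse adjustment (as in the attrition-weighting example earlier on this page) would typically be applied to the base weights before raking.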


2020 ◽  
Vol 36 (16) ◽  
pp. 4510-4512
Author(s):  
Giulio Isacchini ◽  
Carlos Olivares ◽  
Armita Nourmohammad ◽  
Aleksandra M Walczak ◽  
Thierry Mora

Abstract Summary Recent advances in modelling VDJ recombination and the subsequent selection of T- and B-cell receptors provide useful tools for analysing and comparing immune repertoires across time, individuals, and tissues. A suite of tools—IGoR, OLGA and SONIA—has been publicly released to the community, allowing the inference of generative and selection models from high-throughput sequencing data. However, using these tools requires some scripting or command-line skills and familiarity with complex datasets. As a result, the above models have not been accessible to a broad audience. In this application note, we fill this gap by presenting Simple OLGA & SONIA (SOS), a web-based interface where users with no coding skills can compute the generation and post-selection probabilities of their sequences, as well as generate batches of synthetic sequences. The application also functions on mobile phones. Availability and implementation SOS is freely available at sites.google.com/view/statbiophysens/sos, with source code at github.com/statbiophys/sos.
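For users who do want to script, the underlying tools can be driven directly. The sketch below assumes OLGA is installed (pip install olga) and exposes the olga-compute_pgen entry point with a --humanTRB model flag, as in OLGA's documentation; confirm with `olga-compute_pgen --help` before relying on it.

```python
# Sketch: computing a generation probability by calling OLGA's command-line
# tool from Python, as an alternative to the SOS web interface.
# ASSUMPTIONS: the olga-compute_pgen entry point and --humanTRB flag exist
# as documented for the OLGA package; the CDR3 sequence is just an example.
import subprocess

cdr3 = "CASSLGTDTQYF"  # example CDR3 amino-acid sequence
result = subprocess.run(
    ["olga-compute_pgen", "--humanTRB", cdr3],
    capture_output=True, text=True, check=True,
)
print(result.stdout)   # OLGA prints the Pgen of the sequence
```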

