Designing experiments informed by observational studies

Abstract The increasing availability of passively observed data has yielded a growing interest in “data fusion” methods, which involve merging data from observational and experimental sources to draw causal conclusions. Such methods often require a precarious tradeoff between the unknown bias in the observational dataset and the often-large variance in the experimental dataset. We propose an alternative approach, which avoids this tradeoff: rather than using observational data for inference, we use it to design a more efficient experiment. We consider the case of a stratified experiment with a binary outcome and suppose pilot estimates for the stratum potential outcome variances can be obtained from the observational study. We extend existing results to generate confidence sets for these variances, while accounting for the possibility of unmeasured confounding. Then, we pose the experimental design problem as a regret minimization problem subject to the constraints imposed by our confidence sets. We show that this problem can be converted into a concave maximization and solved using conventional methods. Finally, we demonstrate the practical utility of our methods using data from the Women’s Health Initiative.
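As a point of reference for the design problem this abstract describes, the sketch below shows the classical Neyman allocation for a stratified experiment with a binary outcome. It is not the regret-minimization procedure the authors propose: it treats the pilot estimates as exact and ignores unmeasured confounding. The stratum weights and pilot outcome rates are hypothetical values chosen for illustration.

```python
import math

def binary_sd(p):
    """Standard deviation of a Bernoulli outcome with success rate p."""
    return math.sqrt(p * (1.0 - p))

def neyman_allocation(weights, p_treat, p_control, n_total):
    """Split n_total units across strata and arms in proportion to
    (stratum weight) x (pilot outcome standard deviation).
    Returns [(n_treated, n_control), ...] with fractional units."""
    scores = [(w * binary_sd(pt), w * binary_sd(pc))
              for w, pt, pc in zip(weights, p_treat, p_control)]
    total = sum(t + c for t, c in scores)
    if total == 0:
        raise ValueError("all pilot variances are zero; allocation is undefined")
    return [(n_total * t / total, n_total * c / total) for t, c in scores]

def estimator_variance(weights, p_treat, p_control, allocation):
    """Variance of the weighted difference-in-means estimator."""
    return sum(
        w ** 2 * (pt * (1 - pt) / nt + pc * (1 - pc) / nc)
        for w, pt, pc, (nt, nc) in zip(weights, p_treat, p_control, allocation)
    )

# Hypothetical pilot estimates (assumed, not taken from the paper).
weights = [0.5, 0.5]        # population share of each stratum
p_treat = [0.5, 0.1]        # pilot outcome rate under treatment
p_control = [0.5, 0.1]      # pilot outcome rate under control

neyman = neyman_allocation(weights, p_treat, p_control, n_total=1000)
equal = [(250.0, 250.0), (250.0, 250.0)]

print("Neyman allocation:", neyman)
print("Variance (Neyman):", estimator_variance(weights, p_treat, p_control, neyman))
print("Variance (equal): ", estimator_variance(weights, p_treat, p_control, equal))
```

The high-variance stratum receives more units, which is the intuition behind using observational pilot estimates for design. If those estimates are biased, the allocation can be poor, which is the risk the confidence-set approach in the abstract is meant to address.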


PROPENSITY SCORE MATCHING AS A MODERN STATISTICAL METHOD FOR BIAS CONTROL IN OBSERVATIONAL STUDIES WITH BINARY OUTCOME

Human Ecology ◽

10.33396/1728-0869-2016-5-50-64 ◽

2016 ◽

pp. 50-64 ◽

Cited By ~ 2

Author(s):

A. M. Grjibovski ◽

S. V. Ivanov ◽

M. A. Gorbatova ◽

A. A. Dyussupov

Keyword(s):

Propensity Score ◽

Statistical Method ◽

Propensity Score Matching ◽

Observational Studies ◽

Binary Outcome


Characterization and selection of Japanese electronic health record databases used as data sources for non-interventional observational studies

10.21203/rs.3.rs-184585/v1 ◽

2021 ◽

Author(s):

Yumi Wakabayashi ◽

Masamitsu Eitoku ◽

Narufumi Suganuma

Keyword(s):

Electronic Health Record ◽

Observational Studies ◽

Large Scale ◽

Data Sources ◽

Flow Diagram ◽

Health Record ◽

Medical Institutions ◽

Data Source ◽

Electronic Health ◽

Using Data

Abstract Background Interventional studies are the fundamental method for obtaining answers to clinical questions. However, these studies are sometimes difficult to conduct because of insufficient financial or human resources or the rarity of the disease in question. One means of addressing these issues is to conduct a non-interventional observational study using electronic health record (EHR) databases as the data source, although how best to evaluate the suitability of an EHR database when planning a study remains to be clarified. The aim of the present study is to identify and characterize the data sources that have been used for conducting non-interventional observational studies in Japan and propose a flow diagram to help researchers determine the most appropriate EHR database for their study goals. Methods We compiled a list of published articles reporting observational studies conducted in Japan by searching PubMed for relevant articles published in the last 3 years and by searching database providers’ publication lists related to studies using their databases. For each article, we reviewed the abstract and/or full text to obtain information about data source, target disease or therapeutic area, number of patients, and study design (prospective or retrospective). We then characterized the identified EHR databases. Results In Japan, non-interventional observational studies have been mostly conducted using data stored locally at individual medical institutions (713/1463) or collected from several collaborating medical institutions (351/1463). Whereas the studies conducted with large-scale integrated databases (195/1463) were mostly retrospective (68.2%), 27.2% of the single-center studies, 46.2% of the multi-center studies, and 74.4% of the post-marketing surveillance studies, identified in the present study, were conducted prospectively.
Conclusions Our analysis revealed that non-interventional observational studies in Japan were mostly conducted using data stored locally at individual medical institutions or collected from collaborating medical institutions. Disease registries, disease databases, and large-scale databases would enable researchers to conduct studies with large sample sizes to provide robust data from which strong inferences could be drawn. Using our flow diagram, researchers planning non-interventional observational studies should consider the strengths and limitations of each available database and choose the most appropriate one for their study goals. Trial registration Not applicable.


Quantifying previous SARS-CoV-2 infection through mixture modelling of antibody levels

Nature Communications ◽

10.1038/s41467-021-26452-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

C. Bottomley ◽

M. Otiende ◽

S. Uyoga ◽

K. Gallagher ◽

E. W. Kagucia ◽

...

Keyword(s):

Threshold Analysis ◽

Vaccination Strategies ◽

Mixture Modelling ◽

External Data ◽

Alternative Approach ◽

Future Burden ◽

Previous Infection ◽

Using Data ◽

Antibody Levels ◽

Crude Estimate

Abstract As countries decide on vaccination strategies and how to ease movement restrictions, estimating the proportion of the population previously infected with SARS-CoV-2 is important for predicting the future burden of COVID-19. This proportion is usually estimated from serosurvey data in two steps: first the proportion above a threshold antibody level is calculated, then the crude estimate is adjusted using external estimates of sensitivity and specificity. A drawback of this approach is that the PCR-confirmed cases used to estimate the sensitivity of the threshold may not be representative of cases in the wider population (e.g., they may be more recently infected and more severely symptomatic). Mixture modelling offers an alternative approach that does not require external data from PCR-confirmed cases. Here we illustrate the bias in the standard threshold-based approach by comparing both approaches using data from several Kenyan serosurveys. We show that the mixture model analysis produces estimates of previous infection that are often substantially higher than the standard threshold analysis.
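The two-step threshold approach described above can be illustrated with the Rogan-Gladen estimator, one common way of adjusting a crude seropositive proportion for sensitivity and specificity. The abstract does not name the exact formula the authors compared against, so this is a sketch under that assumption, and all numbers below are hypothetical.

```python
def rogan_gladen(crude, sensitivity, specificity):
    """Adjust a crude proportion above the antibody threshold for
    imperfect test sensitivity and specificity.
    All three arguments are proportions in [0, 1]."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    adjusted = (crude + specificity - 1.0) / denom
    # The raw formula can fall outside [0, 1]; clip to a valid proportion.
    return min(1.0, max(0.0, adjusted))

# Hypothetical inputs: 10% of samples are above the threshold.
crude = 0.10
specificity = 0.98

# Sensitivity estimated from PCR-confirmed cases (assumed value).
print(rogan_gladen(crude, sensitivity=0.80, specificity=specificity))

# If the true sensitivity in the wider population were lower (assumed value),
# the same crude proportion would imply more previous infection.
print(rogan_gladen(crude, sensitivity=0.60, specificity=specificity))
```

The comparison shows why an overstated sensitivity leads the threshold approach to underestimate previous infection, which is the direction of bias the abstract reports.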


A Mixed-Methods Approach to Synthesizing Evidence on Mediators of Intervention Effects

Western Journal of Nursing Research ◽

10.1177/0193945911402365 ◽

2011 ◽

Vol 33 (7) ◽

pp. 870-900 ◽

Cited By ~ 5

Author(s):

Jennifer Leeman ◽

YunKyung Chang ◽

Corrine I. Voils ◽

Jamie L. Crandell ◽

Margarete Sandelowski

Keyword(s):

Mixed Methods ◽

Systematic Reviews ◽

Behavioral Change ◽

Observational Studies ◽

Intervention Studies ◽

Meta Analysis ◽

Intervention Effects ◽

Mediation Effects ◽

Mixed Methods Approach ◽

Using Data

Greater understanding of the mechanisms (mediators) by which behavioral-change interventions work is critical to developing theory and refining interventions. Although systematic reviews have been advocated as a method for exploring mediators, this is rarely done. One challenge is that intervention researchers typically test only two paths of the mediational model: the effect of the intervention on mediators and on outcomes. The authors addressed this challenge by drawing information not only from intervention studies but also from observational studies that provide data on associations between potential mediators and outcomes. They also reviewed qualitative studies of participants’ perceptions of why and how interventions worked. Using data from intervention (n = 37) and quantitative observational studies (n = 55), the authors conducted a meta-analysis of the mediation effects of eight variables. Qualitative findings (n = 6) contributed to more in-depth explanations for findings. The methods used have potential to contribute to understanding of core mechanisms of behavioral-change interventions.


Propensity scores: From naïve enthusiasm to intuitive understanding

Statistical Methods in Medical Research ◽

10.1177/0962280210394483 ◽

2011 ◽

Vol 21 (3) ◽

pp. 273-293 ◽

Cited By ~ 107

Author(s):

Elizabeth Williamson ◽

Ruth Morley ◽

Alan Lucas ◽

James Carpenter

Keyword(s):

Propensity Score ◽

Regression Models ◽

Propensity Scores ◽

Causal Effect ◽

Future Research ◽

Propensity Score Methods ◽

Advantages And Disadvantages ◽

Alternative Approach ◽

Using Data ◽

Unmeasured Confounders

Estimation of the effect of a binary exposure on an outcome in the presence of confounding is often carried out via outcome regression modelling. An alternative approach is to use propensity score methodology. The propensity score is the conditional probability of receiving the exposure given the observed covariates and can be used, under the assumption of no unmeasured confounders, to estimate the causal effect of the exposure. In this article, we provide a non-technical and intuitive discussion of propensity score methodology, motivating the use of the propensity score approach by analogy with randomised studies, and describe the four main ways in which this methodology can be implemented. We carefully describe the population parameters being estimated — an issue that is frequently overlooked in the medical literature. We illustrate these four methods using data from a study investigating the association between maternal choice to provide breast milk and the infant's subsequent neurodevelopment. We outline useful extensions of propensity score methodology and discuss directions for future research. Propensity score methods remain controversial and there is no consensus as to when, if ever, they should be used in place of traditional outcome regression models. We therefore end with a discussion of the relative advantages and disadvantages of each.
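To make the definition of the propensity score concrete, the sketch below estimates it as the proportion exposed within each level of a single discrete covariate and then applies inverse probability weighting, one widely used propensity score method. The toy data are invented for illustration and are not from the breast milk study; the approach assumes no unmeasured confounders and that every covariate level contains both exposed and unexposed subjects.

```python
from collections import defaultdict

def propensity_by_stratum(records):
    """Estimate P(exposed | covariate) as the exposed share in each
    covariate level. Records are (covariate, exposed, outcome) tuples."""
    exposed = defaultdict(int)
    total = defaultdict(int)
    for x, a, _y in records:
        total[x] += 1
        exposed[x] += a
    return {x: exposed[x] / total[x] for x in total}

def naive_difference(records):
    """Unadjusted difference in mean outcome, exposed minus unexposed."""
    y1 = [y for _x, a, y in records if a == 1]
    y0 = [y for _x, a, y in records if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def ipw_effect(records):
    """Inverse-probability-weighted estimate of the average effect."""
    ps = propensity_by_stratum(records)
    n = len(records)
    mean_exposed = sum(a * y / ps[x] for x, a, y in records) / n
    mean_unexposed = sum((1 - a) * y / (1 - ps[x]) for x, a, y in records) / n
    return mean_exposed - mean_unexposed

# Toy data (hypothetical): the covariate drives both exposure and outcome,
# but exposure has no effect within either covariate level.
records = [
    (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 0, 1),
]

print("Propensity scores:", propensity_by_stratum(records))
print("Naive difference: ", naive_difference(records))
print("IPW estimate:     ", ipw_effect(records))
```

In this constructed example the naive comparison suggests an effect that is entirely due to confounding, while the weighted estimate removes it. With continuous or many covariates, the propensity score would normally be estimated with a regression model instead of stratum proportions.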


Minireview: Effects of Different HT Formulations on Cognition

Endocrinology ◽

10.1210/en.2012-1175 ◽

2012 ◽

Vol 153 (8) ◽

pp. 3564-3570 ◽

Cited By ~ 36

Author(s):

Pauline M. Maki

Keyword(s):

Women's Health ◽

Postmenopausal Women ◽

Verbal Memory ◽

Observational Studies ◽

Negative Impact ◽

Women's Health Initiative ◽

Health Initiative ◽

The Women's Health Initiative

Evidence from preclinical studies, randomized clinical trials (RCT), and observational studies underscores the importance of distinguishing among the different forms of estrogen and progestogens when evaluating the cognitive effects of hormone therapy (HT) in women. Despite this evidence, there is a lack of direct comparisons of different HT regimens. To provide insights into the effects of different HT formulations on cognition, this minireview focuses on RCT of verbal memory because evidence indicates that HT affects this cognitive domain more than others and because declines in verbal memory predict later development of Alzheimer's disease. Some observational studies indicate that estradiol confers benefits to verbal memory, whereas conjugated equine estrogens (CEE) confer risks. RCT to date show no negative impact of CEE on verbal memory, including the Women's Health Initiative Study of Cognitive Aging. Similarly, the Women's Health Initiative Memory Study showed no negative impact of CEE on dementia. Transdermal estradiol in younger postmenopausal women improved verbal memory in one small RCT but had no effect in another RCT. RCT of oral estradiol in younger and older postmenopausal women had neutral effects on cognitive function. In contrast, RCT show a negative impact of CEE plus medroxyprogesterone acetate on verbal memory in younger and older postmenopausal women. Small RCT show neutral or beneficial effects of other progestins on memory. Overall, RCT indicate that type of progestogen is a more important determinant of the effects of HT on memory than type of estrogen.


A new approach for determining rice critical nitrogen concentration

The Journal of Agricultural Science ◽

10.1017/s0021859611000177 ◽

2011 ◽

Vol 149 (5) ◽

pp. 633-638 ◽

Cited By ~ 13

Author(s):

R. CONFALONIERI ◽

C. DEBELLINI ◽

M. PIRONDINI ◽

P. POSSENTI ◽

L. BERGAMINI ◽

...

Keyword(s):

Nutritional Status ◽

Cropping Systems ◽

Nitrogen Concentration ◽

Simulation Models ◽

Crop Models ◽

N Concentration ◽

Alternative Approach ◽

Using Data ◽

The University ◽

Critical N Concentration

SUMMARY A reliable evaluation of crop nutritional status is crucial for supporting fertilization aiming at maximizing qualitative and quantitative aspects of production and reducing the environmental impact of cropping systems. Most of the available simulation models evaluate crop nutritional status according to the nitrogen (N) dilution law, which derives critical N concentration as a function of above-ground biomass. An alternative approach, developed during a project carried out with students of the Cropping Systems Masters course at the University of Milan, was tested and compared with existing models (N dilution law and approaches implemented in EPIC and DAISY models). The new model (MAZINGA) reproduces the effect of leaf self-shading in lowering plant N concentration (PNC) through an inverse of the fraction of radiation intercepted by the canopy. The models were tested using data collected in four rice (Oryza sativa L.) experiments carried out in Northern Italy under potential and N-limited conditions. MAZINGA was the most accurate in identifying the critical N concentration, and therefore in discriminating PNC of plants growing under N-limited and non-limited conditions, respectively. In addition, the present work proved the effectiveness of crop models when used as tools for supporting education.
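For readers unfamiliar with the N dilution law mentioned above, it is commonly written as a power function of above-ground biomass, Nc = a * W^(-b), held constant below a minimum biomass. The sketch below uses that general form only; the coefficients `a`, `b` and the `w_min` cutoff are illustrative placeholders, not values from this paper, and the code does not reproduce the MAZINGA model.

```python
def critical_n_concentration(biomass, a, b, w_min=1.0):
    """Critical plant N concentration (%) from above-ground biomass (t/ha),
    using the power-law form Nc = a * W**(-b).
    Below w_min the curve is held at its value for w_min."""
    w = max(biomass, w_min)
    return a * w ** (-b)

def is_n_limited(pnc, biomass, a, b, w_min=1.0):
    """A plant is classed as N-limited when its measured N concentration
    (PNC) falls below the critical concentration for its biomass."""
    return pnc < critical_n_concentration(biomass, a, b, w_min)

# Placeholder coefficients (assumed for illustration only).
A, B = 5.0, 0.5

for biomass in (0.5, 1.0, 4.0, 9.0):
    print(biomass, critical_n_concentration(biomass, A, B))

# A hypothetical crop with 4 t/ha biomass and 2.0% plant N concentration.
print(is_n_limited(pnc=2.0, biomass=4.0, a=A, b=B))
```

The critical concentration falls as biomass accumulates, so the same measured PNC can indicate sufficiency late in the season and deficiency early on.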


Assessing potentially time-dependent treatment effect from clinical trials and observational studies for survival data, with applications to the Women's Health Initiative combined hormone therapy trial

Statistics in Medicine ◽

10.1002/sim.6453 ◽

2015 ◽

Vol 34 (11) ◽

pp. 1801-1817 ◽

Cited By ~ 7

Author(s):

Song Yang ◽

Ross L. Prentice

Keyword(s):

Clinical Trials ◽

Hormone Therapy ◽

Women's Health ◽

Treatment Effect ◽

Survival Data ◽

Observational Studies ◽

Time Dependent ◽

Health Initiative ◽

Therapy Trial ◽

The Women's Health Initiative


Do the rich pollute more? Mexican household consumption by income level and CO2 emissions

International Journal of Energy Sector Management ◽

10.1108/ijesm-07-2018-0016 ◽

2019 ◽

Vol 13 (3) ◽

pp. 694-712 ◽

Cited By ~ 3

Author(s):

Mónica Santillán Vera ◽

Angel de la Vega Navarro

Keyword(s):

Climate Change ◽

Household Consumption ◽

Mitigation Strategies ◽

Consumption Patterns ◽

Content Type ◽

Income Levels ◽

Alternative Approach ◽

Per Capita ◽

The Rich ◽

Using Data

Purpose The purpose of this paper is to quantitatively examine if varying household consumption activities at different income levels drove CO2 emissions to different degrees in Mexico from 1990 to 2014. Design/methodology/approach The paper applied a simple expenditure-CO2 emissions elasticity model – a top-down approach – using data from consumption-based CO2 emission inventories and the “Household Income and Expenditure Survey” and assuming a range of 0.7-1.0 elasticity values. Findings The paper results show a large carbon inequality among income groups in Mexico throughout the period. The household consumption patterns at the highest income levels are related to significantly more total CO2 emissions (direct + indirect) than the household consumption patterns at the lowest income levels, in absolute terms, per household and per capita. In 2014, for example, the poorest household decile emitted 1.6 tCO2 per capita on average, while the wealthiest decile reached 8.6 tCO2 per capita. Practical implications The results suggest that it is necessary to rethink the effect of consumption patterns on climate change and the allocation of mitigation responsibilities, thus opening up complementary options for designing mitigation strategies and policies. Originality/value The paper represents an alternative approach for studying CO2 emissions responsibility in Mexico from the demand side, which has been practically absent in previous studies. The paper thereby opens a way for studying and discussing climate change in terms of consumption and equity in the country.
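The abstract does not give the equations of its expenditure-CO2 elasticity model. One plausible reading of a top-down approach, shown here purely as an assumption, is to split a national consumption-based emissions total across income groups in proportion to expenditure raised to the elasticity. The function name, expenditure figures and total below are hypothetical and are not the paper's data.

```python
def allocate_emissions(total_emissions, expenditures, elasticity):
    """Split a national emissions total across income groups in proportion
    to expenditure ** elasticity (an assumed top-down allocation rule)."""
    weights = [e ** elasticity for e in expenditures]
    total_weight = sum(weights)
    return [total_emissions * w / total_weight for w in weights]

# Hypothetical mean expenditure of a low-income and a high-income group.
expenditures = [1.0, 4.0]
total_emissions = 100.0  # arbitrary units

# The abstract reports an assumed elasticity range of 0.7 to 1.0.
for elasticity in (0.7, 1.0):
    print(elasticity, allocate_emissions(total_emissions, expenditures, elasticity))
```

Under this rule an elasticity of 1.0 makes emissions exactly proportional to expenditure, while lower values narrow the gap between groups, so the choice of elasticity bounds the estimated carbon inequality.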


Detection of Primitive Collective Behaviours in a Crowd Panic Simulation Based on Multi-Agent Approach

International Journal of Swarm Intelligence Research ◽

10.4018/jsir.2012070104 ◽

2012 ◽

Vol 3 (3) ◽

pp. 50-65 ◽

Cited By ~ 9

Author(s):

Jérémy Patrix ◽

Abdel-Illah Mouaddib ◽

Sylvain Gatepaille

Keyword(s):

Real Time ◽

Situation Awareness ◽

Hidden Markov ◽

Extended Model ◽

Multi Agent System ◽

Complex Behaviour ◽

Simulation Based ◽

Fusion Methods ◽

Multi Agent ◽

Using Data

In an emergency or evacuation, it is often impossible to interpret the complex behaviour of a crowd manually, mainly because of the lack of staff and time needed to understand the situation. In the literature, monitoring systems that use data fusion methods make automatic situation awareness possible. Drawing on the Swarm Intelligence domain, the authors propose an approach based on a multi-agent system to simulate and detect primitive collective behaviours emerging from a crowd panic. It enables collective behaviours and their anomalies to be anticipated in real time according to specific scenarios. Detection here means the ability to learn, recognize and anticipate different behaviours with a probabilistic model. Real-time detection of the collective behaviour of a panicking crowd relies on a learning method built on an extended Hidden Markov Model. This paper presents simulation and detection experiments using an implementation of a virtual environment.
