Design characteristics and statistical methods used in interrupted time series studies evaluating public health interventions: a review

2020 ◽  
Vol 122 ◽  
pp. 1-11 ◽  
Author(s):  
Simon L. Turner ◽  
Amalia Karahalios ◽  
Andrew B. Forbes ◽  
Monica Taljaard ◽  
Jeremy M. Grimshaw ◽  
...  
BMJ Open ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. e024096 ◽  
Author(s):  
Simon L Turner ◽  
Amalia Karahalios ◽  
Andrew B Forbes ◽  
Monica Taljaard ◽  
Jeremy M Grimshaw ◽  
...  

Introduction: An interrupted time series (ITS) design is an important observational design used to examine the effects of an intervention or exposure. This design has particular utility in public health where it may be impracticable or infeasible to use a randomised trial to evaluate health system-wide policies, or examine the impact of exposures (such as earthquakes). There have been relatively few studies examining the design characteristics and statistical methods used to analyse ITS designs. Further, there is a lack of guidance to inform the design and analysis of ITS studies. This is the first study in a larger project that aims to provide tools and guidance for researchers in the design and analysis of ITS studies. The objectives of this study are to (1) examine and report the design characteristics and statistical methods used in a random sample of contemporary ITS studies examining public health interventions or exposures that impact on health-related outcomes, and (2) create a repository of time series data extracted from ITS studies. Results from this study will inform the remainder of the project which will investigate the performance of a range of commonly used statistical methods, and create a repository of input parameters required for sample size calculation.
Methods and analysis: We will collate 200 ITS studies evaluating public health interventions or the impact of exposures. ITS studies will be identified from a search of the bibliometric database PubMed between the years 2013 and 2017, combined with stratified random sampling. From eligible studies, we will extract study characteristics, details of the statistical models and estimation methods, effect metrics and parameter estimates. Further, we will extract the time series data when available. We will use systematic review methods in the screening, application of inclusion and exclusion criteria, and extraction of data. Descriptive statistics will be used to summarise the data.
Ethics and dissemination: Ethics approval is not required since information will only be extracted from published studies. Dissemination of the results will be through peer-reviewed publications and presentations at conferences. A repository of data extracted from the published ITS studies will be made publicly available.


Author(s):  
Michelle Degli Esposti ◽  
Thees Spreckelsen ◽  
Antonio Gasparrini ◽  
Douglas J Wiebe ◽  
Carl Bonander ◽  
...  

Abstract: Interrupted time series designs are a valuable quasi-experimental approach for evaluating public health interventions. Interrupted time series extends a single group pre-post comparison by using multiple time points to control for underlying trends. But history bias (confounding by unexpected events occurring at the same time as the intervention) threatens the validity of this design and limits causal inference. Synthetic control methodology, a popular data-driven technique for deriving a control series from a pool of unexposed populations, is increasingly recommended. In this paper, we evaluate if and when synthetic controls can strengthen an interrupted time series design. First, we summarize the main observational study designs used in evaluative research, highlighting their respective uses, strengths, biases and design extensions for addressing these biases. Second, we outline when the use of synthetic controls can strengthen interrupted time series studies and when their combined use may be problematic. Third, we provide recommendations for using synthetic controls in interrupted time series and, using a real-world example, we illustrate the potential pitfalls of using a data-driven approach to identify a suitable control series. Finally, we emphasize the importance of theoretical approaches for informing study design and argue that synthetic control methods are not always well suited for generating a counterfactual that minimizes critical threats to interrupted time series studies. Advances in synthetic control methods bring new opportunities to conduct rigorous research in evaluating public health interventions. However, incorporating synthetic controls in interrupted time series studies may neither nullify important threats to validity nor improve causal inference.


Author(s):  
Michelle Degli Esposti ◽  
Thees Spreckelsen ◽  
Antonio Gasparrini ◽  
Douglas J Wiebe ◽  
Alexa R Yakubovich ◽  
...  

2020 ◽  
Author(s):  
Simon L Turner ◽  
Andrew B Forbes ◽  
Amalia Karahalios ◽  
Monica Taljaard ◽  
Joanne E McKenzie

Abstract: Interrupted time series (ITS) studies are frequently used to evaluate the effects of population-level interventions or exposures. To our knowledge, no studies have compared the performance of different statistical methods for this design. We simulated data to compare the performance of a set of statistical methods under a range of scenarios which included different level and slope changes, varying lengths of series and magnitudes of autocorrelation. We also examined the performance of the Durbin-Watson (DW) test for detecting autocorrelation. All methods yielded unbiased estimates of the level and slope changes over all scenarios. The magnitude of autocorrelation was underestimated by all methods; however, restricted maximum likelihood (REML) yielded the least biased estimates. Underestimation of autocorrelation led to standard errors that were too small and coverage less than the nominal 95%. All methods performed better with longer time series, except for ordinary least squares (OLS) in the presence of autocorrelation and Newey-West for high values of autocorrelation. The DW test for the presence of autocorrelation performed poorly except for long series and large autocorrelation. From the methods evaluated, OLS was the preferred method in series with fewer than 12 points, while in longer series, REML was preferred. The DW test should not be relied upon to detect autocorrelation, except when the series is long. Care is needed when interpreting results from all methods, given confidence intervals will generally be too narrow. Further research is required to develop better performing methods for ITS, especially for short series.


2018 ◽  
Vol 72 (8) ◽  
pp. 673-678 ◽  
Author(s):  
Janet Bouttell ◽  
Peter Craig ◽  
James Lewsey ◽  
Mark Robinson ◽  
Frank Popham

Background: Many public health interventions cannot be evaluated using randomised controlled trials so they rely on the assessment of observational data. Techniques for evaluating public health interventions using observational data include interrupted time series analysis, panel data regression-based approaches, regression discontinuity and instrumental variable approaches. The inclusion of a counterfactual improves causal inference for approaches based on time series analysis, but the selection of a suitable counterfactual or control area can be problematic. The synthetic control method builds a counterfactual using a weighted combination of potential control units.
Methods: We explain the synthetic control method, summarise its use in health research to date, set out its advantages, assumptions and limitations and describe its implementation through a case study of life expectancy following German reunification.
Results: Advantages of the synthetic control method are that it offers an approach suitable when there is a small number of treated units and control units and it does not rely on parallel preimplementation trends like difference in difference methods. The credibility of the result relies on achieving a good preimplementation fit for the outcome of interest between treated unit and synthetic control. If a good preimplementation fit is established over an extended period of time, a discrepancy in the outcome variable following the intervention can be interpreted as an intervention effect. It is critical that the synthetic control is built from a pool of potential controls that are similar to the treated unit. There is currently no consensus on what constitutes a ‘good fit’ or how to judge similarity. Traditional statistical inference is not appropriate with this approach, although alternatives are available. From our review, we noted that the synthetic control method has been underused in public health.
Conclusions: Synthetic control methods are a valuable addition to the range of approaches for evaluating public health interventions when randomisation is impractical. They deserve to be more widely applied, ideally in combination with other methods so that the dependence of findings on particular assumptions can be assessed.
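
The idea of building a counterfactual as a weighted combination of control units can be illustrated with a small sketch. It is not the procedure used in the paper above: it assumes the weights are non-negative, sum to one, and are chosen only to minimise the squared preimplementation gap in the outcome, and it finds them with a simple projected gradient descent. Published implementations typically also match on covariates. The function names and the toy numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:  # holds for a prefix of the sorted values; keep the last one
            theta = t
    return [max(x - theta, 0.0) for x in v]


def synthetic_control_weights(treated_pre, donors_pre, n_iter=2000):
    """Weights minimising the squared pre-period gap (outcome-only assumption).

    treated_pre: pre-period outcomes of the treated unit.
    donors_pre:  one list of pre-period outcomes per potential control unit.
    """
    k = len(donors_pre)
    w = [1.0 / k] * k  # start from equal weights
    # Conservative step size: 1 / (2 * sum of squared donor values).
    step = 1.0 / (2.0 * sum(x * x for d in donors_pre for x in d))
    for _ in range(n_iter):
        fitted = [sum(w[j] * donors_pre[j][t] for j in range(k))
                  for t in range(len(treated_pre))]
        gap = [f - y for f, y in zip(fitted, treated_pre)]
        grad = [2.0 * sum(g * x for g, x in zip(gap, donors_pre[j]))
                for j in range(k)]
        w = project_to_simplex([wj - step * gj for wj, gj in zip(w, grad)])
    return w


def synthetic_series(weights, donors):
    """Weighted combination of donor series, e.g. over the post period."""
    return [sum(w * d[t] for w, d in zip(weights, donors))
            for t in range(len(donors[0]))]


if __name__ == "__main__":
    # Toy pre-period data: the treated unit is an exact mix of donors 0 and 1.
    donors = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
    treated = [0.25, 0.75, 0.0, 1.0]
    print(synthetic_control_weights(treated, donors))
```

The post-period difference between the treated unit and `synthetic_series` applied to the donors' post-period outcomes would then be read as the estimated effect, subject to the caveats about preimplementation fit and donor similarity noted in the abstract.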


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Simon L. Turner ◽  
Andrew B. Forbes ◽  
Amalia Karahalios ◽  
Monica Taljaard ◽  
Joanne E. McKenzie

Abstract
Background: Interrupted time series (ITS) studies are frequently used to evaluate the effects of population-level interventions or exposures. However, examination of the performance of statistical methods for this design has received relatively little attention.
Methods: We simulated continuous data to compare the performance of a set of statistical methods under a range of scenarios which included different level and slope changes, varying lengths of series and magnitudes of lag-1 autocorrelation. We also examined the performance of the Durbin-Watson (DW) test for detecting autocorrelation.
Results: All methods yielded unbiased estimates of the level and slope changes over all scenarios. The magnitude of autocorrelation was underestimated by all methods; however, restricted maximum likelihood (REML) yielded the least biased estimates. Underestimation of autocorrelation led to standard errors that were too small and coverage less than the nominal 95%. All methods performed better with longer time series, except for ordinary least squares (OLS) in the presence of autocorrelation and Newey-West for high values of autocorrelation. The DW test for the presence of autocorrelation performed poorly except for long series and large autocorrelation.
Conclusions: From the methods evaluated, OLS was the preferred method in series with fewer than 12 points, while in longer series, REML was preferred. The DW test should not be relied upon to detect autocorrelation, except when the series is long. Care is needed when interpreting results from all methods, given confidence intervals will generally be too narrow. Further research is required to develop better performing methods for ITS, especially for short series.
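
As a rough illustration of the kind of simulation described in this abstract, the sketch below generates one continuous series with a level change, a slope change and lag-1 autocorrelated errors, fits a segmented regression by OLS, and computes the Durbin-Watson statistic from the residuals. It is not the authors' code. The model form y_t = b0 + b1*t + b2*D_t + b3*(t - T0)*D_t + e_t (with D_t = 1 after the interruption at T0) is one common parameterisation and is an assumption here, as are all parameter values and function names.

```python
import random


def simulate_its(n_pre, n_post, level_change, slope_change, rho, sigma=1.0, seed=0):
    """Simulate a continuous ITS whose errors follow a lag-1 autoregressive process."""
    rng = random.Random(seed)
    series, err = [], 0.0  # error before the first time point is taken as zero
    for t in range(1, n_pre + n_post + 1):
        err = rho * err + rng.gauss(0.0, sigma)
        post = 1.0 if t > n_pre else 0.0
        # Baseline intercept 10 and slope 0.5 are arbitrary illustration values.
        mean = 10.0 + 0.5 * t + level_change * post + slope_change * (t - n_pre) * post
        series.append(mean + err)
    return series


def design_matrix(n_pre, n_post):
    """Columns: intercept, time, post-interruption indicator, time since interruption."""
    rows = []
    for t in range(1, n_pre + n_post + 1):
        post = 1.0 if t > n_pre else 0.0
        rows.append([1.0, float(t), post, (t - n_pre) * post])
    return rows


def ols(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))  # partial pivoting
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def residuals(X, y, beta):
    return [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]


def durbin_watson(resid):
    """DW statistic: near 2 suggests no lag-1 autocorrelation, below 2 positive."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


if __name__ == "__main__":
    X = design_matrix(24, 24)
    y = simulate_its(24, 24, level_change=2.0, slope_change=0.3, rho=0.6)
    beta = ols(X, y)
    print("estimated level change:", beta[2], "slope change:", beta[3])
    print("Durbin-Watson statistic:", durbin_watson(residuals(X, y, beta)))
```

Repeating this over many seeds, series lengths and values of `rho` would give the sort of bias and coverage comparison the abstract reports. The OLS fit here ignores the autocorrelation, which is the situation in which the abstract warns that standard errors tend to be too small.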

