Public health in genetic spaces: a statistical framework to optimize cluster-based outbreak detection

2019 ◽  
Author(s):  
Connor Chato ◽  
Marcia L. Kalish ◽  
Art F. Y. Poon

Abstract Genetic clustering is a popular method for characterizing variation in transmission rates for rapidly-evolving viruses, and could potentially be used to detect outbreaks in ‘near real time’. However, the statistical properties of clustering are poorly understood in this context, and there are no objective guidelines for setting clustering criteria. Here we develop a new statistical framework to optimize a genetic clustering method based on the ability to forecast new cases. We analyzed the pairwise Tamura-Nei (TN93) genetic distances for anonymized HIV-1 subtype B pol sequences from Seattle (n = 1,653) and Middle Tennessee, USA (n = 2,779), and northern Alberta, Canada (n = 809). Under varying TN93 thresholds, we fit two models to the distributions of new cases relative to clusters of known cases: (1) a null model that assumes cluster growth is strictly proportional to cluster size, i.e., no variation in transmission rates among individuals; and (2) a weighted model that incorporates individual-level covariates, such as recency of diagnosis. The optimal threshold maximizes the difference in information loss between models, where covariates are used most effectively. Optimal TN93 thresholds varied substantially between data sets, e.g., 0.0104 in Alberta and 0.016 in Seattle and Tennessee, such that the optimum for one population will potentially mis-direct prevention efforts in another. The range of thresholds where the weighted model conferred greater predictive accuracy tended to be narrow (±0.005 units), but the optimal threshold for a given population also tended to be stable over time. We also extended our method to demonstrate that variation in recency of HIV diagnosis among clusters was significantly more predictive of new cases than sample collection dates (ΔAIC > 50). These results demonstrate that one cannot rely on historical precedence or convention to configure genetic clustering methods for public health applications. Our framework not only provides an objective procedure to optimize a clustering method, but can also be used for variable selection in forecasting new cases.

2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Connor Chato ◽  
Marcia L Kalish ◽  
Art F Y Poon

Abstract Genetic clustering is a popular method for characterizing variation in transmission rates for rapidly evolving viruses, and could potentially be used to detect outbreaks in ‘near real time’. However, the statistical properties of clustering are poorly understood in this context, and there are no objective guidelines for setting clustering criteria. Here, we develop a new statistical framework to optimize a genetic clustering method based on the ability to forecast new cases. We analysed the pairwise Tamura-Nei (TN93) genetic distances for anonymized HIV-1 subtype B pol sequences from Seattle (n = 1,653) and Middle Tennessee, USA (n = 2,779), and northern Alberta, Canada (n = 809). Under varying TN93 thresholds, we fit two models to the distributions of new cases relative to clusters of known cases: 1, a null model that assumes cluster growth is strictly proportional to cluster size, i.e. no variation in transmission rates among individuals; and 2, a weighted model that incorporates individual-level covariates, such as recency of diagnosis. The optimal threshold maximizes the difference in information loss between models, where covariates are used most effectively. Optimal TN93 thresholds varied substantially between data sets, e.g. 0.0104 in Alberta and 0.016 in Seattle and Tennessee, such that the optimum for one population would potentially misdirect prevention efforts in another. For a given population, the range of thresholds where the weighted model conferred greater predictive accuracy tended to be narrow (±0.005 units), and the optimal threshold tended to be stable over time. Our framework also indicated that variation in the recency of HIV diagnosis among clusters was significantly more predictive of new cases than sample collection dates (ΔAIC > 50). 
These results suggest that one cannot rely on historical precedence or convention to configure genetic clustering methods for public health applications, especially when translating methods between settings of low-level and generalized epidemics. Our framework not only enables investigators to calibrate a clustering method to a specific public health setting, but also provides a variable selection procedure to evaluate different predictive models of cluster growth.
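
The threshold comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that known cases are grouped by single linkage below a distance threshold, that each new case joins the cluster of its closest known case within that threshold, and that both models are one-parameter Poisson models of cluster growth. The function names, the exponential recency weight and its `decay` value are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def cluster_known_cases(n_known, distances, threshold):
    """Label known cases 0..n_known-1 by single-linkage cluster (union-find)."""
    parent = list(range(n_known))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (i, j), d in distances.items():
        if i < n_known and j < n_known and d < threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n_known)]


def count_growth(n_known, n_total, distances, labels, threshold):
    """Attach each new case to the cluster of its closest known case,
    provided that case lies within the threshold."""
    growth = Counter()
    for new in range(n_known, n_total):
        candidates = [(distances[(k, new)], k) for k in range(n_known)
                      if (k, new) in distances and distances[(k, new)] < threshold]
        if candidates:
            _, nearest = min(candidates)
            growth[labels[nearest]] += 1
    return growth


def poisson_aic(counts, exposures):
    """AIC of a one-parameter Poisson model with mean = rate * exposure."""
    rate = sum(counts) / sum(exposures)  # maximum-likelihood estimate
    loglik = 0.0
    for c, e in zip(counts, exposures):
        mu = rate * e
        loglik += -mu - math.lgamma(c + 1)
        if c > 0:
            loglik += c * math.log(mu)
    return 2 * 1 - 2 * loglik


def delta_aic(n_known, n_total, distances, years_since_dx, threshold, decay=0.5):
    """AIC(null) - AIC(weighted); positive values favour the weighted model.
    The exponential recency weight is an illustrative assumption."""
    labels = cluster_known_cases(n_known, distances, threshold)
    growth = count_growth(n_known, n_total, distances, labels, threshold)
    clusters = sorted(set(labels))
    counts = [growth[c] for c in clusters]
    sizes = [labels.count(c) for c in clusters]  # null model exposure
    weights = [sum(math.exp(-decay * years_since_dx[i])
                   for i in range(n_known) if labels[i] == c)
               for c in clusters]                 # weighted model exposure
    return poisson_aic(counts, sizes) - poisson_aic(counts, weights)


if __name__ == "__main__":
    # Toy data: cases 0-4 are known, cases 5-7 are new; keys are (i, j) with i < j.
    toy = {(0, 1): 0.010, (2, 3): 0.012, (0, 5): 0.008, (1, 6): 0.011, (0, 7): 0.015}
    recency = [0.5, 1.0, 8.0, 9.0, 10.0]  # years since diagnosis (made up)
    for t in (0.005, 0.010, 0.016):
        print(t, round(delta_aic(5, 8, toy, recency, t), 3))
```

Under these assumptions, scanning a grid of thresholds and keeping the one with the largest difference mirrors the selection rule stated in the abstract; a real analysis would fit the weighted model's covariate effects rather than fixing them.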


2017 ◽  
Author(s):  
Rosemary M McCloskey ◽  
Art FY Poon

Abstract Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis (where individuals are sampled sooner post-infection) rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85%) and specificity (91%) than the nonparametric methods. When we applied these clustering methods to published HIV-1 sequences from a study cohort of men who have sex with men in Seattle, USA, we found that the MMPP method categorized about half (46%) as many individuals to clusters compared to the other methods, and that the MMPP clusters were more consistent with transmission outbreaks. This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where it is critical to robustly and accurately identify clusters for the most cost-effective deployment of resources.


Viruses ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 101
Author(s):  
Stefanos Limnaios ◽  
Evangelia Georgia Kostaki ◽  
Georgios Adamis ◽  
Myrto Astriti ◽  
Maria Chini ◽  
...  

Our aim was to estimate the date of the origin and the transmission rates of the major local clusters of subtypes A1 and B in Greece. Phylodynamic analyses were conducted in 14 subtype A1 and 31 subtype B clusters. The earliest dates of origin for subtypes A1 and B were in 1982.6 and in 1985.5, respectively. The transmission rate for the subtype A1 clusters ranged between 7.54 and 39.61 infections/100 person years (IQR: 9.39, 15.88), and for subtype B clusters between 4.42 and 36.44 infections/100 person years (IQR: 7.38, 15.04). Statistical analysis revealed that the average difference in the transmission rate between the PWID and the MSM clusters was 6.73 (95% CI: 0.86 to 12.60; p = 0.026). Our study provides evidence that the date of introduction of subtype A1 in Greece was the earliest in Europe. Transmission rates were significantly higher for PWID than MSM clusters due to the conditions that gave rise to an extensive PWID HIV-1 outbreak ten years ago in Athens, Greece. Transmission rate can be considered as a valuable measure for public health since it provides a proxy of the rate of epidemic growth within a cluster and, therefore, it can be useful for targeted HIV prevention programs.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Ania Syrowatka ◽  
Masha Kuznetsova ◽  
Ava Alsubai ◽  
Adam L. Beckman ◽  
Paul A. Bain ◽  
...  

Abstract Artificial intelligence (AI) represents a valuable tool that could be widely used to inform clinical and public health decision-making to effectively manage the impacts of a pandemic. The objective of this scoping review was to identify the key use cases for involving AI for pandemic preparedness and response from the peer-reviewed, preprint, and grey literature. The data synthesis had two parts: an in-depth review of studies that leveraged machine learning (ML) techniques and a limited review of studies that applied traditional modeling approaches. ML applications from the in-depth review were categorized into use cases related to public health and clinical practice, and narratively synthesized. One hundred eighty-three articles met the inclusion criteria for the in-depth review. Six key use cases were identified: forecasting infectious disease dynamics and effects of interventions; surveillance and outbreak detection; real-time monitoring of adherence to public health recommendations; real-time detection of influenza-like illness; triage and timely diagnosis of infections; and prognosis of illness and response to treatment. Data sources and types of ML that were useful varied by use case. The search identified 1167 articles that reported on traditional modeling approaches, which highlighted additional areas where ML could be leveraged for improving the accuracy of estimations or projections. Important ML-based solutions have been developed in response to pandemics, and particularly for COVID-19, but few were optimized for practical application early in the pandemic. These findings can support policymakers, clinicians, and other stakeholders in prioritizing research and development to support operationalization of AI for future pandemics.


2018 ◽  
Vol 28 (6) ◽  
pp. 1826-1840 ◽  
Author(s):  
Theodore Lytras ◽  
Kassiani Gkolfinopoulou ◽  
Stefanos Bonovas ◽  
Baltazar Nunes

Timely detection of the seasonal influenza epidemic is important for public health action. We introduce FluHMM, a simple but flexible Bayesian algorithm to detect and monitor the seasonal epidemic on sentinel surveillance data. No comparable historical data are required for its use. FluHMM segments a typical influenza surveillance season into five distinct phases with clear interpretation (pre-epidemic, epidemic growth, epidemic plateau, epidemic decline and post-epidemic) and provides the posterior probability of being at each phase for every week in the period under surveillance, given the available data. An alert can be raised when the probability that the epidemic has started exceeds a given threshold. An accompanying R package facilitates the application of this method in public health practice. We apply FluHMM on 12 seasons of sentinel surveillance data from Greece, and show that it achieves very good sensitivity, timeliness and perfect specificity, thereby demonstrating its usefulness. We further discuss advantages and limitations of the method, providing suggestions on how to apply it and highlighting potential future extensions such as with integrating multiple surveillance data streams.


2017 ◽  
Vol 38 (4) ◽  
pp. 162
Author(s):  
Fiona J May

Culture independent diagnostic tests (CIDT) for detection of pathogens in clinical specimens have become widely adopted in Australian pathology laboratories. Pathology laboratories are the primary source of notification of pathogens to state and territory surveillance systems. Monitoring and analysis of surveillance data is integral to guiding public health actions to reduce the incidence of disease and respond to outbreaks. As with any change in testing protocol, the advantages and disadvantages of the change from culture based testing to culture independent testing need to be weighed up and the impact on surveillance and outbreak detection assessed. This article discusses the effect of this change in testing on surveillance and public health management of pathogens in Australia, with specific focus on gastrointestinal pathogens.


2009 ◽  
Vol 2009 ◽  
pp. 1-16 ◽  
Author(s):  
R. S. Sparks ◽  
T. Keighley ◽  
D. Muscatello

Automated public health records provide the necessary data for rapid outbreak detection. An adaptive exponentially weighted moving average (EWMA) plan is developed for signalling unusually high incidence when monitoring a time series of nonhomogeneous daily disease counts. A Poisson transitional regression model is used to fit background/expected trend in counts and provides “one-day-ahead” forecasts of the next day's count. Departures of counts from their forecasts are monitored. The paper outlines an approach for improving early outbreak data signals by dynamically adjusting the exponential weights to be efficient at signalling local persistent high side changes. We emphasise outbreak signals in steady-state situations; that is, changes that occur after the EWMA statistic had run through several in-control counts.
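
A much-simplified sketch of the monitoring idea in this abstract follows. It is not the adaptive plan the paper develops: the exponential weight is fixed rather than adjusted dynamically, the one-day-ahead forecasts are supplied by the caller rather than produced by a Poisson transitional regression, and the function name, default `weight` and `limit` are hypothetical. Residuals are standardized under a Poisson assumption (variance equal to the forecast), and a signal is raised only on the high side.

```python
import math


def ewma_signals(counts, forecasts, weight=0.2, limit=3.0):
    """Return the indices of days on which a fixed-weight EWMA of standardized
    forecast errors exceeds `limit` asymptotic standard deviations."""
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must lie in (0, 1]")
    # Asymptotic standard deviation of an EWMA of independent unit-variance inputs.
    sigma = math.sqrt(weight / (2.0 - weight))
    ewma = 0.0
    signals = []
    for day, (count, forecast) in enumerate(zip(counts, forecasts)):
        if forecast <= 0:
            raise ValueError("forecasts must be positive")
        # Poisson assumption: the variance of a count equals its forecast mean.
        residual = (count - forecast) / math.sqrt(forecast)
        ewma = weight * residual + (1.0 - weight) * ewma
        if ewma > limit * sigma:  # high-side signal only
            signals.append(day)
    return signals


if __name__ == "__main__":
    daily_counts = [10, 9, 11, 10, 10, 20, 22, 25]
    one_day_ahead = [10.0] * len(daily_counts)
    print(ewma_signals(daily_counts, one_day_ahead))
```

A smaller `weight` smooths more and reacts more slowly to persistent shifts; the adaptive plan in the paper addresses that trade-off by changing the weights as the data arrive.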


2010 ◽  
Vol 23 (3) ◽  
pp. 507-528 ◽  
Author(s):  
Gunther F. Craun ◽  
Joan M. Brunkard ◽  
Jonathan S. Yoder ◽  
Virginia A. Roberts ◽  
Joe Carpenter ◽  
...  

SUMMARY Since 1971, the CDC, EPA, and Council of State and Territorial Epidemiologists (CSTE) have maintained the collaborative national Waterborne Disease and Outbreak Surveillance System (WBDOSS) to document waterborne disease outbreaks (WBDOs) reported by local, state, and territorial health departments. WBDOs were recently reclassified to better characterize water system deficiencies and risk factors; data were analyzed for trends in outbreak occurrence, etiologies, and deficiencies during 1971 to 2006. A total of 833 WBDOs, 577,991 cases of illness, and 106 deaths were reported during 1971 to 2006. Trends of public health significance include (i) a decrease in the number of reported outbreaks over time and in the annual proportion of outbreaks reported in public water systems, (ii) an increase in the annual proportion of outbreaks reported in individual water systems and in the proportion of outbreaks associated with premise plumbing deficiencies in public water systems, (iii) no change in the annual proportion of outbreaks associated with distribution system deficiencies or the use of untreated and improperly treated groundwater in public water systems, and (iv) the increasing importance of Legionella since its inclusion in WBDOSS in 2001. Data from WBDOSS have helped inform public health and regulatory responses. Additional resources for waterborne disease surveillance and outbreak detection are essential to improve our ability to monitor, detect, and prevent waterborne disease in the United States.


Author(s):  
Niema Moshiri ◽  
Davey M. Smith ◽  
Siavash Mirarab

Abstract In HIV epidemics, the structure of the transmission network can be dictated by just a few individuals. Public health intervention, such as ensuring people living with HIV adhere to antiretroviral therapy (ART) and are continually virally-suppressed, can help control the spread of the virus. However, such intervention requires utilizing the limited public health resource allocations. As a result, the ability to determine which individuals are most at-risk of transmitting HIV could allow public health officials to focus their limited resources on these individuals. Molecular epidemiology suggests an approach: prioritizing people living with HIV based on patterns of transmission inferred from their sampled viral sequences. In this paper, we introduce ProACT (Prioritization using AnCesTral edge lengths), a phylogenetic approach for prioritizing individuals living with HIV. ProACT uses a simple idea: ordering individuals by their terminal branch length in the phylogeny of their virus. In simulations and also on a dataset of HIV-1 subtype B pol sequences obtained in San Diego, we show that this simple strategy improves the effectiveness of prioritization compared to state-of-the-art methods that rely on monitoring the growth of transmission clusters defined based on genetic distance.
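
The ordering idea in this abstract can be sketched in a few lines. This is an illustration and not the ProACT tool itself. It assumes that shorter terminal branches are ranked first, which the abstract does not state, that ties are broken alphabetically, and that the tree arrives as a plain Newick string with unquoted labels, no whitespace, and a branch length on every leaf. The function names and the regular expression are hypothetical.

```python
import re

# A leaf is a "label:length" pair that directly follows "(" or ",".
# Internal branch lengths follow ")" and are therefore not matched.
LEAF_PATTERN = re.compile(r"(?<=[(,])([^(),:;]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Map each leaf label to its terminal branch length."""
    return {label: float(length) for label, length in LEAF_PATTERN.findall(newick)}


def prioritize(newick):
    """Order leaves by ascending terminal branch length (ties: by label)."""
    lengths = terminal_branch_lengths(newick)
    return sorted(lengths, key=lambda label: (lengths[label], label))


if __name__ == "__main__":
    tree = "((A:0.01,B:0.05):0.02,(C:0.002,D:0.03):0.01);"
    print(prioritize(tree))
```

For real trees, a dedicated phylogenetics library would be a safer way to read Newick files than a regular expression, which does not handle quoted labels or comments.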


2017 ◽  
Vol 9 (1) ◽  
Author(s):  
Roger Morbey ◽  
Alex J. Elliot ◽  
Gillian E. Smith

Objective: To investigate whether aberration detection methods for syndromic surveillance would be more useful if data were stratified by age band. Introduction: When monitoring public health incidents using syndromic surveillance systems, Public Health England (PHE) uses the age of the presenting patient as a key indicator to further assess the severity, impact of the incident, and to provide intelligence on the likely cause. However the age distribution of cases is usually not considered until after unusual activity has been identified in the all-ages population data. We assessed whether monitoring specific age groups contemporaneously could improve the timeliness, specificity and sensitivity of public health surveillance. Methods: First, we examined a wide range of health indicators from the PHE syndromic surveillance systems to identify for further study those with the greatest seasonal variation in the age distribution of cases. Secondly, we examined the identified indicators to ascertain whether any age bands consistently lagged behind other age bands. Finally, we applied outbreak detection methods retrospectively to age specific data, identifying periods of increased activity that were only detected or detected earlier when age-specific surveillance was used. Results: Seasonal increases in respiratory indicators occurred first in younger age groups, with increases in children under 5 providing early warning of subsequent increases occurring in older age groups. Also, we found age specific indicators improved the specificity of surveillance using indicators relating to respiratory and eye problems; identifying unusual activity that was less apparent in the all-ages population. Conclusions: Routine surveillance of respiratory indicators in young children would have provided early warning of increases in older age groups, where the burden on health care usage, e.g. hospital admissions, is greatest. Furthermore this cross-correlation between ages occurred consistently even though the age distribution of the burden of respiratory cases varied between seasons. Age specific surveillance can improve sensitivity of outbreak detection although all-age surveillance remains more powerful when case numbers are low.

