An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study

10.2196/12842 ◽  
2020 ◽  
Vol 6 (3) ◽  
pp. e12842
Author(s):  
Prathyush Sambaturu ◽  
Parantapa Bhattacharya ◽  
Jiangzhuo Chen ◽  
Bryan Lewis ◽  
Madhav Marathe ◽  
...  

Background: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general public, are often interested in deeper patterns and insights into how the disease is spreading, with additional context. Analysis by domain experts is needed for deriving such insights from incidence data.

Objective: Our goal was to develop an automated approach for finding interesting spatio-temporal patterns in the spread of a disease over a large region, such as regions which have specific characteristics (eg, high incidence in a particular week, those which showed a sudden change in incidence) or regions which have significantly different incidence compared to earlier seasons.

Methods: We developed techniques from the area of transactional data mining for characterizing and finding interesting spatio-temporal patterns in disease spread in an automated manner. A key part of our approach involved using the principle of minimum description length for representing a given target set in terms of combinations of attributes (referred to as clauses); we considered both positive and negative clauses, relaxed descriptions which approximately represent the set, and used integer programming to find such descriptions. Finally, we designed an automated approach, which examines a large space of sets corresponding to different spatio-temporal patterns, and ranks them based on the ratio of their size to their description length (referred to as the compression ratio).

Results: We applied our methods using minimum description length to find spatio-temporal patterns in the spread of seasonal influenza in the United States using state level influenza-like illness activity indicator data from the CDC. We observed that the compression ratios were over 2.5 for 50% of the chosen sets, when approximate descriptions and negative clauses were allowed. Sets with high compression ratios (eg, over 2.5) corresponded to interesting patterns in the spatio-temporal dynamics of influenza-like illness. Our approach also outperformed description by solution in terms of the compression ratio.

Conclusions: Our approach, which is an unsupervised machine learning method, can provide new insights into patterns and trends in the disease spread in an automated manner. Our results show that the description complexity is an effective approach for characterizing sets of interest, which can be easily extended to other diseases and regions beyond influenza in the US. Our approach can also be easily adapted for automated generation of narratives.
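The compression ratio described in the Methods (set size divided by description length) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the toy elements and attributes are hypothetical, the description length is taken to be the total number of attributes across all clauses (the abstract does not state the exact measure), and only exact descriptions are evaluated. The integer programming search and the relaxed (approximate) descriptions from the paper are not reproduced.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[str]

# Hypothetical toy data: each element (a state in a given week) has attributes.
ELEMENTS: Dict[str, FrozenSet[str]] = {
    "AL-w1": frozenset({"south", "high"}),
    "GA-w1": frozenset({"south", "high"}),
    "LA-w1": frozenset({"south", "high"}),
    "MS-w1": frozenset({"south", "low"}),
    "NY-w1": frozenset({"northeast", "high"}),
    "VT-w1": frozenset({"northeast", "low"}),
}


def covered(clause: Clause) -> Set[str]:
    """Elements that carry every attribute in the clause (a conjunction)."""
    return {name for name, attrs in ELEMENTS.items() if clause <= attrs}


def described_set(positive: List[Clause], negative: List[Clause]) -> Set[str]:
    """Union of the positive clauses minus the union of the negative clauses."""
    pos = set().union(*(covered(c) for c in positive))
    neg = set().union(*(covered(c) for c in negative))
    return pos - neg


def description_length(positive: List[Clause], negative: List[Clause]) -> int:
    """Assumed length measure: total attribute count over all clauses."""
    return sum(len(c) for c in positive) + sum(len(c) for c in negative)


def compression_ratio(target: Set[str],
                      positive: List[Clause],
                      negative: List[Clause]) -> float:
    """Size of the target set divided by the length of an exact description."""
    if described_set(positive, negative) != target:
        raise ValueError("clauses do not describe the target set exactly")
    return len(target) / description_length(positive, negative)


# All southern elements are described by the single one-attribute clause {south}.
south = {"AL-w1", "GA-w1", "LA-w1", "MS-w1"}
ratio_south = compression_ratio(south, [frozenset({"south"})], [])

# A negative clause: "south, except low-activity elements".
south_not_low = {"AL-w1", "GA-w1", "LA-w1"}
ratio_negative = compression_ratio(
    south_not_low, [frozenset({"south"})], [frozenset({"low"})]
)
```

Under these assumptions, a large set that a short description captures gets a high ratio, which is the property the ranking step in the abstract relies on.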

2018 ◽  
Author(s):  
Prathyush Sambaturu ◽  
Parantapa Bhattacharya ◽  
Jiangzhuo Chen ◽  
Bryan Lewis ◽  
Madhav Marathe ◽  
...  



2021 ◽  
Author(s):  
Zhijuan Song ◽  
Xiaocan Jia ◽  
Junzhe Bao ◽  
Yongli Yang ◽  
Huili Zhu ◽  
...  

Abstract Introduction: According to the Centers for Disease Control and Prevention in the United States, about 8% of Americans get influenza during an average season. It is necessary to strengthen the early warning of influenza and the prediction of public health. Methods: In this study, we analyzed the characteristics of influenza-like illness (ILI) using a geographic information system and a SARIMA model. Spatio-temporal cluster analysis detected 23 clusters of ILI during the study period. Results: The highest incidence of ILI was mainly concentrated in Louisiana, the District of Columbia, and Virginia. Local spatial autocorrelation analysis revealed that the High-High cluster was mainly located in Louisiana and Mississippi, meaning that these states and their neighbors tend to have high influenza incidence together. A statistically significant SARIMA(1, 0, 0)(1, 1, 0)52 model was obtained to forecast the ILI incidence of Mississippi. Conclusions: The study indicated that ILI incidence would begin to increase in the 45th week of 2020 and peak in the 6th week of 2021. Notable epidemiological differences were observed across states, indicating that some states should pay more attention to preventing and controlling respiratory infectious diseases.
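For readers unfamiliar with the notation, the SARIMA(1, 0, 0)(1, 1, 0)52 specification can be written out in backshift form. This is the standard textbook expansion of that order, not an equation taken from the paper; the symbols (y_t for weekly ILI incidence, B for the backshift operator, phi_1 and Phi_1 for the coefficients, epsilon_t for white noise) are assumptions, and the fitted coefficient values are not given in the abstract.

```latex
% Non-seasonal AR(1), no non-seasonal differencing, no MA terms;
% seasonal AR(1), one seasonal difference, seasonal period s = 52 weeks.
\[
  (1 - \phi_1 B)\,(1 - \Phi_1 B^{52})\,(1 - B^{52})\, y_t = \varepsilon_t ,
  \qquad B^{k} y_t = y_{t-k}
\]
```

The factor (1 - B^52) compares each week with the same week one year earlier, which is how the model accounts for the annual influenza cycle.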


Author(s):  
Zhijuan Song ◽  
Xiaocan Jia ◽  
Junzhe Bao ◽  
Yongli Yang ◽  
Huili Zhu ◽  
...  

About 8% of Americans contract influenza during an average season, according to the Centers for Disease Control and Prevention in the United States. It is necessary to strengthen the early warning for influenza and the prediction of public health. In this study, spatial autocorrelation analysis and spatial scanning analysis were used to identify the spatiotemporal patterns of influenza-like illness (ILI) prevalence in the United States during the 2011–2020 transmission seasons. A seasonal autoregressive integrated moving average (SARIMA) model was constructed to predict the influenza incidence of high-risk states. We found that the highest incidence of ILI was mainly concentrated in Louisiana, the District of Columbia, and Virginia. Mississippi was a high-risk state with a higher influenza incidence and formed a high-high cluster with neighboring states. A SARIMA(1, 0, 0)(1, 1, 0)52 model was suitable for forecasting the ILI incidence of Mississippi. The relative errors between actual and predicted values indicated that the predictions matched the actual values well. Influenza is still an important health problem in the United States. The spread of ILI varies by season and geographical region. The peak season of influenza was winter and spring, and the states with higher influenza rates were concentrated in the southeast. Increased surveillance in high-risk states could help control the spread of influenza.
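As a rough illustration of what spatial autocorrelation measures, the sketch below computes the global Moran's I statistic with binary neighbor weights. This is an assumed stand-in: the abstract does not say which statistic or weighting scheme was used (high-high clusters are usually reported from a local variant), and the region labels, values, and adjacency pairs are hypothetical.

```python
from typing import Dict, List, Tuple


def morans_i(values: Dict[str, float],
             neighbor_pairs: List[Tuple[str, str]]) -> float:
    """Global Moran's I with binary, symmetric neighbor weights.

    I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean and W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {region: v - mean for region, v in values.items()}
    denom = sum(d * d for d in dev.values())
    # Each undirected pair counts twice because w_ij = w_ji = 1.
    cross = 2 * sum(dev[a] * dev[b] for a, b in neighbor_pairs)
    total_weight = 2 * len(neighbor_pairs)
    return (n / total_weight) * (cross / denom)


# Hypothetical regions in a row (A-B-C-D) with steadily increasing incidence.
example_values = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
example_pairs = [("A", "B"), ("B", "C"), ("C", "D")]
example_i = morans_i(example_values, example_pairs)
```

A positive value, as in this toy example, indicates that neighboring regions tend to have similar incidence, which is the situation a high-high cluster reflects locally.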


2019 ◽  
Vol 264 ◽  
pp. 40-55 ◽  
Author(s):  
Marina Peña-Gallardo ◽  
Sergio M. Vicente-Serrano ◽  
Steven Quiring ◽  
Marc Svoboda ◽  
Jamie Hannaford ◽  
...  

2017 ◽  
Vol 5 (7) ◽  
pp. 771-788 ◽  
Author(s):  
A. Sankarasubramanian ◽  
J. L. Sabo ◽  
K. L. Larson ◽  
S. B. Seo ◽  
T. Sinha ◽  
...  

PEDIATRICS ◽  
1981 ◽  
Vol 67 (6) ◽  
pp. 924-926
Author(s):  
Dale L. Phelps

The number of infants blinded from retinopathy of prematurity in the United States in 1979 is estimated to be 546, based on birth-weight-specific published survival statistics and ROP incidence data. Approximately 2,100 infants will be affected by cicatricial disease annually. A simple formula is presented that permits estimation of incidence data based on other regional data. It is suggested that increased attention be focused on this old enemy in order to document its incidence worldwide and to learn more about its prevention and treatment.


2020 ◽  
Author(s):  
Johannes H. Uhl ◽  
Stefan Leyk ◽  
Caitlin M. McShane ◽  
Anna E. Braswell ◽  
Dylan S. Connor ◽  
...  

Abstract. The collection, processing and analysis of remote sensing data since the early 1970s has rapidly improved our understanding of change on the Earth’s surface. While satellite-based earth observation has proven to be of vast scientific value, these data are typically confined to recent decades of observation and often lack important thematic detail. Here, we advance in this arena by constructing new spatially-explicit settlement data for the United States that extend back to the early nineteenth century and are consistently enumerated at fine spatial and temporal granularity (i.e., 250 m spatial and 5-year temporal resolution). We create these time series using a large, novel building stock database to extract and map retrospective, fine-grained spatial distributions of built-up properties in the conterminous United States from 1810 to 2015. From our data extraction, we analyse and publish a series of gridded geospatial datasets that enable novel retrospective historical analysis of the built environment at unprecedented spatial and temporal resolution. The datasets are available at https://dataverse.harvard.edu/dataverse/hisdacus (Uhl and Leyk, 2020a, b, c, d).

