Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries

Abstract: A real-time methodology for monitoring flu activity in middle-income countries that is simultaneously accurate and generalizable has not yet been presented. We demonstrate here that a self-correcting machine learning method leveraging Internet-based search activity produces reliable and timely flu estimates in multiple Latin American countries.

Use Internet search data to accurately track state level influenza epidemics

Scientific Reports ◽

10.1038/s41598-021-83084-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Shihao Yang ◽

Shaoyang Ning ◽

S. C. Kou

Keyword(s):

Real Time ◽

State Level ◽

The United States ◽

Search Pattern ◽

Influenza Surveillance ◽

Internet Search ◽

Spatial Correlation Structure ◽

Control And Prevention ◽

Influenza Activity ◽

Search Data

Abstract: For epidemic control and prevention, timely insights into potential hot spots are invaluable. As an alternative to traditional epidemic surveillance, which often lags behind real time by weeks, big data from the Internet provide important information on current epidemic trends. Here we present a methodology, ARGOX (Augmented Regression with GOogle data CROSS space), for accurate real-time tracking of state-level influenza epidemics in the United States. ARGOX combines Internet search data at the national, regional and state levels with traditional influenza surveillance data from the Centers for Disease Control and Prevention, and accounts for both the spatial correlation structure of state-level influenza activities and the evolution of people’s Internet search patterns. ARGOX achieves on average 28% error reduction over the best alternative for real-time state-level influenza estimation for 2014 to 2020. ARGOX is robust and reliable and can potentially be applied to track county- and city-level influenza activity and other infectious diseases.

Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries (Preprint)

10.2196/preprints.12214 ◽

2018 ◽

Author(s):

Leonardo Clemente ◽

Fred Lu ◽

Mauricio Santillana

Keyword(s):

Machine Learning ◽

Latin America ◽

Latin American ◽

Influenza Surveillance ◽

Surveillance Systems ◽

Internet Search ◽

Search Activity ◽

Time Period ◽

Out Of Sample ◽

Latin American Countries

BACKGROUND: Novel influenza surveillance systems that leverage Internet-based real-time data sources, including Internet search frequencies, social-network information, and crowd-sourced flu surveillance tools, have shown improved accuracy over the past few years in data-rich countries like the United States. These systems not only track flu activity accurately, but they also report flu estimates a week or more ahead of the publication of reports produced by healthcare-based systems, such as those implemented and managed by the Centers for Disease Control and Prevention. Previous work has shown that novel flu surveillance systems, like Google Flu Trends (GFT), have not yet delivered acceptable flu estimates in developing countries in Latin America.

OBJECTIVE: The aim of this study was to show that recent methodological improvements in the use of Internet search engine information to track diseases can lead to improved retrospective flu estimates in multiple countries in Latin America.

METHODS: A machine learning-based methodology that uses flu-related Internet search activity and historical information to monitor flu activity, named ARGO (AutoRegression with Google search), was extended to generate flu predictions for 8 Latin American countries (Argentina, Bolivia, Brazil, Chile, Mexico, Paraguay, Peru, and Uruguay) for the period from January 2012 to December 2016. These retrospective (out-of-sample) influenza activity predictions were compared with historically observed suspected flu cases in each country, as reported by FluNet, an influenza surveillance database maintained by the World Health Organization. For a baseline comparison, retrospective (out-of-sample) flu estimates were produced for the same time period using autoregressive models that only leverage historical flu activity information.

RESULTS: Our results show that ARGO-like models outperform autoregressive models in predictive power in 6 out of 8 countries in the 2012-2016 time period. Moreover, ARGO significantly improves on historical flu estimates produced by the now discontinued GFT for the time period of 2012-2015, for which GFT information is publicly available.

CONCLUSIONS: We demonstrate here that a self-correcting machine learning method, leveraging Internet-based disease-related search activity and historical flu trends, has the potential to produce reliable and timely flu estimates in multiple Latin American countries. This methodology may prove helpful to local public health officials who design and implement interventions aimed at mitigating the effects of influenza outbreaks. Our methodology generally outperforms both the now-discontinued tool GFT and autoregressive methodologies that exploit only historical flu activity to produce future disease estimates.
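
The comparison described in this abstract (an autoregressive baseline against a model that also uses search activity, both evaluated out of sample) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single autoregressive lag, a single search series, ordinary least squares in place of whatever fitting procedure ARGO uses, and a synthetic noise-free series. All function names, the window length, and the data are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def rolling_nowcast(flu, search, window, use_search):
    """One-step out-of-sample estimates, refitting on a sliding window each week."""
    def features(t):
        row = [1.0, flu[t - 1]]        # intercept and last week's reported activity
        if use_search:
            row.append(search[t])      # this week's search volume (assumed real time)
        return row

    estimates = []
    for t in range(window + 1, len(flu)):
        train = range(t - window, t)   # only weeks strictly before t
        beta = fit_ols([features(s) for s in train], [flu[s] for s in train])
        estimates.append(sum(b * f for b, f in zip(beta, features(t))))
    return estimates


def rmse(pred, truth):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, truth)) / len(pred))


# Synthetic, noise-free series purely for illustration (not data from the study).
search = [10 + 5 * math.sin(2 * math.pi * t / 52) + 3 * math.sin(1.7 * t) for t in range(120)]
flu = [20.0]
for t in range(1, 120):
    flu.append(1.0 + 0.5 * flu[t - 1] + 2.0 * search[t])

window = 30
truth = flu[window + 1:]
rmse_ar = rmse(rolling_nowcast(flu, search, window, use_search=False), truth)
rmse_search = rmse(rolling_nowcast(flu, search, window, use_search=True), truth)
print(f"AR only: {rmse_ar:.3f}  AR + search: {rmse_search:.3f}")
```

Because the synthetic series is built from the search signal, the search-augmented model wins here by construction; the sketch only shows the shape of the out-of-sample comparison, not the size of any real improvement.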

Toward the use of neural networks for influenza prediction at multiple spatial resolutions

Science Advances ◽

10.1126/sciadv.abb1237 ◽

2021 ◽

Vol 7 (25) ◽

pp. eabb1237

Author(s):

Emily L. Aiken ◽

Andre T. Nguyen ◽

Cecile Viboud ◽

Mauricio Santillana

Keyword(s):

Neural Network ◽

Machine Learning ◽

Real Time ◽

The United States ◽

Network Approach ◽

Internet Search ◽

Learning Methods ◽

Neural Network Approach ◽

Machine Learning Methods ◽

Search Data

Mitigating the effects of disease outbreaks with timely and effective interventions requires accurate real-time surveillance and forecasting of disease activity, but traditional health care–based surveillance systems are limited by inherent reporting delays. Machine learning methods have the potential to fill this temporal “data gap,” but work to date in this area has focused on relatively simple methods and coarse geographic resolutions (state level and above). We evaluate the predictive performance of a gated recurrent unit neural network approach in comparison with baseline machine learning methods for estimating influenza activity in the United States at the state and city levels and experiment with the inclusion of real-time Internet search data. We find that the neural network approach improves upon baseline models for long time horizons of prediction but is not improved by real-time internet search data. We conduct a thorough analysis of feature importances in all considered models for interpretability purposes.
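
For readers unfamiliar with the gated recurrent unit named in this abstract, a single GRU update can be sketched as follows. This is a generic textbook formulation under one common sign convention for the update gate, not the architecture or code used in the paper; the parameter names (`W_z`, `U_z`, `b_z`, and so on), the sizes, and the weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def gru_step(x, h, p):
    """One GRU update of hidden state h given input x.

    p holds W_* (hidden x input), U_* (hidden x hidden) and bias vectors b_*
    for the update gate z, the reset gate r and the candidate state n.
    """
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["W_z"], x), matvec(p["U_z"], h), p["b_z"])]
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["W_r"], x), matvec(p["U_r"], h), p["b_r"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]          # reset gate scales the old state
    n = [math.tanh(a + b + c)
         for a, b, c in zip(matvec(p["W_n"], x), matvec(p["U_n"], gated_h), p["b_n"])]
    # Convention assumed here: z close to 1 moves the state toward the candidate.
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def run_gru(sequence, h0, p):
    """Feed a sequence of input vectors through the cell and return the final state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, p)
    return h


# Hypothetical sizes and weights: 1 input feature (weekly activity), 2 hidden units.
params = {
    "W_z": [[0.5], [-0.3]], "U_z": [[0.1, 0.0], [0.0, 0.1]], "b_z": [0.0, 0.0],
    "W_r": [[0.2], [0.4]],  "U_r": [[0.0, 0.1], [0.1, 0.0]], "b_r": [0.0, 0.0],
    "W_n": [[1.0], [-1.0]], "U_n": [[0.3, 0.0], [0.0, 0.3]], "b_n": [0.0, 0.0],
}
final_state = run_gru([[0.2], [0.4], [0.9]], [0.0, 0.0], params)
print(final_state)
```

In a forecasting setting the final hidden state would typically be passed through a learned output layer to produce the estimate for each horizon; that layer and the training loop are omitted here.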

Use of daily Internet search query data improves real-time projections of influenza epidemics

Journal of The Royal Society Interface ◽

10.1098/rsif.2018.0220 ◽

2018 ◽

Vol 15 (147) ◽

pp. 20180220 ◽

Cited By ~ 3

Author(s):

Christoph Zimmer ◽

Sequoia I. Leuba ◽

Reza Yaesoubi ◽

Ted Cohen

Keyword(s):

Real Time ◽

Temporal Resolution ◽

Internet Search ◽

Forecasting Performance ◽

The Us ◽

Control And Prevention ◽

Influenza Activity ◽

The Usa ◽

Search Data ◽

Substantial Interest

Seasonal influenza causes millions of illnesses and tens of thousands of deaths per year in the USA alone. While the morbidity and mortality associated with influenza are substantial each year, the timing and magnitude of epidemics are highly variable, which complicates efforts to anticipate demands on the healthcare system. Better methods to forecast influenza activity would help policymakers anticipate such stressors. The US Centers for Disease Control and Prevention (CDC) has recognized the importance of improving influenza forecasting and hosts an annual challenge for predicting influenza-like illness (ILI) activity in the USA. The CDC data serve as the reference for ILI in the USA, but this information is aggregated by epidemiological week and reported after a one-week delay (and may be subject to correction even after this reporting lag). Therefore, there has been substantial interest in whether real-time Internet data from sources such as Google, Twitter or Wikipedia could be used to improve influenza forecasting. In this study, we combine a previously developed calibration and prediction framework with an established humidity-based transmission dynamic model to forecast influenza. We then compare predictions based on only CDC ILI data with predictions that leverage the earlier availability and finer temporal resolution of Wikipedia search data. We find that both the earlier availability and the finer temporal resolution are important for increasing forecasting performance. Using daily Wikipedia search data leads to a marked improvement in prediction performance compared to weekly data, especially for a three- to four-week forecasting horizon.

Predicting Dengue Incidence Leveraging Internet-Based Data Sources. A Case Study in 20 cities in Brazil

10.1101/2020.10.21.20210948 ◽

2020 ◽

Author(s):

Gal Koplewitz ◽

Fred Lu ◽

Leonardo Clemente ◽

Caroline Buckee ◽

Mauricio Santillana

Keyword(s):

Random Forest ◽

Real Time ◽

Mosquito Control ◽

National Level ◽

Epidemiological Data ◽

Data Sources ◽

Surveillance Systems ◽

Internet Search ◽

Search Data ◽

The City

Abstract: The dengue virus affects millions of people every year worldwide, causing large epidemic outbreaks that disrupt people’s lives and severely strain healthcare systems. In the absence of a reliable vaccine against it or an effective treatment to manage the illness in humans, most efforts to combat dengue infections have focused on preventing its vectors, mainly the Aedes aegypti mosquito, from flourishing across the world. These mosquito-control strategies need reliable disease activity surveillance systems to be deployed. Despite significant efforts to estimate dengue incidence using a variety of data sources and methods, little work has been done to understand the relative contribution of the different data sources to improved prediction. Additionally, most work has focused on prediction systems at the national level, rather than at finer spatial resolutions. We develop a methodological framework to assess and compare dengue incidence estimates at the city level and evaluate the performance of a collection of models on 20 different cities in Brazil. The data sources we use towards this end are weekly incidence counts from prior years (seasonal autoregressive terms), weekly-aggregated weather variables, and real-time internet search data. We find that a random forest-based model effectively leverages these multiple data sources and provides robust predictions, while retaining interpretability. For real-time predictions that assume long delays (6-8 weeks) in the availability of epidemiological data, we find that real-time internet search data are the strongest predictors of dengue incidence, whereas for predictions that assume very short delays (1-2 weeks), short-term and seasonal autocorrelation are dominant as predictors. Despite the difficulties inherent to city-level prediction, our framework achieves meaningful and actionable estimates across cities with different characteristics.

Author Summary: As the incidence of infectious diseases like dengue continues to increase throughout the world, tracking their spread in real time poses a significant challenge to local and national health authorities. Accurate incidence data are often impossible to obtain as outbreaks emerge and unfold, and a range of nowcasting tools have been developed to estimate disease trends using different mathematical methodologies to fill the temporal data gap. Over the past several years, researchers have investigated how best to incorporate internet search data into predictive models, since these can be obtained in real time. Still, most such models have been regression-based, and have tended to underperform in cases when epidemiological data are only available after long reporting delays. Moreover, in tropical countries, these models have previously been tested and applied primarily at the national level. Here, we develop a machine learning model based on a random forest approach and apply it in 20 cities in Brazil. We find that our methodology produces meaningful and actionable disease estimates at the city level, and that it is more robust to delays in the availability of epidemiological data than regression-based models.
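
The role of the reporting delay in this abstract can be made concrete with a small sketch of how a predictor row might be assembled for a given week. It is an illustration under stated assumptions, not the authors' feature pipeline: the function name, the number of recent lags, the 52-week seasonal lag, and the toy data are all hypothetical, and weather variables are left out for brevity.

```python
def build_feature_row(incidence, search, t, delay, n_lags=3, seasonal_lag=52):
    """Predictors for week t when case counts are published `delay` weeks late.

    Only incidence up to week t - delay is treated as known; search volume for
    week t itself is assumed to be available in real time.
    """
    if delay < 1 or delay > seasonal_lag:
        raise ValueError("delay must be between 1 and seasonal_lag")
    oldest_needed = t - max(delay + n_lags - 1, seasonal_lag)
    if oldest_needed < 0 or t >= len(search):
        raise ValueError("not enough history for week t")
    return {
        # Most recent counts that would already have been published.
        "recent_incidence": [incidence[t - delay - k] for k in range(n_lags)],
        # Same week one season earlier (seasonal autoregressive term).
        "seasonal_incidence": incidence[t - seasonal_lag],
        # Real-time signal that does not suffer from the reporting delay.
        "search_now": search[t],
    }


# Toy series purely for illustration (not data from the study).
incidence = list(range(100))
search = [1000 + i for i in range(100)]

short_delay = build_feature_row(incidence, search, t=60, delay=1)
long_delay = build_feature_row(incidence, search, t=60, delay=6)
print(short_delay["recent_incidence"], long_delay["recent_incidence"])
```

With a long delay the most recent usable counts are several weeks stale while the search term is still current, which is one intuitive reading of why search data would carry more weight in that setting. Rows like these could then be passed to any regressor, such as a random forest.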

Test Use in Spain, Portugal and Latin American Countries

European Journal of Psychological Assessment ◽

10.1027//1015-5759.15.2.151 ◽

1999 ◽

Vol 15 (2) ◽

pp. 151-157 ◽

Cited By ~ 23

Author(s):

José Muñiz ◽

Gerardo Prieto ◽

Leandro Almeida ◽

Dave Bartram

Keyword(s):

Latin American ◽

Ad Hoc ◽

Task Force ◽

Test Construction ◽

European Federation ◽

Psychometric Characteristics ◽

Testing Practices ◽

Professional Psychologists ◽

Test Use ◽

Latin American Countries

Summary: The two main sources of errors in educational and psychological evaluation are the lack of adequate technical and psychometric characteristics of the tests, and especially the failure to properly implement the testing process. The main goal of the present research is to study the situation of test construction and test use in the Spanish-speaking (Spain and Latin American countries) and Portuguese-speaking (Portugal and Brazil) countries. The data were collected using a questionnaire constructed by the European Federation of Professional Psychologists Association (EFPPA) Task Force on Tests and Testing, under the direction of D. Bartram. In addition to the questionnaire, other ad hoc data were also gathered. Four main areas of psychological testing were investigated: Educational, Clinical, Forensic and Work. Key persons were identified in each country in order to provide reliable information. The main results are presented, and some measures that could be taken in order to improve the current testing practices in the countries surveyed are discussed. As most of the tests used in these countries were originally developed in other cultures, a problem that appears to be especially relevant is the translation and adaptation of tests.

Progress on the testing movement in Iberian-Latin American Countries

PsycEXTRA Dataset ◽

10.1037/e508662012-005 ◽

2001 ◽

Author(s):

Solange Wechsler

Keyword(s):

Latin American ◽

Latin American Countries

Test movement in Ibero-Latin American countries

PsycEXTRA Dataset ◽

10.1037/e508502012-004 ◽

2010 ◽

Author(s):

Solange Muglia Wechsler ◽

Maria Perez Solis ◽

Conceicao Ferreira ◽

Isabel Magno ◽

Norma Contini ◽

...

Keyword(s):

Latin American ◽

Latin American Countries

The Spanish Translation of Les Leçons de chimie élémentaire: On the Legal Status of Translation and its Various Values

Comparative Critical Studies ◽

10.3366/ccs.2019.0327 ◽

2019 ◽

Vol 16 (2-3) ◽

pp. 201-215

Author(s):

Tania P. Hernández-Hernández

Keyword(s):

Complex Network ◽

Latin American ◽

Legal System ◽

Legal Status ◽

Spanish Translation ◽

Cultural Economic ◽

Translation Rights ◽

International Legal System ◽

Latin American Countries ◽

Legal Battle

Throughout the nineteenth century, European booksellers and publishers, mostly from France, England, Germany and Spain, produced textual materials in Europe and introduced them into Mexico and other Latin American countries. These transatlantic interchanges unfolded against the backdrop of the emergence of the international legal system to protect translation rights and required the involvement of a complex network of agents who carried with them publishing, translating and negotiating practices, in addition to books, pamphlets, prints and other goods. Tracing the trajectories of translated books and the socio-cultural, economic and legal forces shaping them, this article examines the legal battle over the translation and publishing rights of Les Leçons de chimie élémentaire, a chemistry book authored by Jean Girardin and translated and published in Spanish by Jean-Frédéric Rosa. Drawing on a socio-historical approach to translation, I argue that the arguments presented by both parties are indicative of the uncertainty surrounding the legal status of translated texts and of the different values then attributed to translation.
