Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

Mapping Intimacies ◽

10.21203/rs.2.15755/v2 ◽

2019 ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

World Health ◽

Learning Models ◽

Dengue Outbreak ◽

Depth Analysis ◽

The World ◽

Dengue Outbreaks ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction. Methods: The most important climatic factors that contribute to dengue outbreaks were identified in the current work. These factors were used as input parameters for machine learning models. The models were then tested and evaluated on four years of data (January 2010 to December 2013) collected in Malaysia. Results: This research has two major contributions. A new risk factor, called the TempeRain Factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% and reduced the root-mean-square error to 0.26 for predicting dengue outbreaks.
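The two figures quoted in the results, accuracy and root-mean-square error, can be illustrated with a minimal sketch. It assumes a binary outbreak label per period and a predicted outbreak probability from the classifier; the abstract does not say how the authors computed RMSE, so the probability-based form here is an assumption, and all data values and function names are hypothetical.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of periods where the predicted label equals the observed label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def rmse(y_true, prob_outbreak):
    """Root-mean-square error between 0/1 labels and predicted probabilities."""
    squared = [(p - t) ** 2 for t, p in zip(y_true, prob_outbreak)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical data, not taken from the study: 1 = outbreak, 0 = no outbreak.
observed = [1, 0, 0, 1, 1, 0, 0, 1]
probabilities = [0.9, 0.2, 0.4, 0.7, 0.3, 0.1, 0.6, 0.8]

# Assumed decision rule: predict an outbreak when the probability is at least 0.5.
predicted = [1 if p >= 0.5 else 0 for p in probabilities]

acc = accuracy(observed, predicted)
err = rmse(observed, probabilities)
print(f"accuracy = {acc:.2%}, RMSE = {err:.3f}")
```

Accuracy rewards only the final label, while RMSE also penalises confident wrong probabilities, which is why a study may report both.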


Identification of significant climatic risk factors and machine learning models in dengue outbreak prediction

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01493-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

Support Vector ◽

Learning Models ◽

Dengue Outbreak ◽

Bayes Network ◽

The World ◽

Dengue Outbreaks ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction. Methods: The most important climatic factors that contribute to dengue outbreaks were identified in the current work. Correlation analyses were performed to determine these factors, which were then used as input parameters for machine learning models. Five top-performing machine learning classification models (Bayes network (BN) models, support vector machine (SVM), RBF tree, decision table and naive Bayes) were chosen on the basis of past research. The models were then tested and evaluated on 4 years of data (January 2010 to December 2013) collected in Malaysia. Results: This research has two major contributions. A new risk factor, called the TempeRain factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% for predicting dengue outbreaks. Conclusions: This research explored the factors used in dengue outbreak prediction systems. The major contribution of this study is identifying new significant factors that contribute to dengue outbreak prediction. From the evaluation result, we obtained a significant improvement in the accuracy of a machine learning model for dengue outbreak prediction.


Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

10.21203/rs.2.15755/v6 ◽

2020 ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

Support Vector ◽

Learning Models ◽

Dengue Outbreak ◽

Bayes Network ◽

The World ◽

Dengue Outbreaks ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction. Methods: The most important climatic factors that contribute to dengue outbreaks were identified in the current work. Correlation analyses were performed to determine these factors, which were then used as input parameters for machine learning models. Five top-performing machine learning classification models (Bayes network (BN) models, support vector machine (SVM), RBF tree, decision table and naive Bayes) were chosen on the basis of past research. The models were then tested and evaluated on four years of data (January 2010 to December 2013) collected in Malaysia. Results: This research has two major contributions. A new risk factor, called the TempeRain Factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% and reduced the root-mean-square error to 0.26 for predicting dengue outbreaks. Conclusions: This research explored the factors used in dengue outbreak prediction systems. The major contribution of this study is identifying new significant factors that contribute to dengue outbreak prediction. From the evaluation result, we obtained a significant improvement in the accuracy of a machine learning model for dengue outbreak prediction.


Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

10.21203/rs.2.15755/v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

World Health ◽

Support Vector ◽

Strong Impact ◽

Learning Models ◽

Dengue Outbreak ◽

Depth Analysis ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease, one of the world’s main pandemic vector-borne infections and a serious hazard to humanity. According to the World Health Organization (WHO), the incidence of dengue has grown dramatically worldwide in recent decades. The WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, there is no tested vaccine or treatment to stop or prevent dengue fever; predicting dengue outbreaks is therefore important. The current issue in dengue outbreak prediction is accuracy. A limited number of studies have analysed climate factors in depth for dengue outbreak prediction. Methods: In this study, the most significant climatic factors that contribute to dengue outbreaks were identified. These factors were used as input parameters for machine learning models. The models were trained and evaluated on four years of data from January 2010 to December 2013 in Malaysia. Results: This work provides two main contributions. A new risk factor, called the TempeRain Factor (TRF), was determined and used as an input parameter for the dengue outbreak prediction model. Moreover, the TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Support Vector Machine (SVM) with the meteorological risk factor newly identified in this study achieved a higher accuracy of 98.09% and reduced the root mean square error to 0.098 for predicting dengue outbreaks. Conclusions: This research explored the factors used in dengue outbreak prediction systems. The main contribution of this paper is identifying new significant factors that contribute to dengue outbreak prediction. From the evaluation, we obtained a significant improvement in the accuracy of the machine learning model for dengue outbreak prediction.


Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

10.21203/rs.2.15755/v4 ◽

2020 ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

Support Vector ◽

Learning Models ◽

Dengue Outbreak ◽

Bayes Network ◽

The World ◽

Dengue Outbreaks ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction. Methods: The most important climatic factors that contribute to dengue outbreaks were identified in the current work. Correlation analyses were performed to determine these factors, which were then used as input parameters for machine learning models. Five top-performing machine learning classification models (Bayes network (BN) models, support vector machine (SVM), RBF tree, decision table and naive Bayes) were chosen on the basis of past research. The models were then tested and evaluated on four years of data (January 2010 to December 2013) collected in Malaysia. Results: This research has two major contributions. A new risk factor, called the TempeRain Factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% and reduced the root-mean-square error to 0.26 for predicting dengue outbreaks. Conclusions: This research explored the factors used in dengue outbreak prediction systems. The major contribution of this study is identifying new significant factors that contribute to dengue outbreak prediction. From the evaluation result, we obtained a significant improvement in the accuracy of a machine learning model for dengue outbreak prediction.


Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

10.21203/rs.2.15755/v5 ◽

2020 ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Machine Learning ◽

Risk Factor ◽

Dengue Fever ◽

Support Vector ◽

Learning Models ◽

Dengue Outbreak ◽

Bayes Network ◽

The World ◽

Dengue Outbreaks ◽

Machine Learning Models

Abstract Background: Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction. Methods: The most important climatic factors that contribute to dengue outbreaks were identified in the current work. Correlation analyses were performed to determine these factors, which were then used as input parameters for machine learning models. Five top-performing machine learning classification models (Bayes network (BN) models, support vector machine (SVM), RBF tree, decision table and naive Bayes) were chosen on the basis of past research. The models were then tested and evaluated on four years of data (January 2010 to December 2013) collected in Malaysia. Results: This research has two major contributions. A new risk factor, called the TempeRain Factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% and reduced the root-mean-square error to 0.26 for predicting dengue outbreaks. Conclusions: This research explored the factors used in dengue outbreak prediction systems. The major contribution of this study is identifying new significant factors that contribute to dengue outbreak prediction. From the evaluation result, we obtained a significant improvement in the accuracy of a machine learning model for dengue outbreak prediction.


Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction

10.21203/rs.2.15755/v3 ◽

2020 ◽

Author(s):

Felestin Yavari Nejad ◽

Kasturi Dewi Varathan

Keyword(s):

Dengue Fever ◽

Current Issue ◽

World Health Organisation ◽

Viral Disease ◽

World Health ◽

Dengue Outbreak ◽

Depth Analysis ◽

The World ◽

Dengue Outbreaks ◽

Vector Borne

Abstract Dengue fever is a widespread viral disease and one of the world’s major pandemic vector-borne infections, posing a serious hazard to humanity. The World Health Organisation (WHO) reported that the incidence of dengue fever has increased dramatically across the world in recent decades. WHO currently estimates an annual incidence of 50–100 million dengue infections worldwide. To date, no tested vaccine or treatment is available to stop or prevent dengue fever. Predicting dengue outbreaks is therefore important. The current issue that should be addressed in dengue outbreak prediction is accuracy. A limited number of studies have conducted an in-depth analysis of climate factors in dengue outbreak prediction.


Predicting the Linguistic Accessibility of Chinese Health Translations: Using Machine Learning Algorithms (Preprint)

10.2196/preprints.30588 ◽

2021 ◽

Author(s):

Meng Ji ◽

Pierrette Bouillon

Keyword(s):

Machine Learning ◽

Random Forest ◽

Health Resources ◽

Machine Learning Algorithms ◽

World Health ◽

Learning Models ◽

The World ◽

Health Organization ◽

C5.0 Decision Tree ◽

Machine Learning Models

BACKGROUND Linguistic accessibility has an important impact on the reception and utilization of translated health resources among multicultural and multilingual populations. Linguistic understandability of health translation has been under-studied. OBJECTIVE Our study aimed to develop novel machine learning models for the study of the linguistic accessibility of health translations, comparing Chinese translations of the World Health Organization health materials with original Chinese health resources developed by the Chinese health authorities. METHODS Using natural language processing tools for the assessment of the readability of Chinese materials, we explored and compared the readability of Chinese health translations from the World Health Organization with original Chinese materials from China Centre for Disease Control and Prevention. RESULTS Pairwise adjusted t test showed that three new machine learning models achieved statistically significant improvement over the baseline logistic regression in terms of AUC: C5.0 decision tree (p=0.000, 95% CI: -0.249, -0.152), random forest (p=0.000, 95% CI: 0.139, 0.239) and XGBoost Tree (p=0.000, 95% CI: 0.099, 0.193). There was, however, no significant difference between C5.0 decision tree and random forest (p=0.513). Extreme gradient boost tree was the best model, having achieved statistically significant improvement over the C5.0 model (p=0.003) and the Random Forest model (p=0.006) at the Bonferroni-adjusted p value of 0.008. CONCLUSIONS The development of machine learning algorithms significantly improved the accuracy and reliability of current approaches to the evaluation of the linguistic accessibility of Chinese health information, especially Chinese health translations in relation to original health resources. Although the new algorithms developed were based on Chinese health resources, they can be adapted for other languages to advance current research in accessible health translation, communication, and promotion.
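The adjusted threshold of 0.008 quoted in the results is consistent with a Bonferroni correction over all pairwise comparisons among four models. The sketch below is an illustration only: the family-wise level of 0.05 and the count of six comparisons are assumptions, since the abstract states only the resulting threshold. The three p values are the ones reported above.

```python
from itertools import combinations

# The four models named in the abstract.
models = ["logistic regression", "C5.0 decision tree", "random forest", "XGBoost"]

# Assumption: every unordered pair of models was compared once.
pairs = list(combinations(models, 2))

# Assumption: a conventional family-wise significance level of 0.05.
alpha = 0.05
threshold = alpha / len(pairs)  # Bonferroni: divide alpha by the number of tests

# p values reported in the abstract for the non-baseline comparisons.
reported = {
    ("C5.0 decision tree", "random forest"): 0.513,
    ("C5.0 decision tree", "XGBoost"): 0.003,
    ("random forest", "XGBoost"): 0.006,
}

significant = {pair: p < threshold for pair, p in reported.items()}

print(f"{len(pairs)} comparisons, adjusted threshold = {threshold:.4f}")
for pair, flag in significant.items():
    print(pair, "significant" if flag else "not significant")
```

Under these assumptions both XGBoost comparisons fall below the threshold while the C5.0 versus random forest comparison does not, matching the conclusions stated in the abstract.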


Predicting the Linguistic Accessibility of Chinese Health Translations: Using Machine Learning Algorithms (Preprint)

10.2196/preprints.30589 ◽

2021 ◽

Author(s):

Christine Ji

Keyword(s):

Machine Learning ◽

Random Forest ◽

Health Resources ◽

World Health Organisation ◽

Machine Learning Algorithms ◽

World Health ◽

Learning Models ◽

The World ◽

C5.0 Decision Tree ◽

Machine Learning Models

BACKGROUND Linguistic accessibility has an important impact on the reception and utilisation of translated health resources among multicultural and multilingual populations. Linguistic understandability of health translation has been under-studied. OBJECTIVE Our study aimed to develop novel machine learning models for the study of the linguistic accessibility of health translations, comparing Chinese translations of the World Health Organisation health materials with original Chinese health resources developed by the Chinese health authorities. METHODS Using natural language processing tools for the assessment of the readability of Chinese materials, we explored and compared the readability of Chinese health translations from the World Health Organisation with original Chinese materials from China Centre for Disease Control and Prevention. RESULTS Pairwise adjusted t test showed that three new machine learning models achieved statistically significant improvement over the baseline logistic regression in terms of AUC: C5.0 decision tree (p=0.000, 95% CI: -0.249, -0.152), random forest (p=0.000, 95% CI: 0.139, 0.239) and XGBoost Tree (p=0.000, 95% CI: 0.099, 0.193). There was, however, no significant difference between C5.0 decision tree and random forest (p=0.513). Extreme gradient boost tree was the best model, having achieved statistically significant improvement over the C5.0 model (p=0.003) and the Random Forest model (p=0.006) at the Bonferroni-adjusted p value of 0.008. CONCLUSIONS The development of machine learning algorithms significantly improved the accuracy and reliability of current approaches to the evaluation of the linguistic accessibility of Chinese health information, especially Chinese health translations in relation to original health resources. Although the new algorithms developed were based on Chinese health resources, they can be adapted for other languages to advance current research in accessible health translation, communication, and promotion.


Using Machine Learning Methods Incorporating Individual Reader Annotations to Classify Paediatric Chest Radiographs in Epidemiological Studies

Wellcome Open Research ◽

10.12688/wellcomeopenres.17164.1 ◽

2021 ◽

Vol 6 ◽

pp. 309

Author(s):

Paul Mwaniki ◽

Timothy Kamanu ◽

Samuel Akech ◽

M. J. C Eijkemans

Keyword(s):

Machine Learning ◽

Epidemiological Studies ◽

Chest Radiographs ◽

World Health ◽

Data Sets ◽

Learning Models ◽

Middle Income ◽

Training Models ◽

Model Training ◽

Machine Learning Models

Introduction: Epidemiological studies that involve interpretation of chest radiographs (CXRs) suffer from inter-reader and intra-reader variability. Inter-reader and intra-reader variability hinder comparison of results from different studies or centres, which negatively affects efforts to track the burden of chest diseases or evaluate the efficacy of interventions such as vaccines. This study explores machine learning models that could standardize interpretation of CXR across studies and the utility of incorporating individual reader annotations when training models using CXR data sets annotated by multiple readers. Methods: Convolutional neural networks were used to classify CXRs from seven low to middle-income countries into five categories according to the World Health Organization's standardized methodology for interpreting paediatric CXRs. We compared models trained to predict the final/aggregate classification with models trained to predict how each reader would classify an image and then aggregate predictions for all readers using unweighted mean. Results: Incorporating individual reader's annotations during model training improved classification accuracy by 3.4% (multi-class accuracy 61% vs 59%). Model accuracy was higher for children above 12 months of age (68% vs 58%). The accuracy of the models in different countries ranged between 45% and 71%. Conclusions: Machine learning models can annotate CXRs in epidemiological studies reducing inter-reader and intra-reader variability. In addition, incorporating individual reader annotations can improve the performance of machine learning models trained using CXRs annotated by multiple readers.
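The second approach described in the methods, predicting how each reader would classify an image and then combining those predictions by unweighted mean, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the per-reader probabilities are hypothetical stand-ins for network outputs, and the function name is invented for the example.

```python
def aggregate_readers(reader_probs):
    """Average per-reader class probabilities with equal weights.

    reader_probs: one list of class probabilities per reader.
    Returns the mean probability per class and the index of the top class.
    """
    n_readers = len(reader_probs)
    n_classes = len(reader_probs[0])
    mean_probs = [
        sum(reader[c] for reader in reader_probs) / n_readers
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=lambda c: mean_probs[c])
    return mean_probs, best_class

# Hypothetical model outputs for one radiograph: three readers, five categories.
reader_probs = [
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.20, 0.30, 0.40, 0.05, 0.05],
    [0.10, 0.50, 0.20, 0.10, 0.10],
]

mean_probs, predicted_class = aggregate_readers(reader_probs)
print([round(p, 3) for p in mean_probs], "-> class index", predicted_class)
```

Equal weighting means a single dissenting reader (the second row here) shifts the averaged distribution without overriding the majority view.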


Extra Point Under Review: Machine Learning And The NFL Field Goal

Elements ◽

10.6017/eurj.v12i2.9448 ◽

2016 ◽

Vol 12 (2) ◽

Author(s):

James LeDoux

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Learning Models ◽

The World ◽

Extra Point ◽

Machine Learning Models

The new NFL extra point rule, first implemented in the 2015 season, requires a kicker to attempt his extra point with the ball snapped from the 15-yard line. This attempt stretches an extra point to the equivalent of a 32-yard field goal attempt, 13 yards longer than under the previous rule. Though a 32-yard attempt is still a chip shot to any professional kicker, many NFL analysts were surprised to see the number of extra points that were missed. Should this really have been a surprise, though? Beginning with a replication of a study by Clark et al., this study aims to explore the world of NFL kicking from a statistical perspective, applying econometric and machine learning models to display a deeper perspective on what exactly makes some field goal attempts more difficult than others. Ultimately, the goal is to go beyond the previous research on this topic, providing an improved predictive model of field goal success and a better metric for evaluating placekicker ability.
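The distances quoted above follow from simple arithmetic, sketched here under stated assumptions: the goalposts sit at the back of a 10-yard end zone, the ball is held roughly 7 yards behind the line of scrimmage (a common convention that reproduces the 32-yard figure), and the previous rule snapped the ball from the 2-yard line. The constant and function names are hypothetical.

```python
END_ZONE_DEPTH = 10  # yards from the goal line to the goalposts
SNAP_TO_HOLD = 7     # assumption: yards from the line of scrimmage to the hold

def kick_distance(line_of_scrimmage):
    """Kick length in yards for a snap from the given yard line."""
    return line_of_scrimmage + END_ZONE_DEPTH + SNAP_TO_HOLD

new_rule = kick_distance(15)  # snap from the 15-yard line (2015 rule)
old_rule = kick_distance(2)   # assumption: snap from the 2-yard line before 2015

print(new_rule, old_rule, new_rule - old_rule)
```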
