Machine learning model for predicting out-of-hospital cardiac arrests using meteorological and chronological data

Heart ◽  
2021 ◽  
pp. heartjnl-2020-318726 ◽  
Author(s):  
Takahiro Nakashima ◽  
Soshiro Ogata ◽  
Teruo Noguchi ◽  
Yoshio Tahara ◽  
Daisuke Onozuka ◽  
...  

Objectives To evaluate a predictive model for robust estimation of daily out-of-hospital cardiac arrest (OHCA) incidence using a suite of machine learning (ML) approaches and high-resolution meteorological and chronological data.
Methods In this population-based study, we combined a nationwide OHCA registry and high-resolution meteorological and chronological datasets from Japan. We developed a model to predict daily OHCA incidence with a training dataset for 2005–2013 using the eXtreme Gradient Boosting algorithm. A dataset for 2014–2015 was used to test the predictive model. The main outcome was the accuracy of the predictive model for the number of daily OHCA events, based on mean absolute error (MAE) and mean absolute percentage error (MAPE). In general, a model with MAPE less than 10% is considered highly accurate.
Results Among the 1 299 784 OHCA cases, 661 052 OHCA cases of cardiac origin (525 374 cases in the training dataset, on which fourfold cross-validation was performed, and 135 678 cases in the testing dataset) were included in the analysis. Compared with the ML models using meteorological or chronological variables alone, the ML model combining meteorological and chronological variables had the highest predictive accuracy in the training (MAE 1.314 and MAPE 7.007%) and testing datasets (MAE 1.547 and MAPE 7.788%). Sunday, Monday, holiday, winter, low ambient temperature and large interday or intraday temperature differences were more strongly associated with OHCA incidence than the other meteorological and chronological variables.
Conclusions An ML predictive model using comprehensive daily meteorological and chronological data allows for highly precise estimates of OHCA incidence.
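The two error metrics above are straightforward to compute. A minimal sketch in Python, using made-up daily counts (the registry data itself is not public); by the abstract's rule of thumb, a MAPE below 10% would count as highly accurate:

```python
def mae(actual, predicted):
    """Mean absolute error: average magnitude of the daily prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in %; assumes no zero-count days."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily OHCA counts vs. model predictions
actual = [20, 25, 18, 30, 22]
predicted = [21, 24, 20, 28, 23]
print(mae(actual, predicted))   # events/day
print(mape(actual, predicted))  # percent
```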

2020 ◽  
Author(s):  
Ching-Chieh Huang ◽  
Jesyin Lai ◽  
Der-Yang Cho ◽  
Jiaxin Yu

Abstract Since the emergence of COVID-19, many hospitals have encountered challenges in performing efficient scheduling and good resource management to ensure the quality of healthcare provided to patients is not compromised. Operating room (OR) scheduling is one of the issues that has gained our attention because it is related to workflow efficiency and critical care of hospitals. Automatic scheduling and high predictive accuracy of surgical case duration have a critical role in improving OR utilization. To estimate surgical case duration, many hospitals rely on historic averages based on a specific surgeon or a specific procedure type obtained from electronic medical record (EMR) scheduling systems. However, the low predictive accuracy with EMR data leads to negative impacts on patients and hospitals, such as rescheduling and cancellation of surgeries. In this study, we aim to improve the prediction of surgical case duration with advanced machine learning (ML) algorithms. We obtained a large data set containing 170,748 surgical cases (from Jan 2017 to Dec 2019) from a hospital. The data covered a broad variety of details on patients, surgeries, specialties and surgical teams. In addition, a more recent data set with 8,672 cases (from Mar to Apr 2020) was available for external evaluation. We computed historic averages from the EMR data for surgeon- or procedure-specific cases, and these were used as baseline models for comparison. Subsequently, we developed our models using linear regression, random forest and extreme gradient boosting (XGB) algorithms. All models were evaluated with R-square (R2), mean absolute error (MAE), and the percentage of cases with overage (actual duration longer than predicted), underage (shorter than predicted) and within (inside the prediction tolerance). The XGB model was superior to the other models, achieving a higher R2 (85 %) and percentage within (48 %) as well as a lower MAE (30.2 min).
The total prediction errors computed for all models showed that the XGB model had the lowest inaccurate percentage (23.7 %). Overall, this study applied ML techniques in the field of OR scheduling to reduce the medical and financial burden for healthcare management. The results revealed the importance of surgery and surgeon factors in surgical case duration prediction. This study also demonstrated the importance of performing an external evaluation to better validate the performance of ML models.
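The overage/underage/within breakdown used above can be sketched as follows. The tolerance band is a hypothetical choice (the abstract does not state the band it used), here set to ±10% of the predicted duration:

```python
def schedule_accuracy(actual, predicted, tol=0.1):
    """Classify each case as overage (ran longer than predicted by more than
    tol), underage (shorter by more than tol) or within the tolerance band.
    tol is a fractional tolerance; 10% is an illustrative assumption."""
    over = under = within = 0
    for a, p in zip(actual, predicted):
        if a > p * (1 + tol):
            over += 1
        elif a < p * (1 - tol):
            under += 1
        else:
            within += 1
    n = len(actual)
    return {"overage": 100 * over / n,
            "underage": 100 * under / n,
            "within": 100 * within / n}

# Hypothetical case durations in minutes: actual vs. predicted
print(schedule_accuracy([120, 90, 200, 60], [100, 95, 210, 80]))
```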


2021 ◽  
Vol 5 (2) ◽  
pp. 377-395
Author(s):  
Iqbal Hanif ◽  
Regita Fachri Septiani

Rating is one of the most frequently used metrics in the television industry to evaluate television programs or channels. This research is an attempt to develop a prediction model of television program ratings using rating data gathered from UseeTV (interned-based television service from Telkom Indonesia). The machine learning methods (Random Forest and Extreme Gradient Boosting) were tried out utilizing a set of rating data from 20 television programs collected from January 2018 to August 2019 (train dataset) and evaluated using September 2019 rating data (test dataset). Research results show that Random Forest gives a better result than Extreme Gradient Boosting based on evaluation metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). On the training dataset, prediction using Random Forest produced lower RMSE and MAE scores than Extreme Gradient Boosting in all programs, while on the testing dataset, Random Forest produced lower RMSE and MAE scores in 16 programs compared with Extreme Gradient Boosting. According to MAPE score, Random Forest produced more good quality prediction (4 programs in the training dataset, 16 programs in the testing dataset) than Extreme Gradient Boosting method (1 program in the training dataset, 12 programs in the testing dataset) both in training and testing dataset.


2021 ◽  
Author(s):  
William Lidberg ◽  
Johannes Larson ◽  
Siddhartho Paul ◽  
Hjalmar Laudon ◽  
Anneli Ågren

Open peatlands are a recognizable feature in the boreal landscape that are commonly mapped from aerial photographs. However, wet soils also occur on tree-covered peatlands and in the riparian zones of forest streams and surrounding lakes. Comparisons between field data and available maps show that only 36 % of wet soils in the boreal landscape are marked on maps, making them difficult to manage. Wet soils have lower bearing capacity than dry soils and are more susceptible to soil disturbance from land-use management with heavy machinery. Topographical modelling of wet area indices has been suggested as a solution to this problem, and high-resolution digital elevation models (DEM) derived from airborne LiDAR are becoming accessible in many countries. However, most of these topographical methods rely on the user to define appropriate threshold values in order to delineate wet areas. Differences in soil texture, topography and climate make any large-scale application difficult. This complex landscape variability can be captured by machine learners that use automated data mining methods to discover patterns in large data sets. Using soil moisture data from 20 000 field plots from the National Forest Inventory of Sweden, we combined information from 24 indices and ancillary environmental features using a machine learning method known as extreme gradient boosting. Extreme gradient boosting used the field data to learn how to classify soil moisture and delivered high performance compared with many traditional single-algorithm methods. With this method we mapped soil moisture at 2 m spatial resolution across the Swedish forest landscape in five days using a workstation with 32 cores. This new map captured 79 % (kappa 0.69) of all wet soils, compared with only 36 % (kappa 0.39) captured by current maps. In addition to capturing open wetlands, this new map also captures riparian zones and previously unmapped cryptic wetlands beneath the forest canopy.
The new maps can, for example, be used to plan hydrologically adapted buffer zones and to suggest machine-free zones near streams and lakes in order to prevent rutting from forestry machines, thereby reducing sediment, mercury and nutrient loads to downstream streams, lakes and the sea.
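The kappa statistic quoted above corrects raw agreement for agreement expected by chance. A minimal sketch for a binary wet/dry classification, using made-up confusion-matrix counts (the study's actual counts are not given here):

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix (e.g. wet/dry soil)."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n  # observed agreement
    # chance agreement from the marginal totals of predictions and truth
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (po - pe) / (1 - pe)

# Hypothetical counts: 40 true wet, 10 false wet, 10 missed wet, 40 true dry
print(cohens_kappa(40, 10, 10, 40))
```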


2020 ◽  
Vol 10 (24) ◽  
pp. 8968
Author(s):  
Miguel Martínez-Comesaña ◽  
Lara Febrero-Garrido ◽  
Enrique Granada-Álvarez ◽  
Javier Martínez-Torres ◽  
Sandra Martínez-Mariño

The Heat Loss Coefficient (HLC) characterizes the envelope efficiency of a building under in-use conditions, and it represents one of the main causes of the performance gap between the building design and its real operation. Accurate estimations of the HLC contribute to optimizing the energy consumption of a building. In this context, the application of black-box models in building energy analysis has been consolidated in recent years. The aim of this paper is to estimate the HLC of an existing building through the prediction of building thermal demands using a methodology based on Machine Learning (ML) models. Specifically, three different ML methods are applied to a public library in the northwest of Spain and compared: eXtreme Gradient Boosting (XGBoost), Support Vector Regression (SVR) and Multi-Layer Perceptron (MLP) neural network. Furthermore, the accuracy of the results is measured, on the one hand, using both CV(RMSE) and Normalized Mean Biased Error (NMBE), as advised by ASHRAE, for thermal demand predictions and, on the other, an absolute error for HLC estimations. The main novelty of this paper lies in the estimation of the HLC of a building from thermal demand predictions, reducing the requirement for monitoring. The results show that the most accurate model is capable of estimating the HLC of the building with an absolute error between 4 and 6%.
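The two ASHRAE-advised metrics can be sketched as below. Note this is a simplified form: ASHRAE Guideline 14 normalizes by degrees of freedom (n − p), which is omitted here for clarity, and the demand values are made up:

```python
import math

def cv_rmse(actual, predicted):
    """Coefficient of variation of the RMSE, in % of the mean measured value
    (degrees-of-freedom adjustment omitted for simplicity)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return 100 * rmse / (sum(actual) / n)

def nmbe(actual, predicted):
    """Normalized mean bias error, in %; the sign shows systematic
    under- or over-prediction of thermal demand."""
    n = len(actual)
    return 100 * sum(a - p for a, p in zip(actual, predicted)) / (n * (sum(actual) / n))

# Hypothetical hourly thermal demands (kWh): measured vs. predicted
print(cv_rmse([10, 20, 30], [12, 18, 30]))
print(nmbe([10, 20, 30], [12, 18, 30]))
```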


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Cindy Feng ◽  
George Kephart ◽  
Elizabeth Juarez-Colunga

Abstract Background Coronavirus disease (COVID-19) presents an unprecedented threat to global health. Accurately predicting the mortality risk among infected individuals is crucial for prioritizing medical care and mitigating the healthcare system’s burden. The present study aimed to assess the predictive accuracy of machine learning methods for COVID-19 mortality risk. Methods We compared the performance of classification tree, random forest (RF), extreme gradient boosting (XGBoost), logistic regression, generalized additive model (GAM) and linear discriminant analysis (LDA) to predict the mortality risk among 49,216 COVID-19 positive cases in Toronto, Canada, reported from March 1 to December 10, 2020. We used repeated split-sample validation and k-steps-ahead forecasting validation. Predictive models were estimated using training samples, and predictive accuracy of the methods for the testing samples was assessed using the area under the receiver operating characteristic curve, Brier’s score, calibration intercept and calibration slope. Results We found that XGBoost was highly discriminative, with an AUC of 0.9669, and had superior performance over conventional tree-based methods, i.e., classification tree or RF methods, for predicting COVID-19 mortality risk. Regression-based methods (logistic, GAM and LASSO) had comparable performance to XGBoost, with slightly lower AUCs and higher Brier’s scores. Conclusions XGBoost offers superior performance over conventional tree-based methods and minor improvement over regression-based methods for predicting COVID-19 mortality risk in the study population.
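Of the accuracy measures above, Brier’s score is the simplest: the mean squared gap between the predicted probability and the observed 0/1 outcome. A minimal sketch with invented cases:

```python
def brier_score(outcomes, probs):
    """Brier score for probabilistic binary predictions; lower is better.
    outcomes: 0/1 observed labels; probs: predicted event probabilities."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probs)) / len(outcomes)

# Hypothetical mortality outcomes and predicted risks
print(brier_score([1, 0, 0, 1], [0.9, 0.2, 0.1, 0.6]))
```

A perfectly calibrated, perfectly discriminative model would score 0; always predicting 0.5 scores 0.25.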


Author(s):  
Samir Bandyopadhyay ◽  
Shawni Dutta ◽  
Upasana Mukherjee

The novel coronavirus disease (COVID-19) has created immense threats to public health on various levels around the globe. The unpredictable outbreak of this disease and the pandemic situation are causing severe depression, anxiety and other mental as well as physical health-related problems among human beings. To combat this disease, vaccination is essential, as it boosts the immune system of people who come into contact with infected individuals. The vaccination process is thus necessary to confront the outbreak of COVID-19. This deadly disease has put the social and economic condition of the entire world under enormous strain. Worldwide vaccination progress should be tracked to identify how fast economic and social life will be stabilized. To monitor the vaccination progress, a machine learning based regressor model is proposed in this study. This tracking process was applied to data from 14th December, 2020 to 24th April, 2021. Several ensemble-based machine learning regressor models, namely Random Forest, Extra Trees, Gradient Boosting, AdaBoost and Extreme Gradient Boosting, are implemented and their predictive performance compared. The comparative study reveals that the AdaBoost regressor performs best, with a minimized mean absolute error (MAE) of 9.968 and root mean squared error (RMSE) of 11.133.


2021 ◽  
pp. 0958305X2110449
Author(s):  
Irfan Ullah ◽  
Kai Liu ◽  
Toshiyuki Yamamoto ◽  
Rabia Emhamed Al Mamlook ◽  
Arshad Jamal

The rapid growth of the transportation sector and related emissions are attracting the attention of policymakers to ensure environmental sustainability. Therefore, it is extremely important to comprehend the driving factors of transport emissions. The role of electric vehicles is imperative amid rising transport emissions. Electric vehicles pave the way towards a low-carbon economy and sustainable environment. Successful deployment of electric vehicles relies heavily on energy consumption models that can predict energy consumption efficiently and reliably. Improving electric vehicles’ energy consumption efficiency will significantly help to alleviate driver anxiety and provide an essential framework for operation, planning, and management of the charging infrastructure. To tackle the challenge of electric vehicle energy consumption prediction, this study employs advanced machine learning models, extreme gradient boosting and light gradient boosting machine, and compares them with traditional machine learning models, multiple linear regression and artificial neural network. The electric vehicle energy consumption data in the analysis were collected in Aichi Prefecture, Japan. To evaluate the performance of the prediction models, three evaluation metrics were used: coefficient of determination (R2), root mean square error, and mean absolute error. The prediction outcome shows that extreme gradient boosting and light gradient boosting machine provided better and more robust results than multiple linear regression and artificial neural network, yielding higher R2 values and lower mean absolute error and root mean square error values. However, the results demonstrate that the light gradient boosting machine outperformed the extreme gradient boosting model.
A detailed feature importance analysis was carried out to demonstrate the impact and relative influence of different input variables on electric vehicle energy consumption prediction. The results imply that an advanced machine learning model can enhance the prediction performance of electric vehicle energy consumption.
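Feature importance can be measured in several ways; one model-agnostic sketch is permutation importance: shuffle one input column and measure how much the error grows. This is illustrative only; gradient-boosting libraries also expose built-in split- or gain-based importances, and the paper does not state which variant it used:

```python
import random

def permutation_importance(model, X, y, feature_idx, metric, n_repeats=10, seed=0):
    """Average increase in error when one feature column is shuffled.
    model: callable row -> prediction; metric: (y_true, y_pred) -> error."""
    rng = random.Random(seed)
    base = metric(y, [model(row) for row in X])
    increases = []
    for _ in range(n_repeats):
        col = [row[feature_idx] for row in X]
        rng.shuffle(col)
        Xp = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, col)]
        increases.append(metric(y, [model(row) for row in Xp]) - base)
    return sum(increases) / n_repeats

def mean_abs_error(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

# Toy model that uses only feature 0 (say, speed) and ignores feature 1
model = lambda row: 2 * row[0]
X = [[1, 5], [2, 6], [3, 7], [4, 8]]
y = [2, 4, 6, 8]
print(permutation_importance(model, X, y, 0, mean_abs_error))  # typically > 0
print(permutation_importance(model, X, y, 1, mean_abs_error))  # exactly 0
```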


Author(s):  
Fernando A. Acosta Pérez ◽  
Gabriel E. Rodríguez Ortiz ◽  
Everson Rodríguez Muñiz ◽  
Fernando J. Ortiz Sacarello ◽  
Jee Eun Kang ◽  
...  

The productivity of paratransit systems could be improved if transit agencies had the tools to accurately predict which trip reservations are likely to result in trips. A potentially useful approach to this prediction task is the use of machine learning algorithms, which are routinely applied in, for example, the airline and hotel industries to make predictions on reservation outcomes. In this study, the application of machine learning (ML) algorithms is examined for two prediction problems that are of interest to paratransit operations. In the first problem the operator is only concerned with predicting which reservations will result in trips and which ones will not, while in the second prediction problem the operator is interested in more than two reservation outcomes. Logistic regression, random forest, gradient boosting, and extreme gradient boosting were the main machine learning algorithms applied in this study. In addition, a clustering-based approach was developed to assign outcome probabilities to trip reservations. Using trip reservation data provided by the Metropolitan Bus Authority of Puerto Rico, tests were conducted to examine the predictive accuracy of the selected algorithms. The gradient boosting and extreme gradient boosting algorithms were the best performing methods in the classification tests. In addition, to illustrate an application of the algorithms, demand forecasting models were generated and shown to be a promising approach for predicting daily trips in paratransit systems. The best performing method in this exercise was a regression model that optimally combined the demand predictions generated by the machine learning algorithms considered in this study.
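The final step, a regression model that optimally combines the base models’ demand predictions, is a simple form of stacking. A minimal sketch for two base predictors, fitting the combining weights by least squares via the 2×2 normal equations (names and data are hypothetical, not from the study):

```python
def combine_two(pred_a, pred_b, actual):
    """Fit weights (w_a, w_b) minimizing the squared error of
    w_a*pred_a + w_b*pred_b against actual demand, via normal equations.
    A toy, intercept-free stand-in for the paper's combining regression."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, actual))
    sby = sum(b * y for b, y in zip(pred_b, actual))
    det = saa * sbb - sab * sab  # must be nonzero (predictors not collinear)
    wa = (say * sbb - sby * sab) / det
    wb = (sby * saa - say * sab) / det
    return wa, wb

# Hypothetical daily trip forecasts from two base models, and actual trips
wa, wb = combine_two([100, 120, 90], [110, 100, 95], [105.0, 110.0, 92.5])
print(wa, wb)
```

In practice one would use a library least-squares solver and possibly an intercept; the closed form above is just to make the combination step concrete.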


2019 ◽  
Author(s):  
Kasper Van Mens ◽  
Joran Lokkerbol ◽  
Richard Janssen ◽  
Robert de Lange ◽  
Bea Tiemens

BACKGROUND It remains a challenge to predict which treatment will work for which patient in mental healthcare. OBJECTIVE In this study we compare machine learning algorithms to predict during treatment which patients will not benefit from brief mental health treatment and present trade-offs that must be considered before an algorithm can be used in clinical practice. METHODS Using an anonymized dataset containing routine outcome monitoring data from a mental healthcare organization in the Netherlands (n = 2,655), we applied three machine learning algorithms to predict treatment outcome. The algorithms were internally validated with cross-validation on a training sample (n = 1,860) and externally validated on an unseen test sample (n = 795). RESULTS The performance of the three algorithms did not significantly differ on the test set. With a default classification cut-off at 0.5 predicted probability, the extreme gradient boosting algorithm showed the highest positive predictive value (PPV) of 0.71 (0.61–0.77), with a sensitivity of 0.35 (0.29–0.41) and area under the curve of 0.78. A trade-off can be made between PPV and sensitivity by choosing different cut-off probabilities. With a cut-off at 0.63, the PPV increased to 0.87 and the sensitivity dropped to 0.17. With a cut-off at 0.38, the PPV decreased to 0.61 and the sensitivity increased to 0.57. CONCLUSIONS Machine learning can be used to predict treatment outcomes based on routine monitoring data. This allows practitioners to choose their own trade-off between being selective and more certain versus inclusive and less certain.
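The cut-off trade-off described above can be made concrete: both PPV and sensitivity are functions of where the probability threshold is placed. A minimal sketch with invented predictions (not the study's data):

```python
def ppv_sensitivity(probs, labels, cutoff):
    """Positive predictive value and sensitivity at a probability cut-off.
    labels: 1 = the predicted event occurred (e.g. patient did not benefit)."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < cutoff and y == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sens = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sens

# Hypothetical predicted probabilities and outcomes
probs = [0.9, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
print(ppv_sensitivity(probs, labels, 0.5))   # lower cut-off: more inclusive
print(ppv_sensitivity(probs, labels, 0.65))  # higher cut-off: more selective
```

Raising the cut-off typically trades sensitivity for PPV, which is exactly the choice the abstract leaves to the practitioner.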


2021 ◽  
Vol 13 (5) ◽  
pp. 1021
Author(s):  
Hu Ding ◽  
Jiaming Na ◽  
Shangjing Jiang ◽  
Jie Zhu ◽  
Kai Liu ◽  
...  

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies is widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. Comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.

