A Multivariate Multi-Step LSTM Forecasting Model For Tuberculosis Incidence With Model Explanation In Liaoning Province, China

Author(s):  
Enbin Yang ◽  
Hao Zhang ◽  
Xinsheng Guo ◽  
Zinan Zang ◽  
Zhen Liu ◽  
...  

Abstract Background: Aside from COVID-19, tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of incidence prediction. Results: In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and the SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. The Autoregressive Integrated Moving Average (ARIMA) model and the seasonal ARIMA model are also established. The multi-step ARIMA-LSTM model is proposed for the first time to examine the performance of each model in the short, medium, and long term, respectively. Compared with the ARIMA model, the multivariate 2-step LSTM model reduces the four errors by 12.92%, 15.94%, 15.97%, and 14.81%, respectively, in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with the four errors decreased to 15.19%, 33.14%, 36.79%, and 29.76%, respectively, in the medium and long term. We also provide, for the first time in the field of incidence prediction, local and global explanations of the multivariate single-step LSTM model. Conclusions: The multivariate 2-step LSTM model is suitable for short-term forecasts, and the 3-step ARIMA-LSTM model is appropriate for medium- and long-term forecasts. In addition, the predictive performance was better than that of similar TB incidence forecasting models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.
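The four accuracy measures named in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the sMAPE shown is one common variant (the abstract does not say which definition was used).

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE, in percent; assumed variant: 2|a - p| / (|a| + |p|)."""
    terms = (2.0 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted))
    return 100.0 * sum(terms) / len(actual)

# Toy values for illustration only, not data from the study.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(rmse(actual, predicted), mae(actual, predicted))
print(mape(actual, predicted), smape(actual, predicted))
```

RMSE and MAE are in the units of the series, while MAPE and sMAPE are scale-free, which is why abstracts in this listing often report several of them together.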

2021 ◽  
Author(s):  
Enbin Yang ◽  
Hao Zhang ◽  
Xinsheng Guo ◽  
Zhen Liu ◽  
Yuanning Liu

Abstract Purpose: Aside from COVID-19, tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of incidence prediction. Results: In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and the SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. The Autoregressive Integrated Moving Average (ARIMA) model and the seasonal ARIMA model are also established. The multi-step ARIMA-LSTM model is proposed for the first time to examine the performance of each model in the short, medium, and long term, respectively. Compared with the ARIMA model, the multivariate 2-step LSTM model reduces the four errors by 12.92%, 15.94%, 15.97%, and 14.81%, respectively, in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with the four errors decreased to 15.19%, 33.14%, 36.79%, and 29.76%, respectively, in the medium and long term. We also provide, for the first time in the field of incidence prediction, local and global explanations of the multivariate single-step LSTM model. Conclusion: The multivariate 2-step LSTM model is suitable for short-term forecasts, and the 3-step ARIMA-LSTM model is appropriate for medium- and long-term forecasts. In addition, the predictive performance was better than that of similar TB incidence forecasting models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.
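A multivariate multi-step model of this kind is usually trained on sliding windows: several past time steps of all features as input, and the next few values of the target as output. The sketch below shows only that data-preparation step, under assumptions: the helper name `make_windows`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
def make_windows(rows, target_idx, n_in, n_out):
    """Build (input, output) pairs for multi-step forecasting.

    rows       : list of feature vectors, one per time step
    target_idx : column index of the series to forecast
    n_in       : number of past time steps fed to the model
    n_out      : number of future steps to predict (2 for a "2-step" model)
    """
    inputs, outputs = [], []
    for start in range(len(rows) - n_in - n_out + 1):
        past = rows[start:start + n_in]
        future = rows[start + n_in:start + n_in + n_out]
        inputs.append(past)                              # shape: n_in x n_features
        outputs.append([r[target_idx] for r in future])  # shape: n_out
    return inputs, outputs

# Toy series: column 0 stands in for incidence, column 1 for a covariate.
rows = [[i, 10 * i] for i in range(6)]
X, y = make_windows(rows, target_idx=0, n_in=3, n_out=2)
print(len(X), y)
```

The resulting lists can be converted to arrays of shape (samples, n_in, n_features) and (samples, n_out), the layout most LSTM implementations expect.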


2021 ◽  
pp. 1-13
Author(s):  
Muhammad Rafi ◽  
Mohammad Taha Wahab ◽  
Muhammad Bilal Khan ◽  
Hani Raza

Automated Teller Machines (ATMs) are still widely used to dispense cash to customers. ATM cash replenishment is the process of refilling an ATM with a specific amount of cash. Because of fluctuating user demand and seasonal patterns, keeping the optimal amount of cash in each ATM is a challenging problem for financial institutions. In this paper, we present a time series model based on the Auto Regressive Integrated Moving Average (ARIMA) technique, called Time Series ARIMA Model for ATM (TASM4ATM). This study used historical back-end ATM refilling data from 6 different financial organizations in Pakistan. There are 2040 distinct ATMs, and 18 months of replenishment data from these ATMs are used to train the proposed model. The model is compared with state-of-the-art models such as a Recurrent Neural Network (RNN) and Amazon’s DeepAR model. Two approaches are used for forecasting: (i) single ATMs and (ii) clusters of ATMs (in which ATMs with similar cash demands are grouped together). The Mean Absolute Percentage Error (MAPE) and Symmetric Mean Absolute Percentage Error (SMAPE) are used to evaluate the models. The proposed model forecasts considerably better than the models it is compared with, producing average MAPE/SMAPE values of 7.86/7.99 on individual ATMs and 6.57/6.64 on clusters of ATMs.


Jurnal Varian ◽  
2020 ◽  
Vol 3 (2) ◽  
pp. 113-124
Author(s):  
Ulil Azmi ◽  
Wawan Hafid Syaifudin

Gold, copper, and oil are commodities widely sought by investors as investment vehicles. Commodity price forecasts are very useful for investors assessing the future prospects of commodity investments in a company. Commodity prices are unstable, a characteristic commonly called volatility. To address this problem, forecasting was carried out with the ARIMA and ARIMA-GARCH methods. These two methods were chosen because they are suited to forecasting quantities with strong historical data. The ARIMA ARCH-GARCH method is better suited to data with high volatility or heteroskedasticity in the residuals, so its predictions are more accurate. This is shown by its AIC value, which is smaller than that obtained using ARIMA alone. The best model for gold is ARIMA(0,1,1)-GARCH(1,1), for copper ARIMA(2,1,2)-GARCH(1,1), and for oil ARIMA(1,1,1)-GARCH(0,1). The MAPE (Mean Absolute Percentage Error) values are 1.113, 0.542, and 1.158 for gold, copper, and oil, respectively.


2020 ◽  
Vol 6 (10) ◽  
pp. FSO634
Author(s):  
George E Saulnier ◽  
Janna C Castro ◽  
Curtiss B Cook

Aim: Evaluate forecasting models applied to smaller geographic locations within the hospital. Materials & methods: Damped trend models were applied to blood glucose measurements of progressively smaller inpatient geographic subpopulations. Mean absolute percentage error (MAPE) and 95% prediction intervals (PIs) were used to assess the validity of the models' forecasts 48 weeks into the future. Results: MAPE values increased, and 95% PIs widened, when data from progressively smaller geographic areas were analyzed. MAPE values were highest and 95% PIs were broadest with the smallest geographic areas. In contrast, observations missed at larger geographical locations were more evident with smaller subpopulations. Conclusion: The utility of damped trend models to forecast inpatient glucose control diminished when applied to smaller geographic areas within the hospital.
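For readers unfamiliar with damped trend models, the sketch below implements the standard additive damped-trend form of exponential smoothing. It is only an illustration under assumptions: the smoothing parameters `alpha`, `beta`, and `phi` are fixed by hand rather than fitted, the initialization is a simple heuristic, and nothing here reproduces the study's actual model or data.

```python
def damped_trend_forecast(series, horizon, alpha=0.5, beta=0.5, phi=0.9):
    """Additive damped-trend exponential smoothing with hand-picked parameters.

    alpha : level smoothing, beta : trend smoothing, phi : damping factor (0 < phi < 1)
    Returns point forecasts for steps 1..horizon after the end of `series`.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # heuristic initial trend
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts

# Toy series for illustration only.
print(damped_trend_forecast([1.0, 2.0, 3.0, 4.0], horizon=3))
```

Because each extra step adds a trend contribution shrunk by another factor of `phi`, long-horizon forecasts flatten out instead of extrapolating the trend indefinitely, which is the property that makes damped models attractive for horizons such as the 48 weeks mentioned above.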


2019 ◽  
Vol 11 (1) ◽  
pp. 6-10
Author(s):  
Michael Saputra Suryono ◽  
Raymond Oetama

Forex, or foreign exchange, is the trading of one country's currency for another country's currency. The purpose of this study is to test the accuracy of ARIMA on the GBP/USD currency pair. In addition, this research is expected to contribute knowledge about forecasting with ARIMA. The study produced forecasts of the GBP/USD currency pair at 1-month steps over the 6 months from January 2018 to June 2018, using the ARIMA method and R software. The data used cover January 2013 to June 2018. The process follows the KDD (Knowledge Discovery in Databases) process. The results identify the ARIMA(3,2,1) model as the best model for 1-month forecasts over 6 months on the GBP/USD currency pair, because it has the lowest AIC value, with a mean absolute percentage error of 3.16%.


Author(s):  
Ruby Mae Ebuna Maliberan

The study attempted to forecast the number of tourist arrivals in the province of Surigao del Sur using historical monthly tourist arrival data from 2012-2016 and three time series. Findings showed that tourist arrivals in the province are likely to increase, as the forecast model predicts more foreign and local tourist arrivals. Furthermore, the findings showed a long-term increasing trend in tourist arrivals in the province. The Mean Absolute Percentage Error (MAPE) of the forecasted tourist arrival data was 11%, which means that the predicted data are close to the actual data. Based on the findings of the study, the researcher recommends that this study be adapted by other tourism offices in CARAGA, Philippines.


2016 ◽  
Vol 39 ◽  
Author(s):  
Mary C. Potter

Abstract: Rapid serial visual presentation (RSVP) of words or pictured scenes provides evidence for a large-capacity conceptual short-term memory (CSTM) that momentarily provides rich associated material from long-term memory, permitting rapid chunking (Potter 1993; 2009; 2012). In perception of scenes as well as language comprehension, we make use of knowledge that briefly exceeds the supposed limits of working memory.


2020 ◽  
Vol 29 (4) ◽  
pp. 710-727
Author(s):  
Beula M. Magimairaj ◽  
Naveen K. Nagaraj ◽  
Alexander V. Sergeev ◽  
Natalie J. Benafield

Objectives: School-age children with and without parent-reported listening difficulties (LiD) were compared on auditory processing, language, memory, and attention abilities. The objective was to extend what is known so far in the literature about children with LiD by using multiple measures and selective novel measures across the above areas. Design: Twenty-six children who were reported by their parents as having LiD and 26 age-matched typically developing children completed clinical tests of auditory processing and multiple measures of language, attention, and memory. All children had normal-range pure-tone hearing thresholds bilaterally. Group differences were examined. Results: In addition to significantly poorer speech-perception-in-noise scores, children with LiD had reduced speed and accuracy of word retrieval from long-term memory, poorer short-term memory, sentence recall, and inferencing ability. Statistically significant group differences were of moderate effect size; however, standard test scores of children with LiD were not clinically poor. No statistically significant group differences were observed in attention, working memory capacity, vocabulary, and nonverbal IQ. Conclusions: Mild signal-to-noise ratio loss, as reflected by the group mean of children with LiD, supported the children's functional listening problems. In addition, children's relative weakness in select areas of language performance, short-term memory, and long-term memory lexical retrieval speed and accuracy added to previous research on evidence-based areas that need to be evaluated in children with LiD who almost always have heterogeneous profiles. Importantly, the functional difficulties faced by children with LiD in relation to their test results indicated, to some extent, that commonly used assessments may not be adequately capturing the children's listening challenges. Supplemental Material: https://doi.org/10.23641/asha.12808607


Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3299
Author(s):  
Ashish Shrestha ◽  
Bishal Ghimire ◽  
Francisco Gonzalez-Longatt

With the massive penetration of electronic power converter (EPC)-based technologies, numerous issues are being noticed in the modern power system that may directly affect system dynamics and operational security. The estimation of system performance parameters is especially important for transmission system operators (TSOs) in order to operate a power system securely. This paper presents a Bayesian model to forecast short-term kinetic energy time series data for a power system, which can thus help TSOs to operate the respective power system securely. A Markov chain Monte Carlo (MCMC) method with the No-U-Turn sampler is used, and Stan’s limited-memory Broyden–Fletcher–Goldfarb–Shanno (LM-BFGS) algorithm is used as the optimization method. The concept of decomposable time series modeling is adopted to analyze the seasonal characteristics of the datasets, and numerous performance metrics are used for model validation. In addition, an autoregressive integrated moving average (ARIMA) model is used as a baseline against which the results of the presented model are compared. Finally, the optimal size of the training dataset required to forecast the 30-min values of the kinetic energy with a low error is identified. In this study, one-year univariate data (1-min resolution) for the integrated Nordic power system (INPS) are used to forecast the kinetic energy for sequences of 30 min (i.e., short-term sequences). Performance evaluation metrics such as the root-mean-square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and mean absolute scaled error (MASE) of the proposed model are calculated to be 4.67, 3.865, 0.048, and 8.15, respectively. In addition, the performance metrics can be improved by up to 3.28, 2.67, 0.034, and 5.62, respectively, by increasing MCMC sampling. Similarly, 180.5 h of historic data is sufficient to forecast short-term results for the case study here with an accuracy of 1.54504 for the RMSE.
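Of the metrics listed in this abstract, the mean absolute scaled error (MASE) is the least familiar. The sketch below uses its usual definition: the forecast MAE divided by the in-sample MAE of a naive forecast that repeats the value from `season` steps earlier. The function name, the `season` parameter, and the toy data are assumptions for illustration; the abstract does not specify how the paper scaled its MASE.

```python
def mase(train, actual, predicted, season=1):
    """Mean Absolute Scaled Error.

    train     : in-sample history used to scale the error
    actual    : observed values over the forecast horizon
    predicted : forecast values over the same horizon
    season    : lag of the naive benchmark (1 = previous value)
    """
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    forecast_mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return forecast_mae / naive_mae

# Toy values for illustration only, not data from the study.
print(mase(train=[1.0, 2.0, 4.0, 7.0], actual=[8.0, 9.0], predicted=[9.0, 11.0]))
```

A MASE below 1 means the forecast has a smaller average error than the naive benchmark had on the training data; a value above 1 means it has a larger one.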

