Similarity-based error prediction approach for real-time inflow forecasting

2013 ◽  
Vol 45 (4-5) ◽  
pp. 589-602 ◽  
Author(s):  
Mahmood Akbari ◽  
Abbas Afshar

Despite extensive research on hydrologic forecasting models, updating the outputs of these models remains a major challenge. Most existing output-updating methods rely on the presence of persistence in the errors. This paper presents an alternative approach to updating the outputs of forecasting models in order to produce more accurate forecasts. The approach uses the concept of similarity in errors for error prediction. The K nearest neighbor (KNN) algorithm is employed as a similarity-based error prediction model and is improved using new data, and two other forms of the KNN are developed in this study. The KNN models are applied to the error prediction of flow forecasting models in two catchments, and the updated flows are compared with those of persistence-based methods such as autoregressive (AR) and artificial neural network (ANN) models. The results show that similarity-based error prediction models are an efficient alternative for real-time inflow forecasting, especially where the persistence in the error series of the flow forecasting model is relatively low.
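The idea of predicting an error from similar past errors can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the lag-window representation of an "error pattern", the Euclidean distance, and the unweighted average of successors are all assumptions made for illustration.

```python
import math


def knn_error_prediction(errors, lag=3, k=2):
    """Predict the next forecast error from the k most similar past error patterns.

    `errors` is a chronological list of past errors (observed minus forecast).
    The last `lag` errors form the query pattern (an assumed representation).
    """
    if len(errors) < lag + 1:
        raise ValueError("need at least lag + 1 past errors")
    query = errors[-lag:]
    candidates = []
    # Consider every past window of length `lag` whose following error is known.
    for start in range(len(errors) - lag):
        window = errors[start:start + lag]
        successor = errors[start + lag]
        candidates.append((math.dist(window, query), successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    # Unweighted mean of the errors that followed the most similar patterns.
    return sum(successor for _, successor in nearest) / len(nearest)


def update_forecast(raw_forecast, errors, lag=3, k=2):
    """Correct a raw model forecast by adding the predicted error."""
    return raw_forecast + knn_error_prediction(errors, lag, k)
```

Because the neighbors are selected by pattern similarity rather than by recency, a sketch like this does not depend on the error series being strongly autocorrelated, which is the situation the abstract singles out.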

Mathematics ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 830
Author(s):  
Seokho Kang

k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.
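To make the three hyperparameters named above concrete, the following sketch shows a conventional kNN classifier in which k, the distance function, and the weighting function are explicit arguments. It is a plain baseline, not kNNGNN; the function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def knn_classify(train, query, k=3, distance=math.dist, weight=lambda d: 1.0):
    """Classify `query` by a weighted vote among its k nearest training points.

    `train` is a list of (features, label) pairs. `distance` and `weight`
    are the two functions that a conventional kNN needs the user to choose.
    """
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        votes[label] += weight(distance(features, query))
    return max(votes, key=votes.get)


# Hypothetical one-dimensional data: one "a" point close to the query, two "b" points farther away.
train = [((1.0,), "a"), ((3.0,), "b"), ((3.5,), "b")]
query = (1.5,)
inverse_distance = lambda d: 1.0 / (d + 1e-9)
```

On this toy data the predicted label changes with the settings: k=1 gives "a", k=3 with uniform weights gives "b", and k=3 with inverse-distance weights gives "a" again. This is the kind of sensitivity the abstract describes.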


Author(s):  
Piotr Szczuko ◽  
Adam Kurowski ◽  
Piotr Odya ◽  
Andrzej Czyżewski ◽  
Bożena Kostek ◽  
...  

Abstract: The described application of granular computing is motivated by the fact that cardiovascular disease (CVD) remains a major killer globally. There is increasing evidence that abnormal respiratory patterns might contribute to the development and progression of CVD. Consequently, a method that would support a physician in respiratory pattern evaluation should be developed. Group decision-making, tri-way reasoning, and rough set–based analysis were applied to granular computing. Signal attributes and anthropometric parameters were explored to develop prediction models to determine the percentage contribution of periodic-like, intermediate, and normal breathing patterns in the analyzed signals. The proposed methodology was validated employing k-nearest neighbor (k-NN) and UMAP (uniform manifold approximation and projection). The presented approach applied to respiratory pattern evaluation shows that median accuracies in a considerable number of cases exceeded 0.75. Overall, parameters related to signal analysis are indicated as more important than anthropometric features. It was also found that obesity characterized by a high WHR (waist-to-hip ratio) and male sex were predisposing factors for the occurrence of periodic-like or intermediate patterns of respiration. This may be among the essential findings derived from this study. Based on classification measures, a physician may use such a methodology as an aid to respiratory pattern evaluation.


2019 ◽  
Vol 13 (1) ◽  
pp. 141-150
Author(s):  
Jinhwan Jang

Background: Real-time Travel Time (TT) information has become an essential component of daily life in modern society. With reliable TT information, road users can increase their productivity by choosing less congested routes or adjusting their trip schedules. Drivers normally prefer departure time-based TT, but most agencies in Korea still provide arrival time-based TT with probe data from Dedicated Short-Range Communications (DSRC) scanners due to a lack of robust prediction techniques. Recently, interest has focused on the conventional k-nearest neighbor (k-NN) method that uses the Euclidean distance for real-time TT prediction. However, conventional k-NN still shows some deficiencies under certain conditions. Methods: This article identifies the cases where conventional k-NN has shortcomings and proposes an improved k-NN method that employs a correlation coefficient as a measure of distance and applies a regression equation to compensate for the difference between current and historical TT. Results: The superiority of the suggested method over conventional k-NN was verified using DSRC probe data gathered on a signalized suburban arterial in Korea, resulting in a decrease in TT prediction error of 3.7 percentage points on average. Performance during transition periods, where TTs fall immediately after rising, showed statistically significant differences in paired t-tests at a significance level of 0.05, yielding p-values of 0.03 and 0.003 for two-day data. Conclusion: The method presented in this study can enhance the accuracy of real-time TT information and consequently improve the productivity of road users.
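The two ingredients named in the Methods sentence, a correlation-based distance and a regression that compensates for the level difference between current and historical TT, can be sketched as follows. The abstract does not give the actual equations, so everything here is an assumption: the distance is taken as 1 minus the Pearson correlation, the regression is a least-squares line fitted from the matched historical window to the current window, and all function names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return my, 0.0
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def predict_travel_time(current, history, horizon=1, k=1):
    """Predict TT `horizon` steps ahead from the k best-correlated historical windows."""
    m = len(current)
    scored = []
    for day in history:
        for start in range(len(day) - m - horizon + 1):
            window = day[start:start + m]
            future = day[start + m + horizon - 1]
            scored.append((1.0 - pearson(window, current), window, future))
    if not scored:
        raise ValueError("history is too short for this window and horizon")
    scored.sort(key=lambda item: item[0])
    predictions = []
    for _, window, future in scored[:k]:
        # Map the historical level onto the current level before using the historical future value.
        intercept, slope = fit_line(window, current)
        predictions.append(intercept + slope * future)
    return sum(predictions) / len(predictions)


# Hypothetical data: today's TTs follow the same shape as a past day but are 20% higher.
history = [[100.0, 110.0, 130.0, 160.0, 150.0]]
current = [120.0, 132.0, 156.0]
prediction = predict_travel_time(current, history)
```

In this toy case the best-correlated window is the first three historical values, and the fitted line scales the historical next value of 160 up to about 192.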


2020 ◽  
Author(s):  
Hamza Turabieh ◽  
Alaa Sheta ◽  
Malik Braik ◽  
Elvira Kovač-Andrić

To meet national air quality standards, many countries have created emissions monitoring strategies for air quality. Nowadays, policymakers and air quality executives depend on scientific computation and prediction models to monitor the causes of air pollution, especially in industrial cities. Air pollution is considered one of the primary problems that can cause many human health problems such as asthma, damage to the lungs, and even death. In this study, we investigate the development of forecasting models for air pollutant attributes including Particulate Matter (PM2.5, PM10), ground-level Ozone (O3), and Nitrogen Dioxide (NO2). The dataset used was collected from the city of Dubrovnik, which is located in the south of Croatia. The collected data has missing values. Therefore, we suggest the use of a Layered Recurrent Neural Network (L-RNN) to impute the missing value(s) of air pollutant attributes and then build forecasting models. We adopted four regression models to forecast air pollutant attributes: Multiple Linear Regression (MLR), Decision Tree Regression (DTR), Artificial Neural Network (ANN), and L-RNN. The obtained results show that the proposed method enhances the overall performance of the other forecasting models.


2020 ◽  
Author(s):  
Nazrul Anuar Nayan ◽  
Hafifah Ab Hamid ◽  
Mohd Zubir Suboh ◽  
Noraidatulakma Abdullah ◽  
Rosmina Jaafar ◽  
...  

Background: Cardiovascular disease (CVD) is the leading cause of death worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. Results: This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. Conclusions: In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.
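For readers unfamiliar with the three reported measures, the sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts. The counts in the example are hypothetical: the abstract reports only the percentages and the total of 60 participants, not the split between cases and controls or the evaluation protocol.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of true controls that are cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts assuming an even 30/30 split of cases and controls.
sensitivity, specificity, accuracy = classification_rates(tp=27, fn=3, tn=27, fp=3)
```

With these assumed counts all three measures come out at 0.9, which shows one way the reported figures could coincide; other splits would require different counts.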


Water ◽  
2020 ◽  
Vol 12 (2) ◽  
pp. 440 ◽  
Author(s):  
Moyang Liu ◽  
Yingchun Huang ◽  
Zhijia Li ◽  
Bingxing Tong ◽  
Zhentao Liu ◽  
...  

Flow forecasting is an essential topic for flood prevention and mitigation. This study utilizes a data-driven approach, the Long Short-Term Memory neural network (LSTM), to simulate rainfall–runoff relationships for catchments with different climate conditions. The LSTM method presented was tested in three catchments with distinct climate zones in China. The recurrent neural network (RNN) was adopted for comparison to verify the superiority of the LSTM model in time series prediction problems. The results of LSTM were also compared with a widely used process-based model, the Xinanjiang model (XAJ), as a benchmark to test the applicability of this novel method. The results suggest that LSTM can provide predictions of comparable quality to those of the XAJ model and can be considered an efficient hydrology modeling approach. A real-time forecasting approach coupled with the k-nearest neighbor (KNN) algorithm as an updating method was proposed in this study to extend the LSTM method to flood forecasting in a decision support system. We compared the simulation results of the LSTM and the LSTM-KNN model, which demonstrated the effectiveness of the LSTM-KNN model in the study areas and underscored the potential of the proposed model for real-time flood forecasting.


2019 ◽  
Vol 9 (20) ◽  
pp. 4448 ◽  
Author(s):  
İş ◽  
Tuncer

This article considers methodological approaches to detecting and preventing social media manipulation specific to Twitter. Behavioral analyses of Twitter users were performed by using their profile structures and interaction types, and Twitter users were classified according to their effect size values by determining their asset values. User profiles were classified into three different categories, namely popular-active, observer-passive, and spam-bot-malicious, by using k-nearest neighbor (K-NN), support vector machine (SVM), and artificial neural network (ANN) algorithms. For classification, the study used the basic characteristics of users, such as density, centralization, and diameter, as well as suggested time series such as the simple moving average and cumulative moving average. The highest accuracy was obtained by the K-NN algorithm. The F1-Score values obtained with K-NN for all classes were higher than those obtained for the other algorithms. According to the results obtained, classification accuracy values reached a maximum of 96.81% and a minimum of 92.33%. Our classification results showed that the proposed method was satisfactory for separating popular-active, observer-passive, and spam-bot-malicious accounts.
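The two time-series features mentioned above have standard definitions, sketched here. How the study applies them to user activity is not stated in the abstract, so the function names and the sample series are illustrative assumptions.

```python
def simple_moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def cumulative_moving_average(values):
    """Mean of all values seen so far, at every position."""
    averages = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


# Hypothetical activity counts for one account over four periods.
activity = [2, 4, 6, 8]
sma = simple_moving_average(activity, window=2)  # [3.0, 5.0, 7.0]
cma = cumulative_moving_average(activity)        # [2.0, 3.0, 4.0, 5.0]
```

The simple moving average follows recent behavior, whereas the cumulative moving average summarizes the whole history, so the two can diverge for accounts whose activity changes abruptly.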


Author(s):  
Nayan Nazrul Anuar ◽  
Ab Hamid Hafifah ◽  
Suboh Mohd Zubir ◽  
Abdullah Noraidatulakma ◽  
Jaafar Rosmina ◽  
...  

Cardiovascular disease (CVD) is the leading cause of death worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.

