A Data-Driven Two-Stage Prediction Model for Train Primary-Delay Recovery Time

Author(s):  
Bowen Gao ◽  
Dongxiu Ou ◽  
Decun Dong ◽  
Yusen Wu

Accurate prediction of train delay recovery is critical for railway incident management and for providing passengers with accurate journey times. In this paper, a two-stage prediction model is proposed to predict the recovery time of train primary delays based on real High-Speed Railway (HSR) records. In Stage 1, two models are built to study the influence of feature space and model framework on the prediction accuracy of buffer time in each section or station. It is found that explicitly inputting the attribute features of stations and sections to the model, instead of simulating them implicitly, effectively improves prediction accuracy. For validation, the proposed model is compared with several alternative models, namely Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Gradient Boosting Tree (GBT). The results show that it outperforms these schemes. Specifically, when the error tolerance is extended to 3 min, the proposed model achieves an accuracy of up to 94.63%. This indicates that the method has high value in practical engineering applications. Because train delay propagation is a complex process, future work will focus on building a delay propagation knowledge base and a dispatcher experience knowledge base.
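The accuracy figure quoted in the abstract above is tied to an error tolerance in minutes. A minimal sketch of one way such a tolerance-based accuracy could be computed is given below; the function name and the sample recovery times are made up for illustration and are not taken from the paper.

```python
def accuracy_within(predicted, actual, tolerance_min):
    """Share of predictions whose absolute error is at most tolerance_min minutes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance_min)
    return hits / len(predicted)


# Hypothetical recovery times in minutes (illustrative values only).
predicted = [4.0, 7.5, 2.0, 10.0, 6.0]
actual = [5.0, 5.0, 2.5, 9.0, 12.0]

acc_1 = accuracy_within(predicted, actual, 1.0)  # strict tolerance
acc_3 = accuracy_within(predicted, actual, 3.0)  # relaxed tolerance
print(acc_1, acc_3)
```

Widening the tolerance can only keep or raise the score, which is why a reported accuracy is only meaningful together with its tolerance.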

2021 ◽  
Vol 16 (3) ◽  
pp. 285-296
Author(s):  
Y.D. Zhang ◽  
L. Liao ◽  
Q. Yu ◽  
W.G. Ma ◽  
K.H. Li

Accurate prediction of train delays is an important basis for the intelligent adjustment of train operation plans. This paper proposes a train delay prediction model that considers delay propagation features. The model consists of two parts. The first part extracts the delay propagation features. The best delay classification scheme is determined by clustering the delay types in historical data with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, and this scheme is combined with the k-nearest neighbor (KNN) algorithm to design a delay-type classification method for online data. The delay propagation factor is used to quantify the delay propagation relationship, and on this basis the horizontal and vertical delay propagation features are constructed. The second part is the delay prediction, which takes the train operation status features and delay propagation features as input and uses the gradient boosting decision tree (GBDT) algorithm to complete the prediction. The model was tested and simulated using actual train operation data and compared with random forest (RF), support vector regression (SVR) and multilayer perceptron (MLP) models. The results show that considering delay propagation features in the train delay prediction model can further improve the accuracy of train delay prediction. The proposed delay prediction model can provide a theoretical basis for intelligent railway dispatching, enabling dispatchers to control delays more reasonably and improving the quality of railway transportation services.
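The online classification step described above can be illustrated with a plain k-nearest-neighbor vote. The sketch below assumes the historical records already carry delay-type labels (in the paper these come from DBSCAN clustering, which is not reproduced here); the two features and the labels "minor" and "major" are hypothetical.

```python
import math
from collections import Counter


def knn_delay_type(history, query, k=3):
    """Return the majority label among the k historical records nearest to query.

    history: list of (feature_vector, label) pairs with labels assumed given.
    """
    ranked = sorted(history, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (delay in minutes, number of affected trains).
history = [
    ((2.0, 1.0), "minor"), ((3.0, 1.0), "minor"), ((4.0, 2.0), "minor"),
    ((20.0, 6.0), "major"), ((25.0, 8.0), "major"), ((30.0, 7.0), "major"),
]

small = knn_delay_type(history, (5.0, 2.0))
large = knn_delay_type(history, (22.0, 7.0))
print(small, large)
```

In practice the features would need scaling before distances are compared; that step is omitted to keep the sketch short.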


Author(s):  
K U Jaseena ◽  
Binsu C Kovoor

Accurate weather prediction is always a challenge for meteorologists. This paper suggests a Deep Neural Network (DNN) model to predict minimum and maximum temperature values based on various weather parameters such as humidity, dew point, and wind speed. The Particle Swarm Optimisation (PSO) algorithm is applied to select relevant and important features of the datasets to improve the prediction accuracy of the model. The grid search algorithm is employed to determine the hyperparameters of the proposed DNN model. The statistical indicators Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, Nash–Sutcliffe model efficiency coefficient, and Correlation Coefficient are used to evaluate the accuracy of the prediction model. The proposed model is compared with the Support Vector Machine (SVM) and Vector Autoregression (VAR) models. The experimental outcomes show that the proposed model optimised using PSO achieves better prediction accuracy than traditional approaches.
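The five statistical indicators named above have standard textbook definitions. A minimal sketch of them follows; the observed and predicted temperatures are invented for illustration, and the correlation coefficient is assumed to be Pearson's.

```python
import math


def mse(obs, pred):
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    # Percentage error; undefined when an observation is zero.
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance.
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs, pred):
    mean_o, mean_p = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted temperatures.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
print(mse(obs, pred), mae(obs, pred), mape(obs, pred), nse(obs, pred), pearson_r(obs, pred))
```

The first three indicators are better when lower; NSE and the correlation coefficient are better when closer to 1.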


2018 ◽  
Vol 11 (1) ◽  
pp. 64 ◽  
Author(s):  
Kyoung-jae Kim ◽  
Kichun Lee ◽  
Hyunchul Ahn

Measuring and managing the financial sustainability of borrowers is crucial to financial institutions for their risk management. As a result, building an effective corporate financial distress prediction model has long been an important research topic. Recently, researchers have worked to improve the accuracy of financial distress prediction models by applying various business analytics approaches, including statistical and artificial intelligence methods. Among them, support vector machines (SVMs) are becoming popular. SVMs require only small training samples and have little possibility of overfitting if model parameters are properly tuned. In addition, SVMs generally show high prediction accuracy since they can deal with complex nonlinear patterns. Despite these advantages, SVMs are often criticized because their architectural factors, such as the parameters of a kernel function and the subsets of appropriate features and instances, are determined by heuristics. In this study, we propose globally optimized SVMs, denoted by GOSVM, a novel hybrid SVM model designed to optimize feature selection, instance selection, and kernel parameters altogether. This study introduces a genetic algorithm (GA) in order to simultaneously optimize multiple heterogeneous design factors of SVMs. Our study applies the proposed model to a real-world case of predicting financial distress. Experiments show that the proposed model significantly improves the prediction accuracy of conventional SVMs.
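One way to let a genetic algorithm search over features, instances, and kernel parameters at once is to pack all three into a single chromosome. The sketch below shows such a decoding step only; the bit layout, the 4-bit parameter encoding, and the power-of-two ranges for `C` and `gamma` are assumptions for illustration, not the encoding used by GOSVM.

```python
def decode_chromosome(bits, n_features, n_instances, param_bits=4):
    """Split a flat bit list into feature mask, instance mask, and kernel parameters.

    Assumed layout: [feature bits | instance bits | C bits | gamma bits].
    """
    expected = n_features + n_instances + 2 * param_bits
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")

    feature_mask = bits[:n_features]
    instance_mask = bits[n_features:n_features + n_instances]
    tail = bits[n_features + n_instances:]

    def to_int(chunk):
        return int("".join(str(b) for b in chunk), 2)

    # Map each integer in 0..15 to a power of two (illustrative ranges).
    c = 2.0 ** (to_int(tail[:param_bits]) - 5)
    gamma = 2.0 ** (to_int(tail[param_bits:]) - 10)

    return {
        "features": [i for i, b in enumerate(feature_mask) if b],
        "instances": [i for i, b in enumerate(instance_mask) if b],
        "C": c,
        "gamma": gamma,
    }


# 3 feature bits, 4 instance bits, then 4 bits each for C and gamma.
bits = [1, 0, 1] + [1, 1, 0, 1] + [0, 1, 1, 1] + [1, 0, 0, 0]
decoded = decode_chromosome(bits, n_features=3, n_instances=4)
print(decoded)
```

A GA would then score each decoded candidate by training an SVM on the selected features and instances, and evolve the bit strings by selection, crossover, and mutation.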


2014 ◽  
Vol 610 ◽  
pp. 789-796
Author(s):  
Jiang Bao Li ◽  
Zhen Hong Jia ◽  
Xi Zhong Qin ◽  
Lei Sheng ◽  
Li Chen

To improve the prediction accuracy of busy telephone traffic, this study proposes a prediction method that combines wavelet transformation with a least squares support vector machine (LSSVM) model optimized by the particle swarm optimization (PSO) algorithm. First, the preprocessed busy telephone traffic data are decomposed with the Mallat algorithm into a low-frequency component and a high-frequency component. Second, each component is reconstructed and the PSO-LSSVM model is used to predict each reconstructed component. The busy telephone traffic prediction is then obtained from these component predictions. The experimental results show that the prediction model has higher prediction accuracy and stability.
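The decompose-then-reconstruct step can be sketched with a single level of the Mallat scheme. The abstract does not name the wavelet, so the Haar wavelet is assumed here for simplicity, and the traffic values are invented.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal):
    """One Mallat analysis level: approximation and detail coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


# Hypothetical hourly busy-traffic values.
traffic = [120.0, 130.0, 90.0, 70.0, 200.0, 220.0, 150.0, 150.0]
approx, detail = haar_decompose(traffic)
zeros = [0.0] * len(approx)

# Reconstruct each component on its own, back at the original length.
low = haar_reconstruct(approx, zeros)    # smooth trend
high = haar_reconstruct(zeros, detail)   # fast fluctuations
recombined = [l + h for l, h in zip(low, high)]
print(low, high)
```

Because the two reconstructed components add back up to the original series, a separate predictor can be fitted to each and the predictions summed.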


Healthcare ◽  
2021 ◽  
Vol 9 (10) ◽  
pp. 1334
Author(s):  
Hasan Symum ◽  
José Zayas-Castro

The timing of 30-day pediatric readmissions is skewed, with approximately 40% of the incidents occurring within the first week after hospital discharge. The skewed readmission time distribution, coupled with delays in health information exchange among healthcare providers, might leave limited time to devise a comprehensive intervention plan. However, pediatric readmission studies have thus far been limited to developing prediction models after hospital discharge. In this study, we proposed a novel pediatric readmission prediction model at the time of hospital admission, which can improve the high-risk patient selection process. We also compared the proposed models with the standard at-discharge readmission prediction model. Using the Hospital Cost and Utilization Project database, this prognostic study included pediatric hospital discharges in Florida from January 2016 through September 2017. Four machine learning algorithms were developed for at-admission and at-discharge models using a recursive feature elimination technique with a repeated cross-validation process: logistic regression with backward stepwise selection, decision tree, support vector machines (SVM) with the polynomial kernel, and gradient boosting. The performance of the at-admission and at-discharge models was measured by the area under the curve. The performance of the at-admission model was comparable with that of the at-discharge model for all four algorithms. SVM with the polynomial kernel outperformed all other algorithms for both at-admission and at-discharge models. Important features associated with increased readmission risk varied widely across the type of prediction model and were mostly related to patients’ demographics, social determinants, clinical factors, and hospital characteristics. The proposed at-admission readmission risk decision support model could give hospitals and providers additional time for intervention planning, particularly for interventions targeting social determinants of children’s overall health.
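The area under the ROC curve used as the performance measure above can be computed as the probability that a randomly chosen readmitted patient receives a higher risk score than a randomly chosen non-readmitted one. A minimal sketch of that pairwise definition follows; the labels and scores are invented and do not come from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = readmitted) and risk scores from two models.
labels = [1, 0, 1, 0, 0, 1]
admission_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
discharge_scores = [0.95, 0.3, 0.75, 0.6, 0.2, 0.85]

auc_admission = auc(labels, admission_scores)
auc_discharge = auc(labels, discharge_scores)
print(auc_admission, auc_discharge)
```

The pairwise form is quadratic in the number of cases, so a rank-based implementation would be preferable on a full discharge database.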


2021 ◽  
pp. 0309524X2110568
Author(s):  
Lian Lian ◽  
Kan He

The accuracy of wind power prediction directly affects the operating cost of the power grid and the balance between grid supply and demand. Improving the prediction accuracy of wind power is therefore very important. To this end, a prediction model based on wavelet denoising and a support vector machine optimized by an improved slime mold algorithm is proposed. The wavelet denoising algorithm is used to denoise the wind power data, and the support vector machine is then used as the prediction model. Because the prediction results of the support vector machine are strongly affected by model parameters, an improved slime mold optimization algorithm with a random inertia weight mechanism is used to determine the best penalty factor and kernel function parameters of the support vector machine model. The effectiveness of the proposed prediction model is verified using two groups of actually collected wind power data. Seven prediction models are selected for comparison. Comparisons of predicted and actual values, prediction errors and their histogram distributions, performance indicators, Pearson’s correlation coefficient, the DM test, and box-plot distributions show that the proposed prediction model has high prediction accuracy.


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 454 ◽  
Author(s):  
Jung-Hyok Kwon ◽  
Eui-Jik Kim

This paper presents a failure prediction model using iterative feature selection, which aims to accurately predict the failure occurrences in industrial Internet of Things (IIoT) environments. In general, vast amounts of data are collected from various sensors in an IIoT environment, and they are analyzed to prevent failures by predicting their occurrence. However, the collected data may include data irrelevant to failures and thereby decrease the prediction accuracy. To address this problem, we propose a failure prediction model using iterative feature selection. To build the model, the relevancy between each feature (i.e., each sensor) and the failure was analyzed using the random forest algorithm, to obtain the importance of the features. Then, feature selection and model building were conducted iteratively. In each iteration, a new feature was selected considering the importance and added to the selected feature set. The failure prediction model was built for each iteration via the support vector machine (SVM). Finally, the failure prediction model having the highest prediction accuracy was selected. The experimental implementation was conducted using open-source R. The results showed that the proposed failure prediction model achieved high prediction accuracy.
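The iterative procedure described above (rank features by importance, add them one at a time, keep the best-scoring set) can be sketched independently of any particular learner. In the sketch below the importance values and the accuracy lookup are made-up stand-ins for the random forest importances and the SVM accuracy used in the paper, and the sensor names are hypothetical.

```python
def iterative_feature_selection(importance, evaluate):
    """Add features in descending importance and keep the best-scoring prefix.

    importance: dict mapping feature name to importance score.
    evaluate: callable taking a list of feature names and returning accuracy.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    selected, best_set, best_score = [], [], float("-inf")
    for feature in ranked:
        selected.append(feature)
        score = evaluate(selected)
        if score > best_score:
            best_set, best_score = list(selected), score
    return best_set, best_score


# Hypothetical sensor importances (stand-in for random forest output).
toy_importance = {"vibration": 0.5, "temperature": 0.3, "humidity": 0.15, "door_sensor": 0.05}

# Stand-in for "train an SVM on these features and return its accuracy".
toy_accuracy = {1: 0.80, 2: 0.91, 3: 0.89, 4: 0.86}


def evaluate(features):
    return toy_accuracy[len(features)]


best_set, best_score = iterative_feature_selection(toy_importance, evaluate)
print(best_set, best_score)
```

In this toy run the accuracy peaks after two features and then falls, which mirrors the abstract's point that irrelevant sensor data can lower prediction accuracy.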


2013 ◽  
Vol 300-301 ◽  
pp. 189-194 ◽  
Author(s):  
Yu Sun ◽  
Ling Ling Li ◽  
Xiao Song Huang ◽  
Chao Ying Duan

To avoid the impact of random wind speed fluctuations on grid-connected wind power generation systems, accurate prediction of short-term wind speed is needed. This paper designs a combined prediction model based on wavelet transformation and support vector machine (SVM) theory. The model’s prediction accuracy is improved by using the wavelet transform to capture the variation characteristics of wind speed sequences at different scales and by optimizing the SVM parameters with an improved particle swarm algorithm. In the experiment, the model showed strong generalization ability and high prediction accuracy. The lowest root-mean-square error over 200 samples was 0.0932, and the model performed much better than the BP neural network prediction model. It provides an effective method for predicting wind speed.
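Parameter tuning by particle swarm can be sketched with the standard global-best algorithm. This is the plain textbook variant, not the improved version the paper refers to, and the objective is a made-up quadratic standing in for an SVM validation error over two parameters; the inertia and acceleration constants are common defaults chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best particle swarm optimization over box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    # Hypothetical error surface with its minimum at (3.0, 0.5).
    penalty, kernel_width = params
    return (penalty - 3.0) ** 2 + (kernel_width - 0.5) ** 2


best_params, best_error = pso_minimize(toy_validation_error, [(0.0, 10.0), (0.0, 2.0)])
print(best_params, best_error)
```

In a real setting each objective call would train and validate an SVM, so the particle count and iteration budget would be set by how expensive that evaluation is.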


2016 ◽  
Vol 17 (1) ◽  
pp. 52-60 ◽  
Author(s):  
Yihui Fang ◽  
Xingwei Chen ◽  
Nian-Sheng Cheng

Estuary salinity predictions can help to improve water safety in coastal areas. Coupled genetic algorithm-support vector machine (GA-SVM) models, which adopt a GA to optimize the SVM parameters, have been successfully applied in some research fields. In light of previous research findings, an application of a GA-SVM model for tidal estuary salinity prediction is proposed in this paper. The corresponding model is developed to predict the salinity of the Min River Estuary (MRE). By conducting an analysis of the time series of daily salinity and the results of simulation experiments, the high-tide level, runoff and previous salinity are considered as the major factors that influence salinity variation. The prediction accuracy of the GA-SVM model is satisfactory, with a coefficient of determination (R²) of 0.85, a Nash–Sutcliffe efficiency of 0.84 and a root mean square error of 119 μS/cm. The proposed model performs significantly better than the traditional SVM model in terms of prediction accuracy and computing time. It can be concluded that the proposed model can successfully predict the salinity of the MRE based on the high-tide level, runoff and previous salinity.

