A Data-Driven Two-Stage Prediction Model for Train Primary-Delay Recovery Time

Accurate prediction of train delay recovery is critical for railway incident management and for providing passengers with accurate journey times. In this paper, a two-stage prediction model is proposed to predict the recovery time of train primary delays based on real records from a High-Speed Railway (HSR). In Stage 1, two models are built to study the influence of the feature space and the model framework on the prediction accuracy of the buffer time in each section or station. It is found that explicitly inputting the attribute features of stations and sections to the model, instead of simulating them implicitly, improves the prediction accuracy effectively. For validation, the proposed model is compared with several alternative models, namely Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Gradient Boosting Tree (GBT). The results show that it outperforms these schemes. Specifically, when the error tolerance is extended to 3 min, the proposed model achieves an accuracy of up to 94.63%. This indicates that the method has high value in practical engineering applications. Considering that the delay propagation of trains is a complex process, future study will focus on building a delay propagation knowledge base and a dispatcher experience knowledge base.
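The accuracy figure quoted at a 3 min error can be read as the share of predictions whose absolute error stays within a tolerance. The following is a minimal sketch under that assumption; the function name and the sample values are hypothetical and are not taken from the paper.

```python
def accuracy_within_tolerance(actual, predicted, tolerance_min):
    """Share of predictions whose absolute error is at most tolerance_min minutes."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance_min)
    return hits / len(actual)


# Hypothetical recovery times in minutes (illustrative only).
actual_recovery = [4.0, 10.0, 7.0, 2.0, 15.0]
predicted_recovery = [5.0, 8.0, 7.5, 2.0, 19.0]

strict = accuracy_within_tolerance(actual_recovery, predicted_recovery, 1.0)
relaxed = accuracy_within_tolerance(actual_recovery, predicted_recovery, 3.0)
print(strict, relaxed)
```

Widening the tolerance can only keep or raise this score, which is why such figures are normally reported together with the tolerance used.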

Using the gradient boosting decision tree (GBDT) algorithm for a train delay prediction model considering the delay propagation feature

Advances in Production Engineering & Management, 2021, Vol 16 (3), pp. 285-296. DOI: 10.14743/apem2021.3.400

Author(s): Y.D. Zhang, L. Liao, Q. Yu, W.G. Ma, K.H. Li

Keyword(s): Decision Tree, Prediction Model, Classification Scheme, Gradient Boosting, Support Vector, K Nearest Neighbor, Delay Propagation, Train Operation, Delay Prediction, Propagation Feature

Accurate prediction of train delay is an important basis for the intelligent adjustment of train operation plans. This paper proposes a train delay prediction model that considers the delay propagation feature. The model consists of two parts. The first part is the extraction of the delay propagation feature. The best delay classification scheme is determined by clustering the delay types in historical data with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, and this scheme is combined with the k-nearest neighbor (KNN) algorithm to design a delay-type classification method for online data. The delay propagation factor is used to quantify the delay propagation relationship, and on this basis the horizontal and vertical delay propagation features are constructed. The second part is the delay prediction, which takes the train operation status feature and the delay propagation feature as input features and uses the gradient boosting decision tree (GBDT) algorithm to complete the prediction. The model was tested and simulated using actual train operation data and compared with random forest (RF), support vector regression (SVR) and multilayer perceptron (MLP). The results show that considering the delay propagation feature in the train delay prediction model can further improve the accuracy of train delay prediction. The proposed delay prediction model can provide a theoretical basis for intelligent railway dispatching, enabling dispatchers to control delays more reasonably and improving the quality of railway transportation services.
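The online classification step described above, in which a new delay record is assigned to one of the delay types found by clustering historical data, can be sketched with a plain k-nearest-neighbor vote. This is only an illustration: the two features (delay in minutes, headway to the following train in minutes), the type labels "A" and "B", and all sample values are hypothetical stand-ins for the clusters that DBSCAN would produce.

```python
import math
from collections import Counter


def knn_delay_type(history, query, k=3):
    """Assign a delay type to `query` by majority vote of its k nearest historical records."""
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical records: ((delay_min, headway_min), delay type from clustering).
history = [
    ((2.0, 5.0), "A"),
    ((3.0, 6.0), "A"),
    ((2.5, 4.0), "A"),
    ((20.0, 3.0), "B"),
    ((25.0, 4.0), "B"),
    ((22.0, 2.0), "B"),
]

print(knn_delay_type(history, (4.0, 5.0)))
print(knn_delay_type(history, (21.0, 3.0)))
```

In practice the features would need scaling before distances are compared, since unscaled Euclidean distance lets the feature with the largest range dominate the vote.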

An Improved Multivariate Weather Prediction Model Using Deep Neural Networks and Particle Swarm Optimisation

Journal of Information & Knowledge Management, 2021, pp. 2150029. DOI: 10.1142/s0219649221500295

Author(s): K U Jaseena, Binsu C Kovoor

Keyword(s): Prediction Model, Prediction Accuracy, Weather Prediction, Particle Swarm, Particle Swarm Optimisation, Dew Point, Percentage Error, Support Vector, Efficiency Coefficient, Proposed Model

Accurate weather prediction is always a challenge for meteorologists. This paper proposes a Deep Neural Network (DNN) model to predict minimum and maximum temperature values from weather parameters such as humidity, dew point, and wind speed. The Particle Swarm Optimisation (PSO) algorithm is applied to select the relevant and important features of the datasets and so improve the prediction accuracy of the model. The grid search algorithm is employed to determine the hyperparameters of the proposed DNN model. The statistical indicators Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, Nash–Sutcliffe model efficiency coefficient, and Correlation Coefficient are used to evaluate the accuracy of the prediction model. The proposed model is compared with the Support Vector Machine (SVM) and Vector Autoregression (VAR) models. The experimental outcomes show that the proposed model optimised using PSO achieves better prediction accuracy than these traditional approaches.
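The five indicators named above have standard textbook definitions, which the sketch below implements in plain Python. The function names and the sample temperatures are assumptions chosen for illustration; the paper's own data and exact formula variants are not given in the abstract.

```python
import math


def mse(obs, pred):
    """Mean Square Error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean Absolute Percentage Error in percent; observations must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def pearson_r(obs, pred):
    """Pearson correlation coefficient; both series must have non-zero variance."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov / math.sqrt(var_obs * var_pred)


# Hypothetical observed and predicted temperatures (illustrative only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

for name, fn in [("MSE", mse), ("MAE", mae), ("MAPE", mape), ("NSE", nse), ("r", pearson_r)]:
    print(name, round(fn(observed, predicted), 4))
```

Reporting several indicators together is useful because they disagree in informative ways: a high correlation coefficient can coexist with a poor Nash–Sutcliffe efficiency when predictions are systematically biased.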

A novel two-stage hybrid default prediction model with k-means clustering and support vector domain description

Research in International Business and Finance, 2022, Vol 59, pp. 101536. DOI: 10.1016/j.ribaf.2021.101536

Author(s): Kunpeng Yuan, Guotai Chi, Ying Zhou, Hailei Yin

Keyword(s): Prediction Model, Support Vector, Two Stage, Default Prediction, Domain Description

Predicting Corporate Financial Sustainability Using Novel Business Analytics

Sustainability, 2018, Vol 11 (1), pp. 64. DOI: 10.3390/su11010064. Cited By ~ 5

Author(s): Kyoung-jae Kim, Kichun Lee, Hyunchul Ahn

Keyword(s): Financial Distress, Prediction Accuracy, Prediction Models, Support Vector, Model Parameters, Financial Sustainability, Business Analytics, Financial Distress Prediction, Proposed Model, Distress Prediction

Measuring and managing the financial sustainability of borrowers is crucial to the risk management of financial institutions. As a result, building an effective corporate financial distress prediction model has long been an important research topic. Recently, researchers have been working to improve the accuracy of financial distress prediction models by applying various business analytics approaches, including statistical and artificial intelligence methods. Among them, support vector machines (SVMs) are becoming popular. SVMs require only small training samples and have little possibility of overfitting if model parameters are properly tuned. Moreover, SVMs generally show high prediction accuracy since they can deal with complex nonlinear patterns. Despite these advantages, SVMs are often criticized because their architectural factors, such as the parameters of the kernel function and the subsets of appropriate features and instances, are determined by heuristics. In this study, we propose globally optimized SVMs, denoted by GOSVM, a novel hybrid SVM model designed to optimize feature selection, instance selection, and kernel parameters together. This study introduces a genetic algorithm (GA) to simultaneously optimize multiple heterogeneous design factors of SVMs. The proposed model is applied to a real-world case of financial distress prediction. Experiments show that the proposed model significantly improves on the prediction accuracy of conventional SVMs.

Research on Combined Prediction Model for Busy Telephone Traffic

Applied Mechanics and Materials, 2014, Vol 610, pp. 789-796. DOI: 10.4028/www.scientific.net/amm.610.789

Author(s): Jiang Bao Li, Zhen Hong Jia, Xi Zhong Qin, Lei Sheng, Li Chen

Keyword(s): Prediction Model, Prediction Accuracy, Low Frequency, Prediction Method, PSO Algorithm, Frequency Component, Least Square, High Frequency Component, Support Vector, Telephone Traffic

To improve the prediction accuracy of busy telephone traffic, this study proposes a prediction method that combines wavelet transformation with a least squares support vector machine (LSSVM) model optimized by the particle swarm optimization (PSO) algorithm. First, the preprocessed busy telephone traffic data are decomposed with the Mallat algorithm into a low-frequency component and a high-frequency component. Second, each component is reconstructed and predicted with the PSO-LSSVM model. The busy telephone traffic prediction is then obtained from these component predictions. The experimental results show that the prediction model has higher prediction accuracy and stability.
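The decomposition and reconstruction steps can be illustrated with one level of the Mallat scheme. The abstract does not name the wavelet, so the sketch below assumes the Haar wavelet for simplicity; the traffic values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal):
    """One-level Haar analysis: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_reconstruct(approx, detail):
    """One-level Haar synthesis: inverse of haar_decompose."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


# Hypothetical busy-hour traffic series (illustrative only).
traffic = [120.0, 130.0, 170.0, 150.0, 90.0, 110.0, 200.0, 180.0]

approx, detail = haar_decompose(traffic)
zeros = [0.0] * len(approx)

# Reconstruct each component on its own so both have the original length.
low_freq = haar_reconstruct(approx, zeros)
high_freq = haar_reconstruct(zeros, detail)

# The two reconstructed components add back up to the original series.
recombined = [lo + hi for lo, hi in zip(low_freq, high_freq)]
print(low_freq)
print(high_freq)
```

Because the reconstructed components sum to the original series, a separate forecast for each component can be added to give a forecast for the traffic itself, which is one plausible reading of how the component predictions are combined.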

Identifying Children at Readmission Risk: At-Admission Versus Traditional At-Discharge Readmission Prediction Model

Healthcare, 2021, Vol 9 (10), pp. 1334. DOI: 10.3390/healthcare9101334

Author(s): Hasan Symum, José Zayas-Castro

Keyword(s): Prediction Model, Information Exchange, High Risk Patient, Machine Learning Algorithms, Polynomial Kernel, Gradient Boosting, Support Vector, Hospital Discharges, Discharge Model, Readmission Risk

The timing of 30-day pediatric readmissions is skewed, with approximately 40% of the incidents occurring within the first week after hospital discharge. This skewed readmission time distribution, coupled with delays in health information exchange among healthcare providers, may leave limited time to devise a comprehensive intervention plan. However, pediatric readmission studies have so far been limited to prediction models developed after hospital discharge. In this study, we proposed a novel pediatric readmission prediction model at the time of hospital admission, which can improve the high-risk patient selection process. We also compared the proposed models with the standard at-discharge readmission prediction model. Using the Hospital Cost and Utilization Project database, this prognostic study included pediatric hospital discharges in Florida from January 2016 through September 2017. Four machine learning algorithms (logistic regression with backward stepwise selection, decision tree, support vector machine (SVM) with the polynomial kernel, and gradient boosting) were developed for the at-admission and at-discharge models using a recursive feature elimination technique with a repeated cross-validation process. The performance of the at-admission and at-discharge models was measured by the area under the curve. The performance of the at-admission model was comparable with that of the at-discharge model for all four algorithms. SVM with the polynomial kernel outperformed all other algorithms for both the at-admission and at-discharge models. Important features associated with increased readmission risk varied widely across the type of prediction model and were mostly related to patients’ demographics, social determinants, clinical factors, and hospital characteristics. The proposed at-admission readmission risk decision support model could give hospitals and providers additional time for intervention planning, particularly for interventions targeting social determinants of children’s overall health.
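The area under the curve used to compare the models can be computed directly from its rank interpretation: the probability that a randomly chosen readmitted patient receives a higher risk score than a randomly chosen non-readmitted patient. The sketch below assumes the curve is the ROC curve, which the abstract does not state explicitly; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical readmission labels (1 = readmitted) and model risk scores.
readmitted = [1, 0, 1, 0, 0, 1]
risk_scores = [0.8, 0.3, 0.6, 0.7, 0.2, 0.9]

print(roc_auc(readmitted, risk_scores))
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; for large cohorts a sort-based computation gives the same value more efficiently.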

Wind power prediction based on wavelet denoising and improved slime mold algorithm optimized support vector machine

Wind Engineering, 2021, pp. 0309524X2110568. DOI: 10.1177/0309524x211056822

Author(s): Lian Lian, Kan He

Keyword(s): Support Vector Machine, Prediction Model, Wind Power, Prediction Accuracy, Power Grid, Wavelet Denoising, Slime Mold, Support Vector, Power Prediction, Wind Power Prediction

The accuracy of wind power prediction directly affects the operating cost of the power grid and its supply and demand balance, so improving this accuracy is very important. To this end, a prediction model based on wavelet denoising and a support vector machine optimized by an improved slime mold algorithm is proposed. The wavelet denoising algorithm is used to denoise the wind power data, and the support vector machine is then used as the prediction model. Because the prediction results of the support vector machine are greatly affected by model parameters, an improved slime mold optimization algorithm with a random inertia weight mechanism is used to determine the best penalty factor and kernel function parameters of the support vector machine model. The effectiveness of the proposed prediction model is verified using two groups of actually collected wind power data. Seven prediction models are selected for comparison. Comparisons of the predicted and actual values, the prediction error and its histogram distribution, the performance indicators, Pearson’s correlation coefficient, the DM test, and the box-plot distribution show that the proposed prediction model has high prediction accuracy.

Failure Prediction Model Using Iterative Feature Selection for Industrial Internet of Things

Symmetry, 2020, Vol 12 (3), pp. 454. DOI: 10.3390/sym12030454. Cited By ~ 1

Author(s): Jung-Hyok Kwon, Eui-Jik Kim

Keyword(s): Feature Selection, Internet Of Things, Prediction Model, Prediction Accuracy, Model Building, Failure Prediction, Support Vector, Industrial Internet Of Things, Industrial Internet, New Feature

This paper presents a failure prediction model using iterative feature selection, which aims to accurately predict the failure occurrences in industrial Internet of Things (IIoT) environments. In general, vast amounts of data are collected from various sensors in an IIoT environment, and they are analyzed to prevent failures by predicting their occurrence. However, the collected data may include data irrelevant to failures and thereby decrease the prediction accuracy. To address this problem, we propose a failure prediction model using iterative feature selection. To build the model, the relevancy between each feature (i.e., each sensor) and the failure was analyzed using the random forest algorithm, to obtain the importance of the features. Then, feature selection and model building were conducted iteratively. In each iteration, a new feature was selected considering the importance and added to the selected feature set. The failure prediction model was built for each iteration via the support vector machine (SVM). Finally, the failure prediction model having the highest prediction accuracy was selected. The experimental implementation was conducted using open-source R. The results showed that the proposed failure prediction model achieved high prediction accuracy.
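The iterative procedure described above (rank features by importance, add one feature per iteration, build a model each time, keep the most accurate one) can be sketched as a short loop. The paper's implementation used R with random forest importances and an SVM; the Python sketch below replaces both with hypothetical stand-ins, namely a fixed importance table and a lookup function that plays the role of training and scoring a model.

```python
def iterative_feature_selection(importance, evaluate):
    """Add features in descending importance and keep the best-scoring feature set.

    importance: dict mapping feature name to importance score.
    evaluate:   callable taking a list of feature names and returning an accuracy.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    selected = []
    best_set, best_score = [], float("-inf")
    for feature in ranked:
        selected.append(feature)
        score = evaluate(selected)  # stands in for building and validating a model
        if score > best_score:
            best_set, best_score = list(selected), score
    return best_set, best_score


# Hypothetical sensor importances (in the paper these come from a random forest).
sensor_importance = {
    "vibration": 0.42,
    "temperature": 0.31,
    "humidity": 0.05,
    "pressure": 0.22,
}

# Hypothetical accuracies by number of selected features (stand-in for SVM training).
accuracy_by_size = {1: 0.81, 2: 0.88, 3: 0.86, 4: 0.85}


def toy_evaluate(features):
    return accuracy_by_size[len(features)]


best_features, best_accuracy = iterative_feature_selection(sensor_importance, toy_evaluate)
print(best_features, best_accuracy)
```

The loop evaluates every prefix of the ranked list rather than stopping at the first drop in accuracy, which matches the description of selecting the most accurate model once all iterations are finished.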

Short-Term Wind Speed Forecasting Based on Optimizated Support Vector Machine

Applied Mechanics and Materials, 2013, Vol 300-301, pp. 189-194. DOI: 10.4028/www.scientific.net/amm.300-301.189. Cited By ~ 1

Author(s): Yu Sun, Ling Ling Li, Xiao Song Huang, Chao Ying Duan

Keyword(s): Support Vector Machine, Wind Speed, Prediction Model, Prediction Accuracy, Support Vector, Particle Swarm Algorithm, Short Term, Neural Network Prediction, Random Fluctuations, The Impact

To limit the impact of random wind speed fluctuations on grid-connected wind power generation systems, accurate prediction of short-term wind speed is needed. This paper designs a combined prediction model based on wavelet transformation and the support vector machine (SVM). The model's prediction accuracy is improved by using the wavelet transform to capture the variation characteristics of wind speed sequences at different scales and by optimizing the SVM parameters with an improved particle swarm algorithm. In the experiment, the model showed strong generalization ability and high prediction accuracy. The lowest root-mean-square error over 200 samples was 0.0932, and the model performed much better than the BP neural network prediction model. It provides an effective method for predicting wind speed.
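Parameter tuning by particle swarm can be sketched with the basic algorithm. Note the substitutions: this is plain particle swarm optimization, not the improved variant used in the paper, and the objective is a hypothetical quadratic standing in for an SVM validation error over two log-scaled parameters. The inertia and acceleration constants are common textbook choices, not values from the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=50, seed=0):
    """Basic particle swarm optimization over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia weight, cognitive and social coefficients
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                lo, hi = bounds[d]
                # Clamp the new position to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at log10(C) = 1, log10(gamma) = -2."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


search_bounds = [(-3.0, 3.0), (-4.0, 2.0)]
best_params, best_error = pso_minimize(toy_validation_error, search_bounds)
print(best_params, best_error)
```

In a real tuning run the objective would train an SVM with the candidate parameters and return a cross-validated error, which makes each evaluation far more expensive than in this toy setting.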

Estuary salinity prediction using a coupled GA-SVM model: a case study of the Min River Estuary, China

Water Science & Technology Water Supply, 2016, Vol 17 (1), pp. 52-60. DOI: 10.2166/ws.2016.097. Cited By ~ 1

Author(s): Yihui Fang, Xingwei Chen, Nian-Sheng Cheng

Keyword(s): Prediction Accuracy, Computing Time, High Tide, River Estuary, Support Vector, Min River, Proposed Model, SVM Model, Tide Level, Min River Estuary

Estuary salinity predictions can help to improve water safety in coastal areas. Coupled genetic algorithm-support vector machine (GA-SVM) models, which adopt a GA to optimize the SVM parameters, have been successfully applied in some research fields. In light of previous research findings, an application of a GA-SVM model to tidal estuary salinity prediction is proposed in this paper. The corresponding model is developed to predict the salinity of the Min River Estuary (MRE). Based on an analysis of the daily salinity time series and the results of simulation experiments, the high-tide level, runoff and previous salinity are considered the major factors that influence salinity variation. The prediction accuracy of the GA-SVM model is satisfactory, with a coefficient of determination (R²) of 0.85, a Nash–Sutcliffe efficiency of 0.84 and a root mean square error of 119 μS/cm. The proposed model performs significantly better than the traditional SVM model in terms of prediction accuracy and computing time. It can be concluded that the proposed model can successfully predict the salinity of the MRE based on the high-tide level, runoff and previous salinity.
