Enhancing Crash Injury Severity Prediction on Imbalanced Crash Data by Sampling Technique with Variable Selection

Author(s):  
Mahama Yahaya ◽  
Xinguo Jiang ◽  
Chuanyun Fu ◽  
Kamal Bashir ◽  
Wenbo Fan


Author(s):  
Mohammad Razaur Rahman Shaon ◽  
Xiao Qin

Unsafe driving behaviors, driver limitations, and conditions that lead to a crash are usually referred to as driver errors. Although driver errors are widely cited in crash reports and the safety literature as a critical reason for crash occurrence, discussion of their consequences is limited. This study aims to quantify the effect of driver errors on crash injury severity. To assist this investigation, driver errors were categorized as sequential events in a driving task. Possible combinations of driver error categories were created and ranked based on statistical dependences between error combinations and injury severity levels. Binary logit models were then developed to show that typical variables used to model injury severity, such as driver characteristics, roadway characteristics, environmental factors, and crash characteristics, are inadequate to explain driver errors, especially the complicated ones. Next, ordinal probit models were applied to quantify the effect of driver errors on injury severity for rural crashes. Model performance was superior when driver error combinations were modeled along with typical crash variables to predict the injury outcome. Modeling results also illustrate that more severe crashes tend to occur when the driver makes multiple mistakes. Therefore, incorporating driver errors in crash injury severity prediction not only improves prediction accuracy but also enhances our understanding of which error(s) may lead to more severe injuries, so that safety interventions can be recommended accordingly.
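The ordinal probit step can be illustrated with a minimal sketch. It assumes the standard ordered probit formulation, in which a latent severity propensity is cut by increasing thresholds. The function names, the linear predictor values, and the thresholds are hypothetical and are not taken from the study.

```python
import math


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probabilities(linear_predictor, thresholds):
    """Return P(severity = k) for k = 0 .. K-1.

    linear_predictor: x'beta for one crash (hypothetical value).
    thresholds: K-1 increasing cut points on the latent scale (hypothetical).
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # Cumulative probabilities P(severity <= k), padded with 0 and 1.
    cumulative = [0.0]
    cumulative += [standard_normal_cdf(t - linear_predictor) for t in thresholds]
    cumulative.append(1.0)
    # Category probabilities are differences of adjacent cumulative values.
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]


# Illustrative call: three severity levels, two cut points.
probabilities = ordered_probit_probabilities(0.3, [-0.5, 1.0])
```

In this formulation a larger linear predictor shifts probability mass toward the more severe categories, which is how an added driver-error term would raise the predicted severity.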


Electronics ◽  
2018 ◽  
Vol 7 (12) ◽  
pp. 381 ◽  
Author(s):  
Yaping Liao ◽  
Junyou Zhang ◽  
Shufeng Wang ◽  
Sixian Li ◽  
Jian Han

Motor vehicle crashes remain a leading cause of loss of life and property. Autonomous vehicles can mitigate these losses by making appropriate emergency decisions, and a crash injury severity prediction model is the basis for such decisions. In this paper, three crash injury severity prediction models based on the support vector machine (SVM) and NASS/GES crash data are established: B-SVM, T-SVM, and BT-SVM, corresponding to braking, turning, and braking + turning, respectively. The vehicle relative speed (REL_SPEED) and the gross vehicle weight rating (GVWR) are introduced as impact indicators in the prediction models. Next, ordered logit (OL) and back propagation neural network (BPNN) models are established to validate the accuracy of the SVM models. The results show that the SVM models outperform the other two. The impact of REL_SPEED and GVWR on injury severity is then analyzed quantitatively through sensitivity analysis; the results demonstrate that increases in REL_SPEED and GVWR make crashes more severe. Finally, the same crash samples under normal road and environmental conditions are input into B-SVM, T-SVM, and BT-SVM, and the outputs are compared. The results show that, with other conditions being the same, as REL_SPEED increases from the low (0–20 mph) to the middle (20–45 mph) and then to the high range (45–75 mph), the best emergency decision, the one with the minimum crash injury severity, gradually transitions from braking to turning and then to braking + turning.
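The final comparison step, choosing the action whose model predicts the lowest severity, can be sketched as follows. This is only an illustration: the three predictors are placeholder functions standing in for trained B-SVM, T-SVM, and BT-SVM models, the feature dictionary is hypothetical, and the sketch assumes severity is coded so that a smaller number means a less severe outcome.

```python
def best_emergency_decision(crash_features, severity_predictors):
    """Pick the action with the lowest predicted injury severity.

    crash_features: any object the predictors accept (hypothetical dict here).
    severity_predictors: mapping from action name to a callable that returns
        a numeric severity level (assumed: lower means less severe).
    """
    predicted = {
        action: predict(crash_features)
        for action, predict in severity_predictors.items()
    }
    best_action = min(predicted, key=predicted.get)
    return best_action, predicted


# Placeholder predictors with made-up constant outputs; they are NOT
# trained models and do not reproduce any result from the paper.
placeholder_predictors = {
    "braking": lambda features: 2,
    "turning": lambda features: 1,
    "braking + turning": lambda features: 3,
}

action, severities = best_emergency_decision(
    {"REL_SPEED": 30.0, "GVWR": 1}, placeholder_predictors
)
```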


Author(s):  
Khaled Assi ◽  
Syed Masiur Rahman ◽  
Umer Mansoor ◽  
Nedal Ratrout

Predicting crash injury severity is a crucial part of reducing the consequences of traffic crashes. This study developed machine learning (ML) models to predict crash injury severity using 15 crash-related parameters. Crashes were clustered using fuzzy c-means (FCM), and a separate ML model was trained for each cluster, which enhanced predictive capability. In total, four ML models were developed: feed-forward neural networks (FNN), support vector machine (SVM), fuzzy c-means clustering based feed-forward neural network (FNN-FCM), and fuzzy c-means based support vector machine (SVM-FCM). Features that can be easily identified with little investigation at the crash site were used as inputs so that the trauma center can predict the crash severity level from the initial information provided from the site and prepare accordingly for the treatment of the victims. The input parameters mainly include vehicle attributes and road condition attributes. This study used the crash database of Great Britain for the years 2011–2016. A random sample of crashes representing each year was used, with equal shares of severe and non-severe crashes. The models were compared based on injury severity prediction accuracy, sensitivity, precision, and the harmonic mean of sensitivity and precision (i.e., F1 score). The SVM-FCM model outperformed the other developed models in terms of accuracy and F1 score in predicting the injury severity level of severe and non-severe crashes. This study concluded that the FCM clustering algorithm enhanced the prediction power of the FNN and SVM models.
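The four comparison metrics named above follow directly from a binary confusion matrix. The sketch below assumes "severe" is the positive class; the function name and the counts in the example call are illustrative only and are not results from the study.

```python
def binary_classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, precision, and F1 from confusion-matrix counts.

    tp / fn: severe crashes predicted severe / non-severe.
    fp / tn: non-severe crashes predicted severe / non-severe.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denominator = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only.
metrics = binary_classification_metrics(tp=40, fp=10, fn=20, tn=30)
```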


Author(s):  
Shengxue Zhu ◽  
Ke Wang ◽  
Chongyi Li

In many related works, nominal classification algorithms ignore the order between injury severity levels and make sub-optimal predictions. Existing ordinal classification methods suffer from rank inconsistency and rank non-monotonicity. The aim of this paper is to propose an ordinal classification approach to predict traffic crash injury severity and to test its performance against existing machine learning classification methods. First, we compare the performance of the neural network, XGBoost, and SVM classifiers in injury severity prediction. Second, we utilize a severity category-combination method with oversampling to relieve the class-imbalance problem prevalent in crash data. Third, we take advantage of probability calibration and optimal probability threshold moving to improve the prediction ability of ordinal classification. The proposed approach satisfies the rank consistency and rank monotonicity requirements and is shown by statistical significance tests to be superior to other ordinal classification methods and to nominal machine learning classifiers. Important factors relating to injury severity are selected based on their permutation feature importance scores. We find that converting severity levels into three classes (minor injury, moderate injury, and serious injury) can substantially improve the prediction precision.
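The oversampling idea can be sketched with plain random oversampling, which duplicates minority-class records until every class matches the largest one. This is a generic illustration, not necessarily the oversampling variant used in the paper; the function name, the seed, and the toy labels are assumptions.

```python
import random
from collections import Counter


def random_oversample(records, labels, seed=0):
    """Duplicate minority-class records at random until classes are balanced.

    Returns new lists; the original records stay at the front, unchanged.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_records = list(records)
    out_labels = list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(records, labels) if l == label]
        # Draw with replacement from this class until it reaches the target.
        for _ in range(target - count):
            out_records.append(rng.choice(pool))
            out_labels.append(label)
    return out_records, out_labels


# Toy example: 8 minor-injury crashes and 2 serious-injury crashes.
toy_records = list(range(10))
toy_labels = ["minor"] * 8 + ["serious"] * 2
balanced_records, balanced_labels = random_oversample(toy_records, toy_labels)
```

Oversampling is normally applied to the training split only, so that duplicated records do not leak into the data used for evaluation.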


Author(s):  
Arshad Jamal ◽  
Waleed Umer

A better understanding of the circumstances contributing to the severity outcome of traffic crashes is an important goal of road safety studies. An in-depth crash injury severity analysis is vital for the proactive implementation of appropriate mitigation strategies. This study proposes an improved feed-forward neural network (FFNN) model for predicting the injury severity associated with individual crashes using three years (2017–2019) of crash data collected along 15 rural highways in the Kingdom of Saudi Arabia (KSA). A total of 12,566 crashes were recorded during the study period, with a binary injury severity outcome (fatal or non-fatal injury) as the variable to be predicted. An FFNN architecture with back-propagation (BP) as the training algorithm, a logistic activation function, and six neurons in the hidden layer yielded the best model performance. Model predictions for the test data were analyzed using evaluation metrics such as overall accuracy, sensitivity, and specificity. The prediction results showed the adequacy and robust performance of the proposed method. A detailed sensitivity analysis of the optimized NN was also performed to show the impact and relative influence of different predictor variables on the resulting crash injury severity. The sensitivity analysis indicated that traffic volume, average travel speeds, weather conditions, on-site damage conditions, road and vehicle type, and involvement of pedestrians are the most sensitive variables. The methods applied in this study could be used in big data analysis of crash data, which can serve as a rapid and useful tool for policymakers to improve highway safety.
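The reported architecture, one hidden layer of six logistic neurons, can be sketched as a forward pass. Everything beyond those two facts is an assumption: the weights are random rather than trained, the output neuron is assumed to use a logistic activation as well, and the four input features are arbitrary placeholders. Back-propagation training is omitted.

```python
import math
import random


def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, n_hidden=6, seed=0):
    """Create small random weights for a one-hidden-layer network."""
    rng = random.Random(seed)
    return {
        "w_hidden": [
            [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
            for _ in range(n_hidden)
        ],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(network, features):
    """Return hidden activations and the output probability for one crash."""
    hidden = [
        logistic(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(network["w_hidden"], network["b_hidden"])
    ]
    output = logistic(
        sum(w * h for w, h in zip(network["w_out"], hidden)) + network["b_out"]
    )
    return hidden, output


# Untrained network on a placeholder feature vector.
network = init_network(n_inputs=4)
hidden_activations, fatal_probability = forward(network, [1.0, 0.5, -0.2, 0.0])
```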


Author(s):  
Yashu Kang ◽  
Aemal Khattak

The presence of unobserved heterogeneity in crash data can result in biased model parameter estimates and incorrect inferences. The research presented in this paper investigated the severity of crashes reported at highway–rail grade crossings by appropriately clustering the data to account for unobserved heterogeneity. A combination of data mining and statistical regression methods was used to cluster crash data into subsets and then to identify factors associated with crash injury severity levels. This research relied on highway–rail accident, incident, and crossing inventory databases for 2011 to 2015 obtained from the Federal Railroad Administration (FRA). Three clustering methods (K-means, traditional latent class cluster, and variational Bayesian latent class cluster) were considered, and the variational Bayesian latent class cluster method was chosen for partitioning the data set for model estimation. Unclustered data as well as the clustered subsets were used to estimate ordered logit models for crash injury severity. A comparison revealed that the cluster-based approach provided more relevant model parameters and identified factors relevant only to certain clusters of the data.
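The partition-then-model workflow starts with a clustering step. The sketch below shows plain K-means, the simplest of the three methods considered; it is not the variational Bayesian latent class method that the study selected. It assumes crash records have already been encoded as numeric vectors, and the initialization from the first k points and the toy data are illustrative choices.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Cluster points into k groups; returns (assignments, centroids)."""
    # Deterministic initialization from the first k points (an assumption;
    # random or k-means++ initialization is more common in practice).
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data with two well-separated groups.
toy_points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
cluster_ids, cluster_centers = kmeans(toy_points, k=2)
```

Each resulting subset could then be passed to a separate severity model, mirroring the comparison between unclustered and cluster-specific ordered logit models described above.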

