Study on Crash Injury Severity Prediction of Autonomous Vehicles for Different Emergency Decisions Based on Support Vector Machine Model

Motor vehicle crashes remain a leading cause of life and property loss to society. Autonomous vehicles can mitigate the losses by making appropriate emergency decision, and the crash injury severity prediction model is the basis for autonomous vehicles to make decisions in emergency situations. In this paper, based on the support vector machine (SVM) model and NASS/GES crash data, three SVM crash injury severity prediction models (B-SVM, T-SVM, and BT-SVM) corresponding to braking, turning, and braking + turning respectively are established. The vehicle relative speed (REL_SPEED) and the gross vehicle weight rating (GVWR) are introduced into the impact indicators of the prediction models. Secondly, the ordered logit (OL) and back propagation neural network (BPNN) models are established to validate the accuracy of the SVM models. The results show that the SVM models have the best performance than the other two. Next, the impact of REL_SPEED and GVWR on injury severity is analyzed quantitatively by the sensitivity analysis, the results demonstrate that the increase of REL_SPEED and GVWR will make vehicle crash more serious. Finally, the same crash samples under normal road and environmental conditions are input into B-SVM, T-SVM, and BT-SVM respectively, the output results are compared and analyzed. The results show that with other conditions being the same, as the REL_SPEED increased from the low (0–20 mph) to middle (20–45 mph) and then to the high range (45–75 mph), the best emergency decision with the minimum crash injury severity will gradually transition from braking to turning and then to braking + turning.

Download Full-text

Predicting Crash Injury Severity with Machine Learning Algorithm Synergized with Clustering Technique: A Promising Protocol

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17155497 ◽

2020 ◽

Vol 17 (15) ◽

pp. 5497

Author(s):

Khaled Assi ◽

Syed Masiur Rahman ◽

Umer Mansoor ◽

Nedal Ratrout

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Clustering Algorithm ◽

Injury Severity ◽

Support Vector ◽

Severity Level ◽

Feed Forward ◽

Fuzzy C Means ◽

Severity Prediction ◽

Crash Injury Severity

Predicting crash injury severity is a crucial constituent of reducing the consequences of traffic crashes. This study developed machine learning (ML) models to predict crash injury severity using 15 crash-related parameters. Separate ML models for each cluster were obtained using fuzzy c-means, which enhanced the predicting capability. Finally, four ML models were developed: feed-forward neural networks (FNN), support vector machine (SVM), fuzzy C-means clustering based feed-forward neural network (FNN-FCM), and fuzzy c-means based support vector machine (SVM-FCM). Features that were easily identified with little investigation on crash sites were used as an input so that the trauma center can predict the crash severity level based on the initial information provided from the crash site and prepare accordingly for the treatment of the victims. The input parameters mainly include vehicle attributes and road condition attributes. This study used the crash database of Great Britain for the years 2011–2016. A random sample of crashes representing each year was used considering the same share of severe and non-severe crashes. The models were compared based on injury severity prediction accuracy, sensitivity, precision, and harmonic mean of sensitivity and precision (i.e., F1 score). The SVM-FCM model outperformed the other developed models in terms of accuracy and F1 score in predicting the injury severity level of severe and non-severe crashes. This study concluded that the FCM clustering algorithm enhanced the prediction power of FNN and SVM models.

Download Full-text

Using support vector machine models for crash injury severity analysis

Accident Analysis & Prevention ◽

10.1016/j.aap.2011.08.016 ◽

2012 ◽

Vol 45 ◽

pp. 478-486 ◽

Cited By ~ 115

Author(s):

Zhibin Li ◽

Pan Liu ◽

Wei Wang ◽

Chengcheng Xu

Keyword(s):

Support Vector Machine ◽

Injury Severity ◽

Support Vector ◽

Crash Injury Severity

Download Full-text

Intuitionistic Fuzzy Laplacian Twin Support Vector Machine for Semi-supervised Classification

Journal of the Operations Research Society of China ◽

10.1007/s40305-021-00354-9 ◽

2021 ◽

Author(s):

Jia-Bin Zhou ◽

Yan-Qin Bai ◽

Yan-Ru Guo ◽

Hai-Xiang Lin

Keyword(s):

Support Vector Machine ◽

Negative Impact ◽

Twin Support Vector Machine ◽

Fuzzy Membership ◽

Support Vector ◽

Membership Functions ◽

Fuzzy Membership Functions ◽

Intuitionistic Fuzzy ◽

Benchmark Datasets ◽

The Impact

AbstractIn general, data contain noises which come from faulty instruments, flawed measurements or faulty communication. Learning with data in the context of classification or regression is inevitably affected by noises in the data. In order to remove or greatly reduce the impact of noises, we introduce the ideas of fuzzy membership functions and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case by kernel function. The proposed IFLap-TSVM resolves the negative impact of noises and outliers by using fuzzy membership functions and is a more accurate reasonable classifier by using the geometric distribution information of labeled data and unlabeled data based on manifold regularization. Experiments with constructed artificial datasets, several UCI benchmark datasets and MNIST dataset show that the IFLap-TSVM has better classification accuracy than other state-of-the-art twin support vector machine (TSVM), intuitionistic fuzzy twin support vector machine (IFTSVM) and Lap-TSVM.

Download Full-text

Classification and Regression Models for Genomic Selection of Skewed Phenotypes: A Case for Disease Resistance in Winter Wheat (Triticum aestivum L.)

10.1101/2021.12.16.472985 ◽

2021 ◽

Author(s):

Lance F Merrick ◽

Dennis N Lozada ◽

Xianming Chen ◽

Arron H Carter

Keyword(s):

Support Vector Machine ◽

Winter Wheat ◽

Genomic Selection ◽

Stripe Rust ◽

Regression Models ◽

Prediction Models ◽

Support Vector ◽

Classification Models ◽

Breeding Lines ◽

Classification And Regression

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. On this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.

Download Full-text

Algorithmic and data modeling: Will algorithmic modeling improve predictions of traits evaluated on ordinal scales?

10.1101/2020.10.07.329466 ◽

2020 ◽

Author(s):

Zhanyou Xu ◽

Andreomar Kurek ◽

Steven B. Cannon ◽

Williams D. Beavis

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Ridge Regression ◽

Genomic Prediction ◽

Ordinal Data ◽

Prediction Models ◽

Characteristic Curve ◽

Gradient Boosting ◽

Support Vector ◽

Data Types

AbstractSelection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of research reported in this manuscript were to evaluate the most commonly used genomic prediction method, ridge regression and it’s equivalent logistic ridge regression method, with algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network using the usual comparator metric of prediction accuracy. In addition we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.

Download Full-text

Understanding nitrogen transport in the unsaturated zone with fluctuations in groundwater depth

Water Science & Technology Water Supply ◽

10.2166/ws.2021.066 ◽

2021 ◽

Author(s):

Jianmin Bian ◽

Qian Wang ◽

Siyu Nie ◽

Hanli Wan ◽

Juanjuan Wu

Keyword(s):

Support Vector Machine ◽

Unsaturated Zone ◽

Prediction Models ◽

Groundwater Depth ◽

Support Vector ◽

Nitrogen Transport ◽

Irrigation Area ◽

Leaching Loss ◽

The Relationship ◽

Hydrus 1D

Abstract Fluctuations in groundwater depth play an important role and are often overlooked when considering the transport of nitrogen in the unsaturated zone. To evaluate directly the variation of nitrogen transport due to fluctuations in groundwater depth, the prediction model of groundwater depth and nitrogen transport were combined and applied by least squares support vector machine and Hydrus-1D in the western irrigation area of Jilin in China. The calibration and testing results showed the prediction models were reliable. Considering different groundwater depth, the concentration of nitrogen was affected significantly with a groundwater depth of 3.42–1.71 m, while it was not affected with groundwater depth of 5.48–6.47 m. The total leaching loss of nitrogen gradually increased with the continuous decrease of groundwater depth. Furthermore, the limited groundwater depth of 1.7 m was found to reduce the risk of nitrogen pollution. This paper systematically analyzes the relationship between groundwater depth and nitrogen transport to form appropriate agriculture strategies.

Download Full-text

Exploring the Mechanism of Crashes with Autonomous Vehicles Using Machine Learning

Mathematical Problems in Engineering ◽

10.1155/2021/5524356 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hengrui Chen ◽

Hong Chen ◽

Ruiyu Zhou ◽

Zhizhen Liu ◽

Xiaoke Sun

Keyword(s):

Machine Learning ◽

Autonomous Vehicles ◽

Classification And Regression Tree ◽

Gradient Boosting ◽

Support Vector ◽

Crash Severity ◽

Apriori Algorithm ◽

Driving Mode ◽

Extreme Gradient Boosting ◽

The Impact

The safety issue has become a critical obstacle that cannot be ignored in the marketization of autonomous vehicles (AVs). The objective of this study is to explore the mechanism of AV-involved crashes and analyze the impact of each feature on crash severity. We use the Apriori algorithm to explore the causal relationship between multiple factors to explore the mechanism of crashes. We use various machine learning models, including support vector machine (SVM), classification and regression tree (CART), and eXtreme Gradient Boosting (XGBoost), to analyze the crash severity. Besides, we apply the Shapley Additive Explanations (SHAP) to interpret the importance of each factor. The results indicate that XGBoost obtains the best result (recall = 75%; G-mean = 67.82%). Both XGBoost and Apriori algorithm effectively provided meaningful insights about AV-involved crash characteristics and their relationship. Among all these features, vehicle damage, weather conditions, accident location, and driving mode are the most critical features. We found that most rear-end crashes are conventional vehicles bumping into the rear of AVs. Drivers should be extremely cautious when driving in fog, snow, and insufficient light. Besides, drivers should be careful when driving near intersections, especially in the autonomous driving mode.

Download Full-text

Investigation of the Impact of Data Comparability on Performance of Support Vector Machine Models for Credit Scoring

Innovation and Supply Chain Management ◽

10.14327/iscm.9.31 ◽

2015 ◽

Vol 9 (1) ◽

pp. 31-38 ◽

Cited By ~ 3

Author(s):

Yanwen DONG ◽

Xiying HAO ◽

Hideo SATO

Keyword(s):

Support Vector Machine ◽

Credit Scoring ◽

Support Vector ◽

Data Comparability ◽

The Impact

Download Full-text

The Influence of Inhomogeneous Input Data from Different Waves on Predictive Model Development for COVID-19 ICU Patients (Preprint)

10.2196/preprints.31539 ◽

2021 ◽

Author(s):

Sebastian Johannes Fritsch ◽

Konstantin Sharafutdinov ◽

Moein Einollahzadeh Samadi ◽

Gernot Marx ◽

Andreas Schuppert ◽

...

Keyword(s):

Machine Learning ◽

Convex Hull ◽

Prediction Models ◽

Model Development ◽

Predictive Performance ◽

Support Vector ◽

Good Prediction ◽

The Impact ◽

Second Wave ◽

Over Time

BACKGROUND During the course of the COVID-19 pandemic, a variety of machine learning models were developed to predict different aspects of the disease, such as long-term causes, organ dysfunction or ICU mortality. The number of training datasets used has increased significantly over time. However, these data now come from different waves of the pandemic, not always addressing the same therapeutic approaches over time as well as changing outcomes between two waves. The impact of these changes on model development has not yet been studied. OBJECTIVE The aim of the investigation was to examine the predictive performance of several models trained with data from one wave predicting the second wave´s data and the impact of a pooling of these data sets. Finally, a method for comparison of different datasets for heterogeneity is introduced. METHODS We used two datasets from wave one and two to develop several predictive models for mortality of the patients. Four classification algorithms were used: logistic regression (LR), support vector machine (SVM), random forest classifier (RF) and AdaBoost classifier (ADA). We also performed a mutual prediction on the data of that wave which was not used for training. Then, we compared the performance of models when a pooled dataset from two waves was used. The populations from the different waves were checked for heterogeneity using a convex hull analysis. RESULTS 63 patients from wave one (03-06/2020) and 54 from wave two (08/2020-01/2021) were evaluated. For both waves separately, we found models reaching sufficient accuracies up to 0.79 AUROC (95%-CI 0.76-0.81) for SVM on the first wave and up 0.88 AUROC (95%-CI 0.86-0.89) for RF on the second wave. After the pooling of the data, the AUROC decreased relevantly. In the mutual prediction, models trained on second wave´s data showed, when applied on first wave´s data, a good prediction for non-survivors but an insufficient classification for survivors. The opposite situation (training: first wave, test: second wave) revealed the inverse behaviour with models correctly classifying survivors and incorrectly predicting non-survivors. The convex hull analysis for the first and second wave populations showed a more inhomogeneous distribution of underlying data when compared to randomly selected sets of patients of the same size. CONCLUSIONS Our work demonstrates that a larger dataset is not a universal solution to all machine learning problems in clinical settings. Rather, it shows that inhomogeneous data used to develop models can lead to serious problems. With the convex hull analysis, we offer a solution for this problem. The outcome of such an analysis can raise concerns if the pooling of different datasets would cause inhomogeneous patterns preventing a better predictive performance.

Download Full-text

Crash Data-Based Investigation into How Injury Severity Is Affected by Driver Errors

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198120916469 ◽

2020 ◽

Vol 2674 (5) ◽

pp. 452-464 ◽

Cited By ~ 1

Author(s):

Mohammad Razaur Rahman Shaon ◽

Xiao Qin

Keyword(s):

Injury Severity ◽

Model Performance ◽

Severe Injuries ◽

Crash Data ◽

Severity Prediction ◽

Safety Interventions ◽

Driving Task ◽

Driver Error ◽

Crash Characteristics ◽

Crash Injury Severity

Unsafe driving behaviors, driver limitations, and conditions that lead to a crash are usually referred to as driver errors. Even though driver errors are widely cited as a critical reason for crash occurrence in crash reports and safety literature, the discussion on their consequences is limited. This study aims to quantify the effect of driver errors on crash injury severity. To assist this investigation, driver errors were categorized as sequential events in a driving task. Possible combinations of driver error categories were created and ranked based on statistical dependences between error combinations and injury severity levels. Binary logit models were then developed to show that typical variables used to model injury severity such as driver characteristics, roadway characteristics, environmental factors, and crash characteristics are inadequate to explain driver errors, especially the complicated ones. Next, ordinal probit models were applied to quantify the effect of driver errors on injury severity for rural crashes. Superior model performance is observed when driver error combinations were modeled along with typical crash variables to predict the injury outcome. Modeling results also illustrate that more severe crashes tend to occur when the driver makes multiple mistakes. Therefore, incorporating driver errors in crash injury severity prediction not only improves prediction accuracy but also enhances our understanding of what error(s) may lead to more severe injuries so that safety interventions can be recommended accordingly.

Download Full-text