Dimensionality reduction by soft-margin support vector machine

Author(s):  
Ruipeng Dong ◽  
Hua Meng ◽  
Zhiguo Long ◽  
Hailiang Zhao
2022 ◽  
pp. 146808742110707
Author(s):  
Aran Mohammad ◽  
Reza Rezaei ◽  
Christopher Hayduk ◽  
Thaddaeus Delebinski ◽  
Saeid Shahpouri ◽  
...  

The development of internal combustion engines is affected by the exhaust gas emissions legislation and the striving to increase performance. This demands for engine-out emission models that can be used for engine optimization for real driving emission controls. The prediction capability of physically and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user and can lead to an improved accuracy with increasing number of inputs. Thereby the occurrence of irrelevant inputs becomes more probable, which have a low functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 stationary measured test bench data from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are led into a data-driven dimensionality reduction. For this purpose, approaches of supervised learning, such as lasso regression and linear support vector machine, and unsupervised learning methods like principal component analysis and factor analysis are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy as a result of the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling while maintaining the accuracy. The features selected using the lasso algorithm provide more accurate learning of the regression models than the extracted features through principal component analysis and factor analysis. This results in test errors RMSETe for modeling NOx, CO, HC, and soot emissions 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.


Author(s):  
Hong Peng ◽  
Meixiang Luo ◽  
Weihang Lv ◽  
Linkai Luo

2003 ◽  
Vol 55 (1-2) ◽  
pp. 321-336 ◽  
Author(s):  
L.J. Cao ◽  
K.S. Chua ◽  
W.K. Chong ◽  
H.P. Lee ◽  
Q.M. Gu

2018 ◽  
Vol 18 (1) ◽  
pp. 21-31 ◽  
Author(s):  
Lanlan Wu ◽  
Qiaohua Wang ◽  
Dengfei Jie ◽  
Shucai Wang ◽  
Zhihui Zhu ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4519
Author(s):  
Livia Petrescu ◽  
Cătălin Petrescu ◽  
Ana Oprea ◽  
Oana Mitruț ◽  
Gabriela Moise ◽  
...  

This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We performed a mapping between the discrete and dimensional emotional information considering the participants’ ratings and extracted a substantial set of 40 types of features from the physiological data, which represented the input to various machine learning algorithms—Decision Trees, k-Nearest Neighbors, Support Vector Machine and artificial networks—accompanied by dimensionality reduction, feature selection and the tuning of the most relevant hyperparameters, boosting classification accuracy. The methodology we approached included tackling different situations, such as resolving the problem of having an imbalanced dataset through data augmentation, reducing overfitting, computing various metrics in order to obtain the most reliable classification scores and applying the Local Interpretable Model-Agnostic Explanations method for interpretation and for explaining predictions in a human-understandable manner. The results show that fear can be predicted very well (accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and Support Vector Machine) by extracting the most relevant features from the physiological data and by searching for the best parameters which maximize the machine learning algorithms’ classification scores.


Sign in / Sign up

Export Citation Format

Share Document