Physical-oriented and machine learning-based emission modeling in a diesel compression ignition engine: Dimensionality reduction and regression

2022, pp. 146808742110707
Author(s): Aran Mohammad, Reza Rezaei, Christopher Hayduk, Thaddaeus Delebinski, Saeid Shahpouri, ...

The development of internal combustion engines is shaped by exhaust gas emissions legislation and the drive to increase performance. This calls for engine-out emission models that can be used to optimize engines for real driving emissions control. The prediction capability of physical and data-driven engine-out emission models depends on the system inputs, which are specified by the user; accuracy can improve as the number of inputs increases. At the same time, irrelevant inputs become more likely: they have only a weak functional relation to the emissions and can lead to overfitting. Data-driven methods can be used to detect such irrelevant and redundant inputs. In this work, thermodynamic states are modeled from 772 stationary test bench measurements of a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are passed to a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are then used for regression with the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions, which allows the effect of the dimensionality reduction on modeling accuracy to be evaluated. With the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining accuracy. The features selected by the lasso algorithm yield more accurate regression models than the features extracted through principal component analysis and factor analysis. The resulting test errors RMSETe for modeling NOx, CO, HC, and soot emissions are 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.
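
As a rough illustration of how lasso regression can discard irrelevant inputs, the sketch below fits a lasso model by cyclic coordinate descent and keeps the inputs whose coefficients are non-zero. It is not the authors' implementation: the abstract does not name a solver, and the toy data (five inputs, two of them relevant), the penalty `alpha=0.1`, and all function names are assumptions made for this example.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


def make_toy_data(n_samples=200, n_features=5, seed=0):
    """Hypothetical data: only inputs 0 and 2 influence the target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
weights = lasso_coordinate_descent(X, y, alpha=0.1)
# Feature selection: keep the inputs with non-zero lasso coefficients.
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

The selected inputs would then be passed to a separate regression model, which in the paper is a support vector machine or a feedforward neural network. A larger `alpha` removes more inputs at the cost of stronger shrinkage of the remaining coefficients.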

2019, Vol 42 (7), pp. 1301-1312
Author(s): Wen Wu, Shah Faisal

In recent years, with the development of artificial intelligence, data-driven methodologies have been widely studied for fault diagnosis and detection, since the growing complexity of modern systems makes mechanism-model information difficult to obtain. In human health monitoring in particular, a mechanism model is very difficult to obtain. Existing challenges, such as the huge amount of data, high data dimension, and large noise interference, make data-driven approaches more suitable. To address these problems, we present a principal component analysis-support vector machine (PCA-SVM) method with different kernels to reduce data dimension, and two sets of breast cancer data are used to verify the method. Additionally, support vector machine-recursive feature elimination (SVM-RFE), the original SVM with different kernels, PCA, and modified PCA (MPCA) methods are applied to diagnose malignant cancer for comparison with PCA-SVM. In experiments on the two breast cancer datasets obtained from the University of Wisconsin Hospital, PCA-SVM with a radial basis function (RBF) kernel performs better than the other methods. Finally, PCA-SVM in this study uses only six principal components and obtains better accuracy (97.19%) than most previous studies.
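
The dimension-reduction half of a PCA-SVM pipeline can be sketched as follows. This is only the PCA step, computed here by power iteration with deflation on the covariance matrix; the SVM classifier is not shown, and the abstract does not say how the authors computed their components. The toy data, the choice of two components (the paper reports six on its own data), and all function names are assumptions for illustration.

```python
import math
import random


def covariance_matrix(X):
    """Return the sample covariance matrix of the rows of X and the column means."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return cov, means


def top_eigenvector(matrix, rng, n_iter=500):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric PSD matrix."""
    p = len(matrix)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the matrix has no variance left to explain
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p))
    return v, eigenvalue


def pca_fit(X, n_components, seed=0):
    """Return the leading principal directions and the column means."""
    rng = random.Random(seed)
    cov, means = covariance_matrix(X)
    p = len(cov)
    components = []
    for _ in range(n_components):
        v, lam = top_eigenvector(cov, rng)
        components.append(v)
        # Deflation: remove the direction just found so the next pass finds the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    return components, means


def pca_transform(X, components, means):
    """Project centered samples onto the principal directions."""
    return [[sum((row[j] - means[j]) * comp[j] for j in range(len(means)))
             for comp in components] for row in X]


# Hypothetical 3-D data that varies mostly along one direction.
rng = random.Random(1)
direction = [1 / 3, 2 / 3, 2 / 3]
X = []
for _ in range(100):
    t = rng.gauss(0.0, 5.0)
    X.append([t * d + rng.gauss(0.0, 0.1) for d in direction])

components, means = pca_fit(X, n_components=2)
Z = pca_transform(X, components, means)
print(len(X[0]), "->", len(Z[0]))  # 3 -> 2
```

In a full pipeline, the projected samples `Z` rather than the raw attributes would be handed to the classifier, so the kernel only has to operate in the reduced space.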


Author(s): Ade Jamal, Annisa Handayani, Ali Akbar Septiandri, Endang Ripmiatin, Yunus Effendi

Breast cancer is a leading cause of death among women. Predicting breast cancer at an early stage increases the chance of a cure. This requires a prediction tool that can classify a breast tumor as either a harmful malignant tumor or a harmless benign tumor. In this paper, two machine learning algorithms, Support Vector Machine and Extreme Gradient Boosting, are compared for classification. Prior to classification, the number of data attributes is reduced by extracting features from the raw data using Principal Component Analysis. A clustering method, K-Means, is also used for dimensionality reduction alongside Principal Component Analysis. The paper presents a comparison of four models, each combining one of the two dimensionality reduction methods with one of the two classifiers, applied to the Wisconsin Breast Cancer Dataset. The comparison is measured using accuracy, sensitivity, and specificity metrics computed from the confusion matrices. The experimental results indicate that K-Means, which is not usually used for dimensionality reduction, can perform well compared to the popular Principal Component Analysis.
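
The abstract does not say how K-Means is turned into a dimensionality reduction step. One common reading, assumed here, is to replace each sample by its distances to the k cluster centroids, so that k becomes the new number of features. The sketch below follows that assumption with a plain Lloyd's algorithm; the toy data and all function names are hypothetical.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(X, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for row in X:
            nearest = min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            clusters[nearest].append(row)
        new_centroids = []
        for c in range(k):
            if clusters[c]:
                dim = len(clusters[c][0])
                new_centroids.append(
                    [sum(r[j] for r in clusters[c]) / len(clusters[c]) for j in range(dim)]
                )
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids


def distance_features(X, centroids):
    """Map each sample to its Euclidean distances to the centroids."""
    return [[math.sqrt(squared_distance(row, c)) for c in centroids] for row in X]


# Hypothetical data: two well-separated groups with four attributes each.
rng = random.Random(42)
blob_a = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
blob_b = [[rng.gauss(10.0, 1.0) for _ in range(4)] for _ in range(50)]
X = blob_a + blob_b

centroids = kmeans(X, k=2)
features = distance_features(X, centroids)
print(len(X[0]), "->", len(features[0]))  # 4 -> 2
```

The reduced `features` would then be passed to a classifier in place of the original attributes. Unlike PCA, this mapping is nonlinear, and the number of output features is set by the number of clusters.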

