Mahalanobis-ANOVA criterion for optimum feature subset selection in multi-class planetary gear fault diagnosis

2021 ◽  
pp. 107754632110291
Author(s):  
Setti Suresh ◽  
VPS Naidu

This article presents an empirical analysis of gear fault diagnosis over five fault classes. The analysis was used to develop a novel feature selection criterion that yields an optimum feature subset, improving planetary gear fault classification accuracy over feature-ranking genetic algorithms. We followed the traditional fault-diagnosis approach: the raw vibration signal was divided into fixed-length epochs, and statistical time-domain features were extracted from the segmented signal to represent the data in a compact, discriminative form. Scale-invariant Mahalanobis distance–based feature selection with the ANOVA statistical test was used as the criterion to identify the optimum feature subset. The multi-class Support Vector Machine (SVM) algorithm was used to classify the gear faults. The highest gear fault classification accuracy of 99.89% (load case) was achieved by using the proposed Mahalanobis-ANOVA criterion for optimum feature subset selection followed by the multi-class SVM. The developed feature selection criterion is a data-driven model that captures the nonlinearity in a signal. The diagnostic consistency of the multi-class SVM was verified through 100 Monte Carlo runs, and the classifier's diagnostic ability is presented using a confusion matrix and receiver operating characteristics.
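The abstract does not spell the criterion out; a minimal sketch of one plausible reading — the one-way ANOVA test as a statistical gate and a scale-invariant, diagonal-covariance Mahalanobis spread of the class means as the ranking score — could look like the following (the function name and the `alpha` gate are assumptions):

```python
import numpy as np
from scipy.stats import f_oneway

def mahalanobis_anova_select(X, y, alpha=0.01):
    """Rank features by a scale-invariant (diagonal-Mahalanobis) spread of
    the class means, keeping only features whose one-way ANOVA test rejects
    'all class means are equal' at level `alpha`. Best feature first."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    # pooled within-class variance per feature (diagonal covariance only)
    var = np.array([X[y == c].var(axis=0) for c in classes]).mean(axis=0)
    # Mahalanobis-style, scale-invariant spread of the class means
    maha = ((mu - mu.mean(axis=0)) ** 2 / var).sum(axis=0)
    # ANOVA gate: drop features that do not separate the classes
    pvals = np.array([f_oneway(*(X[y == c, j] for c in classes)).pvalue
                      for j in range(X.shape[1])])
    keep = np.flatnonzero(pvals < alpha)
    return keep[np.argsort(maha[keep])[::-1]]
```

The surviving, ranked indices would then feed the multi-class SVM stage.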

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Mohammad Aljanabi ◽  
Mohd Arfian Ismail ◽  
Vitaly Mezhuyev

Many optimisation-based intrusion detection algorithms have been developed and are widely used for intrusion identification. This trend is driven by the growing number of audit data features and the declining performance of human-based intrusion detection systems in terms of classification accuracy, false alarm rate, and classification time. Feature selection and classifier parameter tuning are important factors that affect the performance of any intrusion detection system. This paper presents and discusses in detail an improved intrusion detection algorithm for multiclass classification. The proposed method combines the improved teaching-learning-based optimisation (ITLBO) algorithm, the improved parallel JAYA (IPJAYA) algorithm, and a support vector machine (SVM). ITLBO with a supervised machine learning (ML) technique was used for feature subset selection (FSS). Selecting the fewest features without reducing result accuracy makes FSS a multiobjective optimisation problem. This work proposes ITLBO as an FSS mechanism and explores its algorithm-specific, parameterless design (no parameter tuning is required during optimisation). IPJAYA was used to update the C and gamma parameters of the SVM. Several experiments on a prominent intrusion ML dataset showed significant improvements with the proposed ITLBO-IPJAYA-SVM algorithm over the classical TLBO and JAYA algorithms.
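The abstract does not give the IPJAYA update itself; a sketch of the plain, parameterless JAYA rule it builds on, applied to a (C, gamma) search box, might look like this — `objective` stands in for the SVM cross-validation error the paper minimises, and the population size and iteration count are assumptions:

```python
import numpy as np

def jaya_search(objective, lo, hi, pop=8, iters=40, seed=1):
    """Plain JAYA: move every candidate toward the best solution and away
    from the worst; no algorithm-specific parameters need tuning."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    P = rng.uniform(lo, hi, size=(pop, lo.size))
    f = np.array([objective(p) for p in P])
    for _ in range(iters):
        best, worst = P[f.argmin()], P[f.argmax()]
        r1, r2 = rng.random((2, pop, lo.size))
        # JAYA update: x' = x + r1*(best - |x|) - r2*(worst - |x|)
        cand = np.clip(P + r1 * (best - np.abs(P)) - r2 * (worst - np.abs(P)),
                       lo, hi)
        fc = np.array([objective(p) for p in cand])
        better = fc < f                      # greedy acceptance
        P[better], f[better] = cand[better], fc[better]
    return P[f.argmin()], float(f.min())
```

IPJAYA additionally parallelises the population and modifies this update; in the paper's setting the candidate (C, gamma) pair would be scored by SVM cross-validation rather than the toy objective used here.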


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1496
Author(s):  
Hao Liang ◽  
Yiman Zhu ◽  
Dongyang Zhang ◽  
Le Chang ◽  
Yuming Lu ◽  
...  

In analog circuits, component parameters have tolerances, and faulty component parameters follow a wide distribution, which hinders classification-based diagnosis. To tackle this problem, this article proposes a soft fault diagnosis method that combines the improved barnacles mating optimizer (BMO) algorithm with the support vector machine (SVM) classifier, achieving minimum-redundancy maximum-relevance feature dimension reduction with fuzzy mutual information. Concretely, first, the improved BMO algorithm is used to optimize the parameters for learning and classification. We adopt six test functions and three data sets from the University of California, Irvine (UCI) machine learning repository to compare the SVM classifier under five different optimization algorithms. The results show that the SVM classifier combined with the improved BMO algorithm achieves high classification accuracy. Second, fuzzy mutual information and an enhanced minimum-redundancy maximum-relevance principle are applied to reduce the dimension of the feature vector. Finally, a circuit experiment verifies that the proposed method classifies faults effectively whether the fault parameters are fixed or distributed. The accuracy of the proposed fault diagnosis method is 92.9% when the fault parameters are distributed, 1.8% higher than other classifiers on average; when the fault parameters are fixed, the accuracy is 99.07%, 0.7% higher than other classifiers on average.
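The minimum-redundancy maximum-relevance step can be sketched with crisp (discretised) mutual information standing in for the paper's fuzzy mutual information — a deliberate simplification, since the fuzzy variant is the paper's contribution:

```python
import numpy as np

def mutual_info(a, b):
    """Plug-in mutual information (nats) between two discretised variables."""
    ja = np.unique(a, return_inverse=True)[1]
    jb = np.unique(b, return_inverse=True)[1]
    c = np.zeros((ja.max() + 1, jb.max() + 1))
    np.add.at(c, (ja, jb), 1)                 # contingency table
    p = c / c.sum()
    px, py = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def mrmr(X, y, k):
    """Greedy mRMR: at each step add the feature maximising
    relevance-to-label minus mean redundancy with the chosen set."""
    rel = np.array([mutual_info(X[:, j], y) for j in range(X.shape[1])])
    sel = [int(rel.argmax())]
    while len(sel) < k:
        red = np.array([np.mean([mutual_info(X[:, j], X[:, s]) for s in sel])
                        for j in range(X.shape[1])])
        score = rel - red
        score[sel] = -np.inf                  # never re-pick a feature
        sel.append(int(score.argmax()))
    return sel
```

Replacing `mutual_info` with a fuzzy-MI estimate recovers the spirit of the paper's filter.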


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255307
Author(s):  
Fujun Wang ◽  
Xing Wang

Feature selection is an important task in big data analysis and information retrieval. It reduces the number of features by removing noisy and extraneous data. This paper proposes a feature subset selection algorithm based on damping oscillation theory and a support vector machine classifier, called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed to measure the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm whose position-update formula has been modified to achieve better results. Third, the filter and wrapper models are dynamically balanced by damping oscillation theory to locate an optimal feature subset. MKMDIGWO therefore combines the efficiency of the filter model with the high precision of the wrapper model. Experiments on five UCI public data sets and two microarray data sets demonstrate that MKMDIGWO attains higher classification accuracy than four other state-of-the-art algorithms; its maximum ACC value is at least 0.5% higher than the other algorithms on 10 data sets.
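The filter model's two ingredients can be sketched directly: Kendall correlation with the label for relevance, and pairwise Euclidean distance between standardised features as a proxy for (low) redundancy. How the paper combines the two terms is not stated in the abstract; simply adding them, as below, is an assumption:

```python
import numpy as np
from scipy.stats import kendalltau

def filter_score(X, y, subset):
    """Score a candidate subset: mean |Kendall tau| with the label
    (relevance, maximised) plus mean pairwise Euclidean distance between
    the standardised features (diversity, i.e. low redundancy)."""
    Z = (X - X.mean(0)) / X.std(0)
    rel = np.mean([abs(kendalltau(X[:, j], y)[0]) for j in subset])
    if len(subset) < 2:
        return float(rel)
    dist = [np.linalg.norm(Z[:, a] - Z[:, b])
            for i, a in enumerate(subset) for b in subset[i + 1:]]
    # divide by sqrt(n) so both terms stay O(1) regardless of sample size
    return float(rel + np.mean(dist) / len(X) ** 0.5)
```

In MKMDIGWO such a score would guide the grey-wolf wrapper toward relevant, non-redundant subsets.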


Author(s):  
Ricco Rakotomalala ◽  
Faouzi Mhamdi

In this chapter, we are interested in protein classification starting from primary structures. The goal is to automatically assign protein sequences to their families. The main originality of the approach is that we directly apply the text categorization framework to protein classification with very minor modifications. The main steps of the task are clearly identified: we extract features from the unstructured dataset using fixed-length n-gram descriptors; we select and combine the most relevant ones for the learning phase; and we then select the most promising learning algorithm to produce an accurate predictive model. We obtain two main results. First, the approach is credible, giving accurate results with descriptors as short as 2-grams. Second, in our context, where many irrelevant descriptors are automatically generated, we must combine aggressive feature selection algorithms with low-variance classifiers such as the SVM (Support Vector Machine).
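The fixed-length n-gram descriptors are straightforward to reproduce; a minimal sketch for 2-grams over the 20 standard amino acids, with a fixed column order so every sequence maps to the same 400-dimensional count vector:

```python
from collections import Counter
from itertools import product

AMINO = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino-acid letters

def ngram_features(seq, n=2):
    """Fixed-length n-gram descriptor of a primary sequence: the count of
    each possible n-gram, in a fixed column order (20**n columns)."""
    cols = ["".join(p) for p in product(AMINO, repeat=n)]
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    return [counts[c] for c in cols]
```

Feature selection then operates on these 400 columns, most of which are zero or irrelevant for any given family.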


Author(s):  
Alok Kumar Shukla ◽  
Pradeep Singh ◽  
Manu Vardhan

The explosion of high-dimensional datasets in scientific repositories has encouraged interdisciplinary research in data mining, pattern recognition, and bioinformatics. The fundamental problem for any individual Feature Selection (FS) method is to extract features that are informative for the classification model and to detect malignant disease at low computational cost. In addition, existing FS approaches overlook the fact that, for a given cardinality, there can be several subsets carrying similar information. This paper introduces a novel hybrid FS algorithm, called Filter-Wrapper Feature Selection (FWFS), for classification problems and addresses the limitations of existing methods. In the proposed model, the front-end filter ranking method, Conditional Mutual Information Maximization (CMIM), selects the highly ranked feature subset, while the succeeding method, a Binary Genetic Algorithm (BGA), accelerates the search for significant feature subsets. One merit of the proposed method is that, unlike an exhaustive method, it speeds up the FS procedure without sacrificing classification accuracy on the reduced dataset when a learning model is applied to the selected subsets of features. The efficacy of the proposed FWFS method is examined with a Naive Bayes (NB) classifier, which serves as the fitness function. The effectiveness of the selected feature subset is evaluated using numerous classifiers on five biological datasets and five UCI datasets of varied dimensionality and numbers of instances. The experimental results emphasize that the proposed method significantly reduces the number of features and outperforms existing methods. For microarray datasets, the lowest classification accuracy is 61.24% on the SRBCT dataset and the highest is 99.32% on diffuse large B-cell lymphoma (DLBCL); on the UCI datasets, the lowest is 40.04% on Lymphography using k-nearest neighbor (k-NN) and the highest is 99.05% on Ionosphere using a support vector machine (SVM).
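The BGA wrapper stage can be sketched as a plain binary genetic algorithm over feature masks. The population size, tournament selection, one-point crossover, and mutation rate below are all assumptions, and `fitness(mask)` stands in for the Naive Bayes accuracy the paper uses:

```python
import numpy as np

def binary_ga(fitness, n_feat, pop=20, gens=30, p_mut=0.05, seed=0):
    """Minimal binary GA: chromosomes are boolean feature masks and
    `fitness(mask)` is maximised (NB accuracy in the paper's setting)."""
    rng = np.random.default_rng(seed)
    P = rng.random((pop, n_feat)) < 0.5
    for _ in range(gens):
        f = np.array([fitness(m) for m in P])
        # binary tournament selection
        i, j = rng.integers(0, pop, (2, pop))
        parents = P[np.where(f[i] > f[j], i, j)]
        # one-point crossover on consecutive parent pairs
        kids = parents.copy()
        for k, c in enumerate(rng.integers(1, n_feat, pop // 2)):
            kids[2 * k, c:], kids[2 * k + 1, c:] = \
                parents[2 * k + 1, c:], parents[2 * k, c:]
        # bit-flip mutation
        kids ^= rng.random((pop, n_feat)) < p_mut
        P = kids
    f = np.array([fitness(m) for m in P])
    return P[f.argmax()]
```

In FWFS this search would run only over the features that survive the CMIM filter, which is what keeps the wrapper stage fast.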


Author(s):  
Maria Mohammad Yousef

Medical dataset classification has become one of the major problems in data mining research. Every database has a given number of features, but some of them can be redundant or even harmful, disrupting the classification process; this is known as the high-dimensionality problem. Dimensionality reduction during data preprocessing is critical for increasing the performance of machine learning algorithms, and feature subset selection contributes to dimensionality reduction while yielding a significant improvement in classification accuracy. In this paper, we propose a new hybrid feature selection approach (a GA assisted by KNN) to deal with high dimensionality in biomedical data classification. The proposed method first combines a GA with KNN for feature selection to find the optimal subset of features, using the classification accuracy of the k-Nearest Neighbor (kNN) method as the GA's fitness function. After the best subset of features is selected, a Support Vector Machine (SVM) is used as the classifier. The proposed method was evaluated on five medical datasets from the UCI Machine Learning Repository. The technique performs admirably on these databases, achieving higher classification accuracy while using fewer features.
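The GA side of this hybrid is a standard binary-mask search; the piece specific to it is the fitness, the kNN accuracy on the masked features. A leave-one-out sketch (leave-one-out is an assumption — the abstract does not say how accuracy is estimated):

```python
import numpy as np

def knn_loo_accuracy(X, y, mask, k=3):
    """GA fitness sketch: leave-one-out k-NN accuracy using only the
    features switched on in the boolean `mask`."""
    Z = X[:, mask]
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)             # a point never votes for itself
    nn = np.argsort(d, axis=1)[:, :k]       # k nearest neighbours per point
    pred = np.array([np.bincount(y[row]).argmax() for row in nn])
    return float((pred == y).mean())
```

The GA would maximise this score over masks, and the winning mask's features would then train the final SVM.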


2012 ◽  
Vol 455-456 ◽  
pp. 1169-1174
Author(s):  
Jia Li Tang ◽  
Chen Rong Huang ◽  
Jian Min Zuo

2011 ◽  
Vol 66-68 ◽  
pp. 1982-1987
Author(s):  
Wei Niu ◽  
Guo Qing Wang ◽  
Zheng Jun Zhai ◽  
Juan Cheng

The vibration signals of rotating machinery in operation contain a wealth of information about its running condition, and extracting and identifying fault signals during speed changes is necessary for the fault diagnosis of rotating machinery. This paper improves the DDAG (Decision Directed Acyclic Graph) classification method and proposes a new fault diagnosis model based on the support vector machine to address the scarcity of fault data samples that restricts intelligent fault diagnosis of rotating machinery. The test results demonstrate that the model has good classification precision and can correctly diagnose faults.
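The DDAG idea the paper builds on: arrange the one-vs-one SVMs in a decision DAG so that each pairwise test eliminates one candidate class, giving a prediction after n−1 evaluations instead of voting over all n(n−1)/2 classifiers. A sketch with arbitrary pairwise deciders standing in for trained SVMs (it assumes `classes` is sorted so `(a, b)` with a < b indexes the decider dict):

```python
def ddag_predict(x, classes, pairwise):
    """Walk the decision DAG: test the first surviving class against the
    last; `pairwise[(a, b)](x)` returns the winner and the loser is
    eliminated, until a single class remains."""
    alive = list(classes)                 # assumed sorted ascending
    while len(alive) > 1:
        a, b = alive[0], alive[-1]
        winner = pairwise[(a, b)](x)
        alive.remove(b if winner == a else a)
    return alive[0]
```

With trained SVMs, each decider would be the sign of that pair's decision function; here nearest-centre rules illustrate the traversal.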

