Feature Extraction and Classification for the Detection of Knee Joint Disorders using Random Forest Classifier

A non-invasive technique using knee joint vibroarthrographic (VAG) signals can be used for the early diagnosis of knee joint disorders. Among the algorithms devised for detecting knee joint disorders from VAG signals, those based on entropy measures can provide better performance. In this work, the VAG signal is preprocessed by wavelet decomposition into sub-band signals. Features of the decomposed sub-bands, such as approximate entropy, sample entropy and wavelet energy, are extracted as quantified measures of the signal's complexity. Feature selection based on Principal Component Analysis (PCA) is performed to select the significant features. The extracted features are then used to classify the VAG signal as normal or abnormal using a random forest classifier. The classifier is observed to provide better accuracy when feature selection using PCA is applied. The results show that the classifier is able to classify the signal with an accuracy of 87%, an error rate of 0.13, a sensitivity of 0.874 and a specificity of 0.777.
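
The entropy features named in this abstract can be made concrete with a small example. The following is a minimal sketch of sample entropy in plain Python, not the authors' implementation. The embedding dimension m = 2, the tolerance of 0.2 times the signal's standard deviation, the use of "less than or equal" for a match, and the function name sample_entropy are all assumptions chosen for illustration, since the abstract does not state them.

```python
import math


def sample_entropy(signal, m=2, r_factor=0.2):
    """Sample entropy of a 1-D sequence (illustrative sketch).

    m and r_factor are assumed defaults, not values from the abstract.
    """
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    r = r_factor * std  # tolerance for calling two templates "similar"

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # skip self-matches
                # Chebyshev distance between the two templates.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -math.log(a / b)
```

A strictly periodic signal gives a sample entropy of zero under this definition, and an irregular signal gives a larger value. In a pipeline like the one described, such a value would be computed once per wavelet sub-band.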

Detection of Knee Joint Disorders using SVM Classifier

International Journal of Scientific Research in Science and Technology ◽

10.32628/ijsrst218535 ◽

2021 ◽

pp. 261-271

Author(s):

Alphonsa Salu S. J. ◽

Jeraldin Auxillia D

Keyword(s):

Principal Component Analysis ◽

Feature Selection ◽

Knee Joint ◽

Principal Component ◽

Approximate Entropy ◽

Component Analysis ◽

Invasive Technique ◽

Support Vector ◽

SVM Classifier ◽

Entropy Measures

A non-invasive technique using knee joint vibroarthrographic (VAG) signals can be used for the early diagnosis of knee joint disorders. Among the algorithms devised for detecting knee joint disorders from VAG signals, those based on entropy measures can provide better performance. In this work, the VAG signal is preprocessed by wavelet decomposition into sub-band signals. Features of the decomposed sub-bands, such as approximate entropy, sample entropy and wavelet energy, are extracted as quantified measures of the signal's complexity. Feature selection based on Principal Component Analysis (PCA) is performed to select the significant features. The extracted features are then used to classify the VAG signal as normal or abnormal using a support vector machine. The classifier is observed to provide better accuracy when feature selection using PCA is applied. The results show that the classifier was able to classify the signal with an accuracy of 82.6%, an error rate of 0.174, a sensitivity of 1.0 and a specificity of 0.888.
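
The four figures reported here all derive from a binary confusion matrix. The sketch below shows how they are conventionally computed. It assumes that "abnormal" is the positive class (label 1), and the function name binary_metrics and the sample labels are hypothetical; none of this is taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, error rate, sensitivity and specificity for two classes.

    Assumes both classes occur in y_true; 'positive' marks the abnormal class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # abnormal correctly flagged
            else:
                fn += 1  # abnormal missed
        else:
            if pred == positive:
                fp += 1  # normal wrongly flagged
            else:
                tn += 1  # normal correctly passed
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical labels, for illustration only.
metrics = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

The error rate is the complement of the accuracy, which matches the pairing of 82.6% and 0.174 above. On a single test set, accuracy is a weighted average of sensitivity and specificity, with weights equal to the class proportions, so it lies between the two.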

Towards a software defect proneness model: feature selection

Applied Aspects of Information Technology ◽

10.15276/aait.04.2021.5 ◽

2021 ◽

Vol 4 (4) ◽

pp. 354-365

Author(s):

Vitaliy S. Yakovyna ◽

Ivan I. Symets

Keyword(s):

Principal Component Analysis ◽

Feature Selection ◽

Random Forest ◽

Software Reliability ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Tree Classifier ◽

Code Metrics ◽

Software Code

This article focuses on improving static models of software reliability by using machine learning methods to select the software code metrics that most strongly affect reliability. The study used a merged dataset from the PROMISE Software Engineering repository, which contained data on testing software modules of five programs and twenty-one code metrics. For the prepared sample, the most important features affecting the quality of software code were selected using the following feature selection methods: Boruta, Stepwise selection, Exhaustive Feature Selection, Random Forest Importance, LightGBM Importance, Genetic Algorithms, Principal Component Analysis, and the Xverse Python package. Based on voting over the results of these feature selection methods, a static (deterministic) model of software reliability was built, which establishes the relationship between the probability of a defect in a software module and the metrics of its code. It is shown that this model includes such code metrics as the branch count of a program, McCabe’s lines of code and cyclomatic complexity, and Halstead’s total number of operators and operands, intelligence, volume, and effort value. The effectiveness of the different feature selection methods was compared, in particular by studying the effect of the feature selection method on classification accuracy with the following classifiers: Random Forest, Support Vector Machine, k-Nearest Neighbors, Decision Tree classifier, AdaBoost classifier, and Gradient Boosting for classification. It is shown that using any of the feature selection methods increases classification accuracy by at least ten percent compared to the original dataset, which confirms the importance of this procedure for predicting software defects from metric datasets that contain a significant number of highly correlated software code metrics. The best forecast accuracy for most classifiers was reached using the set of features obtained from the proposed static model of software reliability. In addition, it is shown that separate methods, such as Autoencoder, Exhaustive Feature Selection and Principal Component Analysis, can also be used with an insignificant loss of classification and prediction accuracy.
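
The voting step described in this abstract can be illustrated with a short sketch. It assumes a simple rule in which a feature is kept when at least min_votes methods select it; the abstract does not specify the voting rule, and the function name vote_features, the threshold, and the per-method selections below are hypothetical illustrations, not the study's data.

```python
from collections import Counter


def vote_features(selections, min_votes):
    """Keep features chosen by at least min_votes selection methods.

    selections maps a method name to the features that method selected.
    """
    votes = Counter()
    for features in selections.values():
        votes.update(set(features))  # one vote per method per feature
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical selections, for illustration only.
selections = {
    "boruta": ["branch_count", "cyclomatic_complexity", "volume"],
    "random_forest_importance": ["branch_count", "effort", "volume"],
    "stepwise": ["branch_count", "cyclomatic_complexity", "comment_lines"],
}
consensus = vote_features(selections, min_votes=2)
```

With these made-up inputs, the consensus set keeps the features that at least two of the three methods agree on. A higher threshold gives a smaller, more conservative feature set.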
