Identifying discriminative features for diagnosis of Kashin-Beck disease among adolescents

2021
Vol 22 (1)
Author(s):
Yanan Zhang
Xiaoli Wei
Chunxia Cao
Fangfang Yu
Wenrong Li
...  

Abstract Introduction: Kashin-Beck disease (KBD) involves damage to multiple joints and presents with variable clinical symptoms, which poses a great challenge for clinical practitioners in diagnosing KBD. However, it remains unclear which clinical features of KBD are most informative for its diagnosis among adolescents. Methods: We first manually extracted 26 candidate features, including clinical manifestations and pathological changes on X-ray images, from 400 KBD and 400 non-KBD adolescents. With these features, we applied four classification methods, i.e., random forest algorithms (RFA), artificial neural networks (ANNs), support vector machines (SVMs) and linear regression (LR), together with four feature selection methods, i.e., RFA, minimum redundancy maximum relevance (mRMR), support vector machine recursive feature elimination (SVM-RFE) and Relief. The diagnostic performance of the different classification models was evaluated by sensitivity, specificity, accuracy, and the area under the receiver operating characteristic (ROC) curve (AUC). Results: Our results demonstrated that 10 of the 26 features showed more powerful discriminative performance, regardless of the choice of classification model and feature selection method. These ten discriminative features were alterations of the distal end of the phalanges, metaphysis alterations and carpal alterations, and the clinical manifestations of ankle joint movement limitation, enlarged finger joints, flexion of the distal part of the fingers, elbow joint movement limitation, squatting limitation, deformed finger joints, and wrist joint movement limitation. Conclusions: The selected ten discriminative features could provide a fast, effective diagnostic standard for adolescent KBD.
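
As a rough illustration of this kind of pipeline (not the authors' exact implementation), the sketch below pairs one of the feature selection methods, SVM-RFE, with the four classifier families and scores them by cross-validated accuracy and ROC AUC. The feature matrix `X` (subjects by 26 features) and labels `y` are assumed to exist, and scikit-learn's logistic regression stands in for the linear model.

```python
from sklearn.feature_selection import RFE
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_validate

classifiers = {
    "RFA": RandomForestClassifier(n_estimators=200, random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "LR": LogisticRegression(max_iter=1000),  # logistic regression as the linear model
}
for name, clf in classifiers.items():
    pipe = Pipeline([
        ("select", RFE(SVC(kernel="linear"), n_features_to_select=10)),  # keep 10 features
        ("clf", clf),
    ])
    scores = cross_validate(pipe, X, y, cv=5, scoring=["accuracy", "roc_auc"])
    print(name,
          round(scores["test_accuracy"].mean(), 3),
          round(scores["test_roc_auc"].mean(), 3))
```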

Sensors
2021
Vol 21 (19)
pp. 6407
Author(s):
Nina Pilyugina
Akihiko Tsukahara
Keita Tanaka

The aim of this study was to find an efficient method to determine features that characterize octave illusion data. Specifically, this study compared the efficiency of several automatic feature selection methods for automatic feature extraction from auditory steady-state response (ASSR) data of brain activity, in order to distinguish auditory octave illusion and non-illusion groups by the difference in ASSR amplitudes using machine learning. We compared univariate selection, recursive feature elimination, principal component analysis, and feature importance, verifying the results of the feature selection methods with several machine learning algorithms: linear regression, random forest, and support vector machine. Univariate selection with the SVM as the classification method showed the highest accuracy, 75%, compared to 66.6% without feature selection. The obtained results will be used in future work on explaining the mechanism behind the octave illusion phenomenon and on creating an algorithm for automatic octave illusion classification.
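
A minimal sketch of the best-performing combination reported here, univariate selection feeding an SVM, assuming a trial-by-feature matrix `X` of ASSR amplitude features and illusion/non-illusion labels `y` (both hypothetical names); the number of retained features is illustrative.

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=20)),  # univariate (ANOVA F) selection
    ("svm", SVC(kernel="rbf", C=1.0)),
])
acc = cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()
print(f"cross-validated accuracy: {acc:.3f}")
```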


2016
Vol 2016
pp. 1-12
Author(s):
Shahrbanoo Goli
Hossein Mahjub
Javad Faradmal
Hoda Mashayekhi
Ali-Reza Soltanian

The Support Vector Regression (SVR) model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed, and SVRs with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: a combination of SVR and statistical tests, univariate feature selection based on the concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC) dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than by nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the features most strongly associated with survival. According to the obtained results, the performance of the linear and nonlinear kernels is comparable. The proposed SVR model performs similarly to or slightly better than the other models, and SVR performs similarly to or better than Cox when all features are included in the model.
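
A minimal sketch of the linear-SVR idea under simplifying assumptions: it fits on the observed survival times directly and ignores censoring, unlike the paper's model. `X`, `time`, and `event` are hypothetical arrays, and the concordance index comes from the lifelines package.

```python
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from lifelines.utils import concordance_index

X_tr, X_te, t_tr, t_te, e_tr, e_te = train_test_split(X, time, event, random_state=0)
svr = SVR(kernel="linear", C=1.0).fit(X_tr, t_tr)           # naive fit on observed times
c_index = concordance_index(t_te, svr.predict(X_te), e_te)  # ordering agreement with actual survival
print(f"c-index on the test split: {c_index:.3f}")
```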


Author(s):  
Siddesh G. M.
Gururaj T.

A key step in addressing the classification problem is the selection of genes, removing those that are redundant or irrelevant. The proposed Type Combination Approach - Feature Selection (TCA-FS) model uses efficient feature selection methods so that classification accuracy can be enhanced. Three classifiers, K Nearest Neighbour (KNN), Support Vector Machine (SVM) and Random Forest (RF), are selected for evaluating the chosen feature selection methods and the prediction accuracy. The effects of three new feature selection approaches, Improved Recursive Feature Elimination (IRFE), Revised Maximum Information Coefficient (RMIC), and Upgraded Masked Painter (UMP), are analysed. These three proposed techniques are compared with existing techniques and are validated with (i) a stability determination test, (ii) classification accuracy, and (iii) error rates. Due to the selection of a proper threshold on classification, the proposed TCA-FS method provides higher accuracy compared to the existing system.
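
IRFE, RMIC and UMP are the authors' own methods and their details are not given here; the sketch below uses plain RFE and mutual-information ranking as rough stand-ins, then evaluates the three classifiers on the selected genes. `X` (a NumPy array of samples by genes) and `y` are assumed; in a real experiment the selection should be nested inside the cross-validation to avoid leakage.

```python
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Two stand-in rankings (not IRFE/RMIC/UMP): linear-SVM RFE and mutual information.
rfe_mask = RFE(SVC(kernel="linear"), n_features_to_select=50).fit(X, y).support_
mi_mask = SelectKBest(mutual_info_classif, k=50).fit(X, y).get_support()

for clf in (KNeighborsClassifier(), SVC(kernel="linear"), RandomForestClassifier(random_state=0)):
    for sel_name, mask in (("RFE", rfe_mask), ("MI", mi_mask)):
        acc = cross_val_score(clf, X[:, mask], y, cv=5).mean()  # selection fitted on all data (illustrative only)
        print(type(clf).__name__, sel_name, round(acc, 3))
```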


Author(s):  
B. Venkatesh
J. Anuradha

In microarray data it is difficult to achieve high classification accuracy due to the high dimensionality and the presence of irrelevant and noisy data; such data also contain many gene expression values but few samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, a filter phase and a wrapper phase. In the filter phase, an ensemble technique is used for aggregating the feature ranks of the Relief, minimum redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods; Fuzzy Gaussian membership function ordering is used for aggregating the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used for selecting the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods using five benchmark datasets. For evaluation, various performance metrics such as Accuracy, Recall, Precision, and F1-Score are used. The experimental results show that the proposed method outperforms the other feature selection methods.
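
The sketch below illustrates only the filter-phase rank aggregation, with simple scikit-learn filters standing in for Relief, mRMR and FC, and an RBF-kernel SVM as the evaluator; the IBPSO wrapper phase is omitted. `X` (a NumPy array of expression values) and `y` are assumed, and plain rank averaging replaces the fuzzy Gaussian ordering.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Three filter scores per feature (stand-ins for Relief, mRMR and FC).
f_scores, _ = f_classif(X, y)
mi_scores = mutual_info_classif(X, y, random_state=0)
corr_scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# Convert each score vector to ranks (best = 1) and average the ranks per feature.
avg_rank = np.mean([rankdata(-np.asarray(s)) for s in (f_scores, mi_scores, corr_scores)], axis=0)
top_k = np.argsort(avg_rank)[:100]   # keep the 100 best-ranked features (illustrative)

acc = cross_val_score(SVC(kernel="rbf"), X[:, top_k], y, cv=5).mean()
print(f"RBF-SVM accuracy on aggregated top-k features: {acc:.3f}")
```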


2006
Vol 04 (06)
pp. 1159-1179
Author(s):
JUNG HUN OH
ANIMESH NANDI
PREM GURNANI
LYNNE KNOWLES
JOHN SCHORGE
...  

Ovarian cancer recurs at a rate of 75%, within a few months to several years after therapy. Early recurrence, though it responds better to treatment, is difficult to detect. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry has shown the potential to accurately identify disease biomarkers to help early diagnosis. A major challenge in the interpretation of SELDI-TOF data is the high dimensionality of the feature space. To tackle this problem, we have developed a multi-step data processing method composed of t-test filtering, binning and backward feature selection. A new algorithm, support vector machine-Markov blanket/recursive feature elimination (SVM-MB/RFE), is presented for the backward feature selection. This method integrates minimum-weight feature elimination by SVM-RFE with information-theory-based removal of redundant and irrelevant features by a Markov blanket. Subsequently, an SVM was used for classification. We ran the biomarker selection algorithm on 113 serum samples to identify early relapse in ovarian cancer patients after primary therapy. To validate the performance of the proposed algorithm, experiments were carried out in comparison with several other feature selection and classification algorithms.
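
A minimal sketch of the first two selection steps only, a per-feature t-test filter followed by SVM-RFE; the binning step and the Markov-blanket redundancy removal are omitted. `X` (spectral features as a NumPy array) and `y` (relapse labels 0/1) are assumed names.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Stage 1: keep features whose means differ between classes (two-sample t-test).
_, pvals = ttest_ind(X[y == 0], X[y == 1], axis=0)
keep = np.where(pvals < 0.05)[0]

# Stage 2: backward elimination with a linear SVM (SVM-RFE) on the filtered set.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=20).fit(X[:, keep], y)
selected = keep[rfe.support_]
print(len(selected), "features selected")
```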


2020
Vol 2 (1)
pp. 62
Author(s):
Luis F. Villamil-Cubillos
Jersson X. Leon-Medina
Maribel Anaya
Diego A. Tibaduiza

An electronic tongue is a device composed of a sensor array that takes advantage of the cross-sensitivity of several sensors to perform classification and quantification of liquid substances. In practice, electronic tongues generate a large amount of information that needs to be correctly analyzed to define which interactions and features are most relevant for distinguishing one substance from another. This work focuses on implementing and validating feature selection methodologies in the liquid classification process of a multifrequency large amplitude pulse voltammetric (MLAPV) electronic tongue. A multi-layer perceptron neural network (MLP NN) and a support vector machine (SVM) were used as supervised machine learning classifiers. Different feature selection techniques were used, such as a variance filter, ANOVA F-value, recursive feature elimination and model-based selection. Both 5-fold cross-validation and GridSearchCV were used to evaluate the performance of the feature selection methodology by testing various configurations and determining the best one. The methodology was validated on an imbalanced MLAPV electronic tongue dataset of 13 different liquid substances, reaching 93.85% classification accuracy.
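
A minimal sketch of one selector/classifier combination tuned the way described above, an ANOVA F-value filter and an SVM inside a 5-fold GridSearchCV. `X` (voltammetric features) and `y` (substance labels) are assumed names, and the parameter grid is illustrative.

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),   # ANOVA F-value filter
    ("clf", SVC()),
])
grid = GridSearchCV(
    pipe,
    param_grid={"select__k": [50, 100, 200], "clf__C": [1, 10, 100]},
    cv=5, scoring="accuracy",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```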


2020
pp. 3397-3407
Author(s):
Nur Syafiqah Mohd Nafis
Suryanti Awang

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. Thus, this paper proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high-dimensional text classification. This technique has the ability to measure a feature's importance in a high-dimensional text document, and it aims to increase the efficiency of the feature selection, hence obtaining promising text classification accuracy. In the first stage, TF-IDF acts as a filter approach that measures the importance of features in the text documents. In the second stage, SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets. This research executes sets of experiments using a text document collection retrieved from a benchmark repository comprising Twitter posts. Pre-processing is applied to extract relevant features. After that, the pre-processed features are divided into training and testing datasets. Next, feature selection is implemented on the training dataset by calculating the TF-IDF score for each feature, and SVM-RFE is applied for feature ranking as the next feature selection step. Only top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique is able to achieve 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high-dimensional text documents.
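
A minimal sketch of the two-stage idea, TF-IDF weighting followed by SVM-RFE ranking, with an SVM trained on the top-ranked terms. `texts` and `labels` are hypothetical variables, and the feature counts are illustrative rather than taken from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train_txt, X_test_txt, y_train, y_test = train_test_split(texts, labels, random_state=0)

vec = TfidfVectorizer(max_features=5000)                     # stage 1: TF-IDF weighting
Xtr, Xte = vec.fit_transform(X_train_txt), vec.transform(X_test_txt)

rfe = RFE(LinearSVC(), n_features_to_select=500, step=0.1)   # stage 2: SVM-RFE ranking
rfe.fit(Xtr, y_train)

clf = LinearSVC().fit(rfe.transform(Xtr), y_train)           # final SVM on top-ranked terms
print("accuracy:", round(accuracy_score(y_test, clf.predict(rfe.transform(Xte))), 3))
```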


2021
Vol 335
pp. 04001
Author(s):
Didar Dadebayev
Goh Wei Wei
Tan Ee Xion

Emotion recognition, as a branch of affective computing, has attracted great attention in the last decades, as it can enable more natural brain-computer interface systems. Electroencephalography (EEG) has proven to be an effective modality for emotion recognition, with which user affective states can be tracked and recorded, especially for primitive emotional events such as arousal and valence. Although brain signals have been shown to correlate with emotional states, the effectiveness of proposed models is somewhat limited. The challenge is improving accuracy, and appropriate extraction of valuable features may be a key to success. This study proposes a framework that incorporates fractal dimension features and a recursive feature elimination approach to enhance the accuracy of EEG-based emotion recognition. Fractal dimension and spectrum-based features are to be extracted and used for more accurate emotional state recognition. Recursive Feature Elimination will be used as the feature selection method, whereas the classification of emotions will be performed by the Support Vector Machine (SVM) algorithm. The proposed framework will be tested on a widely used public database, and the results are expected to demonstrate higher accuracy and robustness compared to other studies. The contributions of this study are primarily the improvement of EEG-based emotion classification accuracy. There is a potential restriction on how generalizable the results can be, as different EEG datasets might yield different results for the same framework. Therefore, experimenting with different EEG datasets and testing alternative feature selection schemes would be interesting future work.
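
A minimal sketch of the selection and classification stage, assuming a precomputed matrix `X_features` of fractal-dimension and spectral features per EEG epoch and labels `y` (both hypothetical names); the feature extraction itself is not shown, and the parameter values are illustrative.

```python
from sklearn.feature_selection import RFE
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(SVC(kernel="linear"), n_features_to_select=30)),  # recursive feature elimination
    ("svm", SVC(kernel="rbf", C=10)),
])
acc = cross_val_score(pipe, X_features, y, cv=5, scoring="accuracy").mean()
print(f"cross-validated accuracy: {acc:.3f}")
```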


2020
Vol 14 (3)
pp. 269-279
Author(s):
Hayet Djellali
Nacira Ghoualmi-Zine
Souad Guessoum

This paper investigates feature selection methods based on a hybrid architecture using a feature selection algorithm called Adapted Fast Correlation Based Feature selection and Support Vector Machine Recursive Feature Elimination (AFCBF-SVMRFE). AFCBF-SVMRFE has three stages and is composed of the SVM-RFE embedded method combined with Correlation-based Feature Selection. The first stage is a relevance analysis, the second is a redundancy analysis, and the third is a performance evaluation and feature restoration stage. Experiments show that the proposed method, tested with different classifiers, Support Vector Machine (SVM) and K nearest neighbours (KNN), provides the best accuracy on various datasets. The SVM classifier outperforms the KNN classifier on these data. AFCBF-SVMRFE outperforms the FCBF multivariate filter, SVM-RFE, Particle Swarm Optimization (PSO) and Artificial Bee Colony (ABC).
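
AFCBF-SVMRFE itself is not reproduced here; the sketch below approximates its relevance and redundancy stages with a simple correlation-with-target filter followed by SVM-RFE, then evaluates SVM and KNN on the selected features. `X` (a NumPy feature array) and `y` are assumed names, and the feature counts are illustrative.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Stage 1 (relevance, stand-in): rank features by absolute correlation with the target.
relevance = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
relevant = np.argsort(relevance)[::-1][:200]

# Stage 2 (redundancy, stand-in): SVM-RFE on the relevant subset.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=30).fit(X[:, relevant], y)
selected = relevant[rfe.support_]

for clf in (SVC(kernel="rbf"), KNeighborsClassifier()):
    acc = cross_val_score(clf, X[:, selected], y, cv=5).mean()
    print(type(clf).__name__, round(acc, 3))
```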

