Optimal Combination of Multivariate Filter Feature Selection and Classifier for Speech-Based Depression Detection

2021 ◽  
pp. 134-146
Author(s):  
Surbhi Sharma ◽  
Anthony J. Bustamante

In this paper, we focus on improving the performance of a speech-based uni-modal depression detection system, which is non-invasive and has lower cost and computation time than multi-modal systems. The performance of a decision system depends mainly on the choice of feature selection method and classifier. We investigate combinations of four well-known multivariate filter methods (minimum Redundancy Maximum Relevance, Scatter Ratio, Mahalanobis Distance, Fast Correlation Based feature selection) and four well-known classifiers (k-Nearest Neighbour, Linear Discriminant Classifier (LDC), Decision Tree, Support Vector Machine) to obtain a minimal set of relevant and non-redundant features. A smaller feature set speeds up the acquisition of features from speech and yields a decision system with low cost and complexity. Experimental results on the high- and low-level features used in recent work on the DAIC-WOZ dataset show that the combinations of Scatter Ratio with LDC and of Mahalanobis Distance with LDC outperform the other combinations and existing speech-based depression detection results, in both gender-independent and gender-based studies. These combinations also outperform a few multimodal systems. Low-level features were found to be more discriminative and to provide a better F1 score.
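To make the idea of a multivariate filter criterion concrete, here is a minimal sketch that scores a pair of features by the squared Mahalanobis distance between two class means, using a pooled within-class covariance. The abstract does not give the paper's exact formulation, so the pooled-covariance choice, the two-feature restriction, and all function names are assumptions made for illustration.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_cov(class_a, class_b):
    """Pooled within-class covariance matrix of two classes."""
    d = len(class_a[0])
    scatter = [[0.0] * d for _ in range(d)]
    for rows in (class_a, class_b):
        m = mean_vec(rows)
        for r in rows:
            for i in range(d):
                for j in range(d):
                    scatter[i][j] += (r[i] - m[i]) * (r[j] - m[j])
    dof = len(class_a) + len(class_b) - 2
    return [[v / dof for v in row] for row in scatter]


def mahalanobis_sq_2d(class_a, class_b):
    """Squared Mahalanobis distance between class means (two features only)."""
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    (p, q), (r, s) = pooled_cov(class_a, class_b)
    det = p * s - q * r
    if det == 0:
        raise ValueError("pooled covariance is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[s, -q], [-r, p]].
    return (dx * (s * dx - q * dy) + dy * (-r * dx + p * dy)) / det
```

A larger value suggests that the feature pair separates the two classes better; a filter method would compare this score across candidate subsets before any classifier is trained.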

2021 ◽  
Vol 11 ◽  
Author(s):  
Qi Wan ◽  
Jiaxuan Zhou ◽  
Xiaoying Xia ◽  
Jianfeng Hu ◽  
Peng Wang ◽  
...  

Objective: To evaluate the performance of 2D and 3D radiomics features with different machine learning approaches to classify SPLs based on magnetic resonance (MR) T2-weighted imaging (T2WI).

Material and Methods: A total of 132 patients with pathologically confirmed SPLs were examined and randomly divided into training (n = 92) and test (n = 40) datasets. A total of 1692 3D and 1231 2D radiomics features per patient were extracted. Both radiomics features and clinical data were evaluated. A total of 1260 classification models, comprising 3 normalization methods, 2 dimension reduction algorithms, 3 feature selection methods, and 10 classifiers with 7 different feature numbers (confined to 3–9), were compared. Ten-fold cross-validation on the training dataset was applied to choose the candidate final model. The area under the receiver operating characteristic curve (AUC), the precision-recall plot, and the Matthews Correlation Coefficient (MCC) were used to evaluate the performance of the machine learning approaches.

Results: The 3D features were significantly superior to the 2D features, yielding many more machine learning combinations with AUC greater than 0.7 in both validation and test groups (129 vs. 11). The feature selection methods Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE) and the classifiers Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Gaussian Process (GP) performed relatively better. The best performance of 3D radiomics features in the test dataset (AUC = 0.824, AUC-PR = 0.927, MCC = 0.514) was higher than that of 2D features (AUC = 0.740, AUC-PR = 0.846, MCC = 0.404). The joint 3D and 2D features (AUC = 0.813, AUC-PR = 0.926, MCC = 0.563) showed results similar to those of the 3D features. Incorporating clinical features with 3D and 2D radiomics features slightly improved the AUC to 0.836 (AUC-PR = 0.918, MCC = 0.620) and 0.780 (AUC-PR = 0.900, MCC = 0.574), respectively.

Conclusions: After algorithm optimization, 2D feature-based radiomics models yield favorable results in differentiating malignant and benign SPLs, but 3D features are still preferred because more machine learning algorithmic combinations achieve better performance with them. The feature selection methods ANOVA and RFE and the classifiers LR, LDA, SVM, and GP are more likely to demonstrate better diagnostic performance for 3D features in the current study.
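The count of 1260 models is the product of the option counts listed in the methods (3 × 2 × 3 × 10 × 7). A short sketch enumerating such a grid is given below; only ANOVA, RFE, LR, LDA, SVM, and GP are named in the abstract, so every other label is a hypothetical placeholder.

```python
from itertools import product

# Labels not named in the abstract are hypothetical placeholders.
normalizers = ["norm_1", "norm_2", "norm_3"]
reducers = ["reduce_1", "reduce_2"]
selectors = ["ANOVA", "RFE", "selector_3"]
classifiers = ["LR", "LDA", "SVM", "GP"] + [f"clf_{i}" for i in range(5, 11)]
feature_counts = range(3, 10)  # 3 to 9 features inclusive

# One tuple per candidate pipeline configuration.
grid = list(product(normalizers, reducers, selectors, classifiers, feature_counts))

print(len(grid))  # 3 * 2 * 3 * 10 * 7 = 1260
```

Each tuple in `grid` would then be evaluated with cross-validation on the training set, as the abstract describes.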


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
M. A. Duarte-Mermoud ◽  
N. H. Beltrán ◽  
S. A. Salah

Recently, a new crossover technique for genetic algorithms has been proposed. The technique, called probabilistic adaptive crossover (PAX), estimates the probability distribution of the population and stores information about the best and worst solutions of the problem being solved in a probability vector. This paper reports the use of the technique for classifying Chilean wines based on chromatograms obtained from an HPLC. PAX is used in the first stage as the feature selection method, and support vector machines (SVM) and linear discriminant analysis (LDA) are then used as classifiers. The results are compared with those obtained using the standard uniform (discrete) crossover technique and a variant of PAX called mixed crossover.


Author(s):  
Emad Kaen and Abdullah Algarni

Artificial intelligence and its branches, such as Machine Learning (ML) and Deep Learning, have recently advanced and grown in many vital fields, including robotics, smart cars, smart cities, health care, and software engineering. Software bug prediction is one of the most important uses of ML in software engineering. Feature selection is an ML technique that aims to reduce the set of features used for building models. In this paper, we propose using the Chi-Square feature selection method to calculate feature importance and then building ML models, first with the top ten features and second with the top five features, based on three well-known ML classification algorithms: Support Vector Machine (SVM), Naïve Bayes (NB), and Linear Discriminant Analysis (LDA). We also explore the effectiveness of a new metric, code smell intensity. Compared with the baseline, our approach improved average accuracy across nine datasets by up to 5.12%, 4.15%, and 1% for the NB, SVM, and LDA classifiers, respectively.
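The ranking step can be sketched as follows: compute a Pearson chi-square statistic between each feature and the bug label, then keep the k highest-scoring features. This sketch assumes binary (0/1) features and labels so that a 2×2 contingency table suffices; the paper's actual metrics and preprocessing are not given in the abstract, and the function and feature names are hypothetical.

```python
def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for a binary feature against a binary label."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def top_k_features(columns, label, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_2x2(values, label) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With k set to ten or five, the returned names would define the reduced feature sets on which the classifiers are trained.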


Author(s):  
B. Venkatesh ◽  
J. Anuradha

In microarray data, high classification accuracy is difficult to achieve because of high dimensionality and irrelevant and noisy data. Such data also contain many gene expression features but few samples. To increase the classification accuracy and processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, a filter phase and a wrapper phase. In the filter phase, an ensemble technique aggregates the feature ranks of the Relief, minimum Redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses Fuzzy Gaussian membership function ordering to aggregate the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used to select the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods using five benchmark datasets. Performance metrics such as Accuracy, Recall, Precision, and F1-Score are used for evaluation. The experimental results show that the proposed method outperforms the other feature selection methods.
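The filter phase combines several rankings into one. The abstract does not detail the Fuzzy Gaussian membership ordering, so the sketch below substitutes a plain mean-rank aggregation purely to illustrate the ensemble idea; it is not the paper's aggregation rule, and the gene names are hypothetical.

```python
def aggregate_ranks(rankings):
    """Combine several best-first rankings of the same features by mean position.

    Stand-in for the paper's fuzzy Gaussian ordering, which is not specified
    in the abstract. Ties are broken alphabetically for determinism.
    """
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical best-first rankings from three filter methods.
relief_rank = ["g1", "g2", "g3"]
mrmr_rank = ["g2", "g1", "g3"]
fc_rank = ["g2", "g3", "g1"]
combined = aggregate_ranks([relief_rank, mrmr_rank, fc_rank])
```

The top of the combined ranking would then seed the wrapper phase, where the search algorithm and classifier refine the subset.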


2012 ◽  
Vol 532-533 ◽  
pp. 1191-1195 ◽  
Author(s):  
Zhen Yan Liu ◽  
Wei Ping Wang ◽  
Yong Wang

This paper introduces the design of a text categorization system based on Support Vector Machine (SVM). It analyzes the high-dimensional nature of text data and the reasons why SVM is suitable for text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. The core of the system is classifier training, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different kinds of feature selection and feature extraction methods. No research has established which method is best, so many feature selection and feature extraction methods are implemented in this system. For a specific classification task, every feature selection method and every feature extraction method is tested, and the best-performing set of methods is then adopted.


Author(s):  
Gang Liu ◽  
Chunlei Yang ◽  
Sen Liu ◽  
Chunbao Xiao ◽  
Bin Song

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant features and improve classification accuracy. First, the local correlation between features and the overall correlation are calculated using mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated by analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined, and the degree of influence of the input variables on the output variables of the SVM network is calculated based on MIV. The importance weights of the features, described by MIV, are sorted in descending order. Finally, the SVM classifier is used to perform feature selection according to the classification accuracy of feature combinations, taking the MIV order of the features as a reference. Simulation experiments on three standard UCI data sets show that this method not only effectively reduces the feature dimension and achieves high classification accuracy but also ensures good robustness.
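One common way to compute a mean impact value is to perturb each input up and down by a small fraction and average the resulting change in the model output over all samples. The abstract does not state the paper's exact definition, so the sketch below assumes that formulation, a perturbation fraction of 10%, and a generic `model` callable standing in for the trained SVM; all of these are assumptions.

```python
def mean_impact_values(model, samples, delta=0.1):
    """Mean output change when each input is scaled by (1 + delta) vs (1 - delta).

    `model` is any callable mapping a feature list to a number; `delta` is an
    assumed perturbation fraction, not a value taken from the paper.
    """
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for x in samples:
            up = list(x)
            down = list(x)
            up[j] = x[j] * (1 + delta)
            down[j] = x[j] * (1 - delta)
            total += model(up) - model(down)
        mivs.append(total / len(samples))
    return mivs


def rank_by_miv(mivs):
    """Feature indices ordered by descending absolute MIV."""
    return sorted(range(len(mivs)), key=lambda j: -abs(mivs[j]))
```

The resulting order can serve as the reference sequence in which features are added to candidate subsets and evaluated by classification accuracy.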


2021 ◽  
Vol 335 ◽  
pp. 04001
Author(s):  
Didar Dadebayev ◽  
Goh Wei Wei ◽  
Tan Ee Xion

Emotion recognition, as a branch of affective computing, has attracted great attention in recent decades because it can enable more natural brain-computer interface systems. Electroencephalography (EEG) has proven to be an effective modality for emotion recognition, allowing user affective states to be tracked and recorded, especially for primitive emotional dimensions such as arousal and valence. Although brain signals have been shown to correlate with emotional states, the effectiveness of proposed models remains somewhat limited. The challenge is to improve accuracy, and appropriate extraction of valuable features may be a key to success. This study proposes a framework that combines fractal dimension features with a recursive feature elimination approach to enhance the accuracy of EEG-based emotion recognition. Fractal dimension and spectrum-based features are to be extracted and used for more accurate emotional state recognition. Recursive Feature Elimination will be used as the feature selection method, and the classification of emotions will be performed by the Support Vector Machine (SVM) algorithm. The proposed framework will be tested on a widely used public database, and the results are expected to demonstrate higher accuracy and robustness than other studies. The main contribution of this study is the improvement of EEG-based emotion classification accuracy. The generality of the results may be limited, since different EEG datasets might yield different results for the same framework. Experimenting with different EEG datasets and testing alternative feature selection schemes are therefore promising directions for future work.


Author(s):  
ShuRui Li ◽  
Jing Jin ◽  
Ian Daly ◽  
Chang Liu ◽  
Andrzej Cichocki

Brain–computer interface (BCI) systems decode electroencephalogram (EEG) signals to establish a channel for direct interaction between the human brain and the external world without the need for muscle or nerve control. The P300 speller, one of the most widely used BCI applications, presents a selection of characters to the user and performs character recognition by identifying P300 event-related potentials from the EEG. Such P300-based BCI systems can reach good levels of accuracy but are difficult to use in day-to-day life because of redundant and noisy signals, so there is room for improvement. We propose a novel hybrid feature selection method for the P300-based BCI system to address the problem of feature redundancy, which combines the Menger curvature and linear discriminant analysis. First, the selected strategies are applied separately to a given dataset to estimate the gain associated with each feature. Then, each generated value set is ranked in descending order and judged by a predefined criterion for suitability in classification models. The intersection of the two approaches is then evaluated to identify an optimal feature subset. The proposed method is evaluated using three public datasets: BCI Competition III dataset II, the BNCI Horizon dataset, and the EPFL dataset. Experimental results indicate that, compared with other typical feature selection and classification methods, the proposed method has better or comparable performance. Additionally, the proposed method achieves the best classification accuracy after all epochs in the three datasets. In summary, the proposed method provides a new way to enhance the performance of the P300-based BCI speller.
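The rank-then-intersect step can be sketched as follows. Two per-feature score dictionaries stand in for the gains estimated by the two strategies, and the "predefined criterion" is assumed here to be a simple top-`keep` cut-off; the abstract does not specify the criterion, and the function names and feature names are hypothetical.

```python
def top_by_score(scores, keep):
    """Names of the `keep` highest-scoring features (descending order)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


def hybrid_subset(scores_a, scores_b, keep):
    """Features that survive the cut-off under both scoring strategies."""
    return sorted(top_by_score(scores_a, keep) & top_by_score(scores_b, keep))


# Hypothetical per-feature gains from the two strategies.
curvature_scores = {"ch1": 0.9, "ch2": 0.7, "ch3": 0.2, "ch4": 0.1}
lda_scores = {"ch1": 0.8, "ch2": 0.1, "ch3": 0.6, "ch4": 0.3}
selected = hybrid_subset(curvature_scores, lda_scores, keep=2)
```

Taking the intersection keeps only features that both strategies rate highly, which is one way to reduce redundancy before classification.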

