A New Feature Selection Scheme for Emotion Recognition from Text

This paper presents a new scheme for term selection in the field of emotion recognition from text. The proposed framework is based on utilizing moderately frequent terms during term selection. More specifically, all terms are evaluated by considering their relevance scores, based on the idea that moderately frequent terms may also carry valuable information for discrimination. The proposed feature selection scheme performs better than the conventional filter-based feature selection measures Chi-Square and Gini-Text in numerous cases. The bag-of-words approach is used to construct the document vectors: each selected term is assigned the weight 1 if it occurs in the document and the weight 0 otherwise. The proposed scheme also includes terms that are not selected by Chi-Square and Gini-Text. Experiments conducted on a benchmark dataset show that moderately frequent terms boost the representation power of the term subsets, with noticeable improvements in accuracy.
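As a rough illustration of the two ingredients named in the abstract, the sketch below builds binary bag-of-words vectors and computes the standard Chi-Square statistic for a term and a class from a 2x2 contingency table. It is not the paper's proposed scheme, whose relevance score is not given here; the function names and the toy documents are hypothetical.

```python
def binary_bow(doc_tokens, vocabulary):
    """Binary bag-of-words: weight 1 if the term occurs in the document, else 0."""
    present = set(doc_tokens)
    return [1 if term in present else 0 for term in vocabulary]


def chi_square(docs, labels, term, target):
    """Chi-Square statistic of the 2x2 term/class contingency table."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, document in target class
        elif has_term:
            b += 1          # term present, other class
        elif in_class:
            c += 1          # term absent, target class
        else:
            d += 1          # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Toy corpus (made up for illustration only).
docs = [["happy", "day"], ["happy", "smile"], ["sad", "day"], ["sad", "rain"]]
labels = ["joy", "joy", "sadness", "sadness"]

print(binary_bow(docs[0], ["day", "happy", "rain"]))
print(chi_square(docs, labels, "happy", "joy"))
print(chi_square(docs, labels, "day", "joy"))
```

In this toy corpus "happy" separates the classes perfectly while "day" occurs equally often in both, so a filter that ranks by Chi-Square alone would keep the former and discard the latter.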


A New Feature Selection Scheme Using a Data Distribution Factor for Unsupervised Nominal Data

IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) ◽ 10.1109/tsmcb.2007.914707 ◽ 2008 ◽ Vol 38 (2) ◽ pp. 499-509 ◽ Cited By ~ 19

Author(s): T.W.S. Chow ◽ Piyang Wang ◽ E.W.M. Ma

Keyword(s): Feature Selection ◽ Data Distribution ◽ Distribution Factor ◽ Nominal Data ◽ Selection Scheme ◽ New Feature


Five New Feature Selection Metrics in Text Categorization

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001407005831 ◽ 2007 ◽ Vol 21 (06) ◽ pp. 1085-1101 ◽ Cited By ~ 5

Author(s): Fengxi Song ◽ David Zhang ◽ Yong Xu ◽ Jizhong Wang

Keyword(s): Feature Selection ◽ Text Categorization ◽ Chi Square ◽ Document Frequency ◽ Feature Spaces ◽ Text Feature ◽ Statistical Pattern ◽ Individual Features ◽ New Feature ◽ Data Collections

Feature selection has been extensively applied in statistical pattern recognition as a mechanism for cleaning up the set of features that are used to represent data and as a way of improving the performance of classifiers. Four schemes commonly used for feature selection are Exponential Searches, Stochastic Searches, Sequential Searches, and Best Individual Features. The most popular scheme used in text categorization is Best Individual Features, as the extremely high dimensionality of text feature spaces renders the other three feature selection schemes time prohibitive. This paper proposes five new metrics for selecting Best Individual Features for use in text categorization. Their effectiveness has been empirically tested on two well-known data collections, Reuters-21578 and 20 Newsgroups. Experimental results show that the performance of two of the five new metrics, Bayesian Rule and F-one Value, is not significantly below that of a good traditional text categorization selection metric, Document Frequency. The performance of another two of these five new metrics, Low Loss Dimensionality Reduction and Relative Frequency Difference, is equal to or better than that of conventional good feature selection metrics such as Mutual Information and Chi-square Statistic.
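The Best Individual Features scheme scores every feature on its own and keeps the top k, which is why it scales to very large vocabularies. A minimal sketch follows, using Document Frequency as the scoring metric; the five new metrics are not defined in this abstract and are not reproduced. The function names and the toy documents are hypothetical.

```python
from collections import Counter


def document_frequency(docs):
    """Number of documents in which each term occurs at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so repeats within a document count once
    return df


def best_individual_features(scores, k):
    """Rank features independently by score and keep the k best.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy corpus (made up for illustration only).
docs = [["oil", "price", "rise"], ["oil", "export"], ["price", "oil", "oil"]]
df = document_frequency(docs)
print(best_individual_features(df, 2))
```

Any other per-feature metric can be swapped in by passing a different `scores` mapping, since the ranking step never looks at feature interactions.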


English Text Classification Using Improved Recursive Feature Elimination (IRFE) Algorithm

Journal of engineering sciences and information technology - مجلة العلوم الهندسية و تكنولوجيا المعلومات ◽ 10.26389/ajsrp.r080420 ◽ 2020 ◽ Vol 4 (2)

Author(s): Esraa H. Abd Al-Ameer ◽ Ahmed H. Aliwy

Keyword(s): Feature Selection ◽ Language Processing ◽ Text Classification ◽ Feature Selection Method ◽ Selection Method ◽ English Text ◽ Recursive Feature Elimination ◽ Chi Square ◽ Data Set ◽ New Feature

Document classification is one of the most important fields in natural language processing and text mining, and many algorithms can be used for this task. This paper focuses on improving text classification through feature selection, that is, selecting a subset of the original features without reducing the accuracy of the classifier. A new feature selection method is proposed that can be regarded as a general formulation and mathematical model of Recursive Feature Elimination (RFE). The method was compared with two other well-known feature selection methods, Chi-square and threshold, and the results showed that it is comparable with them. The best results were 83% when 60% of the features were used, 82% when 40% of the features were used, and 82% when 20% of the features were used. The tests were done with the Naïve Bayes (NB) and decision tree (DT) classification algorithms on the well-known English dataset “20 newsgroups text”, which consists of approximately 18,846 files. Overall, the suggested feature selection method is comparable with standard methods such as Chi-square.
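For readers unfamiliar with the baseline, plain Recursive Feature Elimination repeatedly fits a model on the remaining features and drops the one with the smallest weight. The sketch below shows that loop only; it is not the IRFE formulation, which the abstract does not spell out. The names are hypothetical, and `centroid_weights` is a deliberately simple stand-in scorer whose weights do not depend on which other features remain, so a realistic run would plug in a model that is actually refitted each round.

```python
def centroid_weights(X, y, features):
    """Stand-in scorer: |mean of class 1 - mean of class 0| for each feature."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    weights = {}
    for j in features:
        mean_pos = sum(row[j] for row in pos) / len(pos)
        mean_neg = sum(row[j] for row in neg) / len(neg)
        weights[j] = abs(mean_pos - mean_neg)
    return weights


def rfe(X, y, n_keep, weight_fn=centroid_weights):
    """Plain RFE: drop the lowest-weight feature until n_keep remain."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        weights = weight_fn(X, y, remaining)      # re-score on the current subset
        worst = min(remaining, key=lambda j: weights[j])
        remaining.remove(worst)
        eliminated.append(worst)
    return remaining, eliminated


# Toy data (made up): only feature 0 differs between the two classes.
X = [[1, 0, 5], [1, 1, 4], [0, 0, 5], [0, 1, 4]]
y = [1, 1, 0, 0]
kept, dropped = rfe(X, y, n_keep=1)
print(kept, dropped)
```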


New feature selection based on kernel

Bulletin of Electrical Engineering and Informatics ◽ 10.11591/eei.v9i4.1959 ◽ 2020 ◽ Vol 9 (4) ◽ pp. 1569-1577

Author(s): Zuherman Rustam ◽ Sri Hartini

Keyword(s): Feature Selection ◽ Kernel Function ◽ Polynomial Kernel ◽ Data Repository ◽ Fisher Score ◽ Chi Square ◽ Running Time ◽ Kernel Parameter ◽ Real World Datasets ◽ New Feature

Feature selection is an essential issue in machine learning. It discards the unnecessary or redundant features in the dataset. This paper introduces a new feature selection method based on kernel functions, evaluated on 16 real-world datasets from the UCI data repository, with k-means clustering used as the classifier and with radial basis function (RBF) and polynomial kernel functions. After sorting the features using the new feature selection, 75 percent of them were examined and evaluated using 10-fold cross-validation, and the accuracy, F1-score, and running time were compared. From the experiments, it was concluded that the performance of the new feature selection based on the RBF kernel function varied according to the value of the kernel parameter, unlike that based on the polynomial kernel function. Moreover, the new feature selection based on RBF has a faster running time than the one based on the polynomial kernel function. In addition, the proposed method achieves higher accuracy and F1-score, by up to 40 percent on several datasets, than commonly used feature selection techniques such as Fisher score, Chi-Square test, and Laplacian score. Therefore, this method can be considered for feature selection.
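The abstract does not give the scoring formula, so the sketch below shows only the two kernel functions it names, in their common textbook parameterizations. The parameter names `gamma`, `degree` and `coef0` are assumptions, not taken from the paper.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, z> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree


x, z = [1.0, 2.0], [3.0, 4.0]
print(polynomial_kernel(x, z))
# The RBF value for the same pair depends strongly on the kernel parameter.
for gamma in (0.01, 0.1, 1.0):
    print(gamma, rbf_kernel(x, z, gamma=gamma))
```

The loop at the end gives a feel for why an RBF-based score can be sensitive to the kernel parameter: the same pair of points goes from nearly similar to nearly dissimilar as `gamma` grows.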


New feature selection frameworks in emotion recognition to evaluate the informative power of speech related features

10.1109/isspa.2007.4555415 ◽ 2007

Author(s): H. Altun ◽ J. Shawe-Taylor ◽ G. Polat

Keyword(s): Feature Selection ◽ Emotion Recognition ◽ New Feature


Analysis of Feature Selection and Ensemble Classifier Methods for Intrusion Detection

International Journal of Natural Computing Research ◽ 10.4018/ijncr.2018010104 ◽ 2018 ◽ Vol 7 (1) ◽ pp. 57-72

Author(s): H.P. Vinutha ◽ Poornima Basavaraju

Keyword(s): Feature Selection ◽ Intrusion Detection ◽ Detection Rate ◽ Information Gain ◽ False Positive Rate ◽ Ensemble Classifier ◽ Ensemble Classification ◽ Chi Square ◽ Traffic Pattern ◽ Data Mining Algorithms

Network security is becoming a more challenging task day by day. Intrusion detection systems (IDSs) are one of the methods used to monitor network activity, and data mining algorithms play a major role in the field of IDS. The NSL-KDD'99 dataset is used to study network traffic patterns, which helps to identify possible attacks taking place on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L and U2R. The proposed methodology aims to reduce the false positive rate and improve the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes for detection is not good practice. Four feature selection methods, Chi-Square, SU, Gain Ratio and Information Gain, are used to evaluate the attributes, and unimportant features are removed to reduce the dimension of the data. Ensemble classification techniques such as Boosting, Bagging, Stacking and Voting are used to observe the detection rate separately with three base algorithms: Decision Stump, J48 and Random Forest.
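Of the four attribute evaluators mentioned, Information Gain is the simplest to show: it is the drop in class entropy after splitting the records on one nominal attribute. The sketch below assumes nominal attribute values and uses made-up toy records rather than actual NSL-KDD rows; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the weighted entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Toy records (made up for illustration only).
labels = ["normal", "normal", "DoS", "DoS"]
informative = ["tcp", "tcp", "udp", "udp"]      # splits the classes perfectly
uninformative = ["a", "b", "a", "b"]            # tells nothing about the class

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

Attributes whose gain is close to zero are the natural candidates for removal when reducing the 41 attributes to a smaller set.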


A lazy feature selection method for multi-label classification

Intelligent Data Analysis ◽ 10.3233/ida-194878 ◽ 2021 ◽ Vol 25 (1) ◽ pp. 21-34

Author(s): Rafael B. Pereira ◽ Alexandre Plastino ◽ Bianca Zadrozny ◽ Luiz H.C. Merschmann

Keyword(s): Feature Selection ◽ Text Categorization ◽ Feature Selection Method ◽ Selection Method ◽ Video Classification ◽ Classification Problems ◽ Class Label ◽ New Feature ◽ Feature Selection Techniques ◽ Biomolecular Analysis

In many important application domains, such as text categorization, biomolecular analysis, scene or video classification and medical diagnosis, instances are naturally associated with more than one class label, giving rise to multi-label classification problems. This has led, in recent years, to a substantial amount of research in multi-label classification. More specifically, feature selection methods have been developed to allow the identification of relevant and informative features for multi-label classification. This work presents a new feature selection method based on the lazy feature selection paradigm and specific for the multi-label context. Experimental results show that the proposed technique is competitive when compared to multi-label feature selection techniques currently used in the literature, and is clearly more scalable, in a scenario where there is an increasing amount of data.


Research on the Emotion Recognition based on ReliefF Matching Feature Selection Method

2020 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE) ◽ 10.1109/icmcce51767.2020.00333 ◽ 2020

Author(s): Zhang xiao-dan ◽ Li Tao ◽ She yi-chong ◽ Zhao Rui

Keyword(s): Feature Selection ◽ Emotion Recognition ◽ Feature Selection Method ◽ Selection Method


Acting Surprised: Comparing Perceptions of Different Dynamic Deliberate Expressions

Journal of Nonverbal Behavior ◽ 10.1007/s10919-020-00349-9 ◽ 2020

Author(s): Mircea Zloteanu ◽ Eva G. Krumhuber ◽ Daniel C. Richardson

Keyword(s): Emotion Recognition ◽ Facial Expressions ◽ Affective State ◽ External Condition ◽ Internal Condition ◽ Minimal Effort ◽ Better Than

People are accurate at classifying emotions from facial expressions but much poorer at determining if such expressions are spontaneously felt or deliberately posed. We explored if the method used by senders to produce an expression influences the decoder’s ability to discriminate authenticity, drawing inspiration from two well-known acting techniques: the Stanislavski (internal) and Mimic method (external). We compared spontaneous surprise expressions in response to a jack-in-the-box (genuine condition), to posed displays of senders who either focused on their past affective state (internal condition) or the outward expression (external condition). Although decoders performed better than chance at discriminating the authenticity of all expressions, their accuracy was lower in classifying external surprise compared to internal surprise. Decoders also found it harder to discriminate external surprise from spontaneous surprise and were less confident in their decisions, perceiving these to be similarly intense but less genuine-looking. The findings suggest that senders are capable of voluntarily producing genuine-looking expressions of emotions with minimal effort, especially by mimicking a genuine expression. Implications for research on emotion recognition are discussed.


An Adaptive Unsupervised Feature Selection Algorithm Based on MDS for Tumor Gene Data Classification

Sensors ◽ 10.3390/s21113627 ◽ 2021 ◽ Vol 21 (11) ◽ pp. 3627

Author(s): Bo Jin ◽ Chunling Fu ◽ Yong Jin ◽ Wei Yang ◽ Shengbin Li ◽ ...

Keyword(s): Feature Selection ◽ Local Structure ◽ Gene Selection ◽ Dimensional Space ◽ Original Data ◽ Global Structure ◽ Biological Data ◽ Special Treatment ◽ Selection Scheme ◽ Unsupervised Feature Selection

Identifying the key genes related to tumors from gene expression data with a large number of features is important for the accurate classification of tumors and for making special treatment decisions. In recent years, unsupervised feature selection algorithms have attracted considerable attention in the field of gene selection as they can find the most discriminating subsets of genes, namely the potential information in biological data. Recent research also shows that maintaining the important structure of data is necessary for gene selection. However, most current feature selection methods merely capture the local structure of the original data while ignoring the importance of the global structure of the original data. We believe that the global structure and local structure of the original data are equally important, and so the selected genes should maintain the essential structure of the original data as far as possible. In this paper, we propose a new, adaptive, unsupervised feature selection scheme which not only reconstructs high-dimensional data into a low-dimensional space with the constraint of feature distance invariance but also employs the ℓ2,1-norm so that a matrix capable of performing gene selection is embedded into the local manifold structure-learning framework. Moreover, an effective algorithm is developed to solve the optimization problem based on the proposed scheme. Comparative experiments with some classical schemes on real tumor datasets demonstrate the effectiveness of the proposed method.
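The ℓ2,1-norm mentioned above is the sum of the Euclidean norms of a matrix's rows; penalizing it tends to push whole rows to zero, which is what makes it useful for selection. The sketch below computes the norm and ranks rows by their ℓ2 norm. It assumes the common convention that each row of the matrix corresponds to one feature (gene), and it does not reproduce the paper's optimization algorithm; the names and the toy matrix are hypothetical.

```python
import math


def row_norms(W):
    """Euclidean (l2) norm of each row of W."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm: sum over rows of the row's l2 norm."""
    return sum(row_norms(W))


def rank_features(W):
    """Indices of rows (features) sorted by decreasing l2 norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: -norms[i])


# Toy matrix (made up): one row per feature, one column per embedding dimension.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
print(l21_norm(W))
print(rank_features(W))
```

A row that is entirely zero, like the second one here, contributes nothing to the norm and ranks last, so the corresponding feature would be the first to be discarded.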
