AdCOFE: Advanced Contextual Feature Extraction in conversations for emotion classification

2021 ◽  
Vol 7 ◽  
pp. e786
Author(s):  
Vaibhav Bhat ◽  
Anita Yadav ◽  
Sonal Yadav ◽  
Dhivya Chandrasekaran ◽  
Vijay Mago

Emotion recognition in conversations is an important step in various virtual chatbots which require opinion-based feedback, like in social media threads, online support, and many more applications. Current emotion recognition in conversations models face issues like: (a) loss of contextual information in between two dialogues of a conversation, (b) failure to give appropriate importance to significant tokens in each utterance, (c) inability to pass on the emotional information from previous utterances. The proposed model of Advanced Contextual Feature Extraction (AdCOFE) addresses these issues by performing unique feature extraction using knowledge graphs, sentiment lexicons and phrases of natural language at all levels (word and position embedding) of the utterances. Experiments on emotion recognition in conversations datasets show that AdCOFE is beneficial in capturing emotions in conversations.

Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks usually face challenges such as single-feature extraction, lack of global information, and lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model’s ability to capture weak features. At the same time, a bidirectional LSTM is used as the decoder to enhance the use of context information. An improved fuzzy attention mechanism improves the correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores on objective quantitative evaluation indicators, including BLEU, METEOR, ROUGE-L, and CIDEr. The generated description sentences can accurately express the image content.


2019 ◽  
Author(s):  
Jennifer Sorinas ◽  
Juan C. Fernandez-Troyano ◽  
Mikel Val-Calvo ◽  
Jose Manuel Ferrández ◽  
Eduardo Fernandez

The large range of potential applications of affective BCI (aBCI), not only for patients but also for healthy people, makes it increasingly necessary to find a commonly accepted protocol for real-time EEG-based emotion recognition. Using wavelet packet decomposition for spectral feature extraction, in keeping with the nature of the EEG signal, we have specified some of the main parameters needed to implement robust positive and negative emotion classification. A sliding window of 12 seconds proved the most appropriate size; from it, a set of 20 target frequency-location variables is proposed as the most relevant features carrying the emotional information. Lastly, QDA and KNN classifiers and a population rating criterion for stimulus labeling are suggested as the most suitable approaches for EEG-based emotion recognition. The proposed model reached a mean accuracy of 98% (s.d. 1.4) and 98.96% (s.d. 1.28) in a subject-dependent approach for the QDA and KNN classifiers, respectively. This new model represents a step towards real-time classification. Although the results were not conclusive, new insights regarding the subject-independent approach are discussed.
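The sliding-window segmentation mentioned in this abstract can be sketched as follows. This is only an illustration, not the authors' pipeline: the function name, the sampling rate and the one-second step are assumptions, since the abstract states only the 12-second window length.

```python
def sliding_windows(signal, fs, window_s=12.0, step_s=1.0):
    """Split a 1-D sample sequence into fixed-length, overlapping windows.

    fs is the sampling rate in Hz. window_s=12.0 follows the abstract;
    step_s is an assumed value (the abstract does not give the step).
    """
    size = int(round(window_s * fs))   # samples per window
    step = int(round(step_s * fs))     # samples between window starts
    if size <= 0 or step <= 0:
        raise ValueError("window and step must cover at least one sample")
    # Only full windows are kept; a trailing partial window is dropped.
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]


if __name__ == "__main__":
    fs = 10                         # hypothetical sampling rate
    signal = list(range(30 * fs))   # 30 seconds of dummy samples
    windows = sliding_windows(signal, fs)
    print(len(windows), len(windows[0]))
```

Each returned window would then be passed to the feature extraction step; a shorter step gives more frequent predictions at the cost of more overlap between windows.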


Computers ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 95
Author(s):  
Rania Alhalaseh ◽  
Suzan Alasasfeh

Many scientific studies have been concerned with building an automatic system to recognize emotions, and building such systems usually relies on brain signals. These studies have shown that brain signals can be used to classify many emotional states. This process is considered difficult, especially since the brain’s signals are not stable. Human emotions are generated as a result of reactions to different emotional states, which affect brain signals. Thus, the performance of emotion recognition systems based on brain signals depends on the efficiency of the feature extraction algorithms, the feature selection algorithm, and the classification process. Recently, the study of electroencephalography (EEG) signals has received much attention due to the availability of several standard databases, especially since brain signal recording devices, including wireless ones, have become available on the market at reasonable prices. This work aims to present an automated model for identifying emotions based on EEG signals. The proposed model focuses on creating an effective method that combines the basic stages of EEG signal handling and feature extraction. Unlike previous studies, the main contribution of this work lies in using empirical mode decomposition/intrinsic mode functions (EMD/IMF) and variational mode decomposition (VMD) for signal processing. Although EMD/IMF and VMD methods are widely used in biomedical and disease-related studies, they are not commonly used in emotion recognition. In other words, the signal processing methods used in this work differ from those used in the literature. After the signal processing stage, in the feature extraction stage, two well-known techniques were used: entropy and Higuchi’s fractal dimension (HFD). Finally, in the classification stage, four classification methods were used to classify emotional states: naïve Bayes, k-nearest neighbor (k-NN), convolutional neural network (CNN), and decision tree (DT). To evaluate the performance of the proposed model, experiments were run on a common database called DEAP using several evaluation measures, including accuracy, specificity, and sensitivity. The experiments showed the efficiency of the proposed method; an accuracy of 95.20% was achieved using the CNN-based method.
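Higuchi’s fractal dimension, one of the two features named in this abstract, can be sketched in plain Python as below. This is the textbook form of the estimator, not the authors' code: the function name and the default `kmax=8` are assumptions, and the abstract does not say which `kmax` or which signal components (for example, which IMFs) the feature was computed on.

```python
import math


def higuchi_fd(x, kmax=8):
    """Estimate Higuchi's fractal dimension of a 1-D sequence.

    kmax (largest delay) is an assumed default, not a value from the paper.
    """
    n = len(x)
    if kmax < 2 or n < 2 * kmax:
        raise ValueError("need kmax >= 2 and at least 2 * kmax samples")
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):                    # starting offset of the sub-series
            n_steps = (n - 1 - m) // k        # number of increments available
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, n_steps + 1))
            # Higuchi's normalisation for sub-series of unequal length.
            lengths.append(dist * (n - 1) / (n_steps * k) / k)
        mean_len = sum(lengths) / k
        if mean_len <= 0:
            raise ValueError("constant series: curve length is zero")
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(mean_len))
    # Least-squares slope of log L(k) against log(1/k) is the estimate.
    mx = sum(log_inv_k) / kmax
    my = sum(log_len) / kmax
    num = sum((a - mx) * (b - my) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mx) ** 2 for a in log_inv_k)
    return num / den
```

For a straight line the curve length scales as 1/k, so the estimate is 1; for uncorrelated noise it approaches 2. Values between these bounds are what make the measure usable as a signal-complexity feature.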


2021 ◽  
Vol 15 ◽  
Author(s):  
Jing Chen ◽  
Haifeng Li ◽  
Lin Ma ◽  
Hongjian Bo ◽  
Frank Soong ◽  
...  

Recently, emotion classification from electroencephalogram (EEG) data has attracted much attention. As EEG is an unsteady and rapidly changing voltage signal, the features extracted from EEG usually change dramatically, whereas emotion states change gradually. Most existing feature extraction approaches do not consider these differences between EEG and emotion. Microstate analysis can capture important spatio-temporal properties of EEG signals. At the same time, it can reduce the fast-changing EEG signals to a sequence of prototypical topographical maps. While microstate analysis has been widely used to study brain function, few studies have used this method to analyze how the brain responds to emotional auditory stimuli. In this study, the authors proposed a novel feature extraction method based on EEG microstates for emotion recognition. Determining the optimal number of microstates automatically is a challenge when applying microstate analysis to emotion. This research proposed dual-threshold-based atomize and agglomerate hierarchical clustering (DTAAHC) to determine the optimal number of microstate classes automatically. By using the proposed method to model the temporal dynamics of the auditory emotion process, we extracted microstate characteristics as novel temporospatial features to improve the performance of emotion recognition from EEG signals. We evaluated the proposed method on two datasets. For the public music-evoked EEG Dataset for Emotion Analysis using Physiological signals, the microstate analysis identified 10 microstates, which together explained around 86% of the data in global field power peaks. The accuracy of emotion recognition reached 75.8% in valence and 77.1% in arousal using microstate sequence characteristics as features. Compared to previous studies, the proposed method outperformed the current feature sets. For the speech-evoked EEG dataset, the microstate analysis identified nine microstates, which together explained around 85% of the data. The accuracy of emotion recognition reached 74.2% in valence and 72.3% in arousal using microstate sequence characteristics as features. The experimental results indicated that microstate characteristics can effectively improve the performance of emotion recognition from EEG signals.
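The global field power (GFP) peaks mentioned in this abstract are the time points at which microstate maps are usually sampled. A minimal sketch of that standard preprocessing step, assuming the common definition of GFP as the standard deviation across channels at each sample, is given below. The function names and the simple local-maximum rule are illustrative assumptions; the DTAAHC clustering itself is not reproduced here.

```python
import math


def global_field_power(eeg):
    """Return the GFP at every sample.

    eeg is a list of channels, each a list of samples of equal length.
    GFP(t) is taken as the standard deviation across channels at time t.
    """
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    gfp = []
    for t in range(n_samples):
        column = [channel[t] for channel in eeg]
        mean = sum(column) / n_channels
        gfp.append(math.sqrt(sum((v - mean) ** 2 for v in column) / n_channels))
    return gfp


def gfp_peaks(gfp):
    """Indices of local maxima of the GFP curve (assumed peak rule)."""
    return [t for t in range(1, len(gfp) - 1)
            if gfp[t - 1] < gfp[t] >= gfp[t + 1]]


if __name__ == "__main__":
    toy = [[0.0, 1.0, 0.0, 2.0, 0.0],
           [0.0, -1.0, 0.0, -2.0, 0.0]]   # two dummy channels
    curve = global_field_power(toy)
    print(curve, gfp_peaks(curve))
```

The topographies at the returned indices would be the input to the clustering stage, since the signal-to-noise ratio of the scalp map is highest at GFP peaks.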


Author(s):  
Huiyun Zhang ◽  
Heming Huang ◽  
Henry Han

Speech emotion recognition remains a demanding task in natural language processing. It places strict requirements on the effectiveness of feature extraction and of the acoustic model. With that in mind, a Heterogeneous Parallel Convolution Bi-LSTM model is proposed to address these challenges. It consists of two heterogeneous branches: the left one contains two dense layers and a Bi-LSTM layer, while the right one contains a dense layer, a convolution layer, and a Bi-LSTM layer. It can exploit the spatiotemporal information more effectively, and achieves 84.65%, 79.67%, and 56.50% unweighted average recall on the benchmark databases EMODB, CASIA, and SAVEE, respectively. Compared with previous research results, the proposed model achieves consistently better performance.


Automatic speech emotion recognition is essential for effective human-computer interaction. This paper uses spectrograms as inputs to a hybrid deep convolutional LSTM for speech emotion recognition. The proposed model uses four convolutional layers for high-level feature extraction from the input spectrograms, an LSTM layer for accumulating long-term dependencies, and finally two dense layers. Experimental results on the SAVEE database show promising performance: the model obtained an accuracy of 94.26%.


2021 ◽  
Vol 11 (21) ◽  
pp. 9897
Author(s):  
Huiyun Zhang ◽  
Heming Huang ◽  
Henry Han

Speech emotion recognition is a substantial component of natural language processing (NLP). It places strict requirements on the effectiveness of feature extraction and of the acoustic model. With that in mind, a Heterogeneous Parallel Convolution Bi-LSTM model is proposed to address these challenges. It consists of two heterogeneous branches: the left one contains two dense layers and a Bi-LSTM layer, while the right one contains a dense layer, a convolution layer, and a Bi-LSTM layer. It can exploit the spatiotemporal information more effectively, and achieves 84.65%, 79.67%, and 56.50% unweighted average recall on the benchmark databases EMODB, CASIA, and SAVEE, respectively. Compared with previous research results, the proposed model achieves consistently better performance.
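Unweighted average recall (UAR), the metric reported in this abstract, is the mean of the per-class recalls, so every emotion class counts equally regardless of how many samples it has. A minimal sketch follows; the function name and the toy labels are illustrative and not taken from the paper.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Imbalanced toy example: always predicting the majority class.
    y_true = ["neutral", "neutral", "neutral", "neutral", "angry"]
    y_pred = ["neutral"] * 5
    print(unweighted_average_recall(y_true, y_pred))
```

In the toy example plain accuracy is 0.8 while UAR is 0.5, which is why UAR is preferred on emotion corpora with unequal class sizes.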


2020 ◽  
pp. 1-15
Author(s):  
Wang Wei ◽  
Xinyi Cao ◽  
He Li ◽  
Lingjie Shen ◽  
Yaqin Feng ◽  
...  

To improve speech emotion recognition, a U-acoustic words emotion dictionary (AWED) features model is proposed based on an AWED. The method models emotional information at the acoustic word level in different emotion classes. The top-listed words in each emotion are selected to generate the AWED vector. Then, the U-AWED model is constructed by combining utterance-level acoustic features with the AWED features. A support vector machine and a convolutional neural network are employed as the classifiers in our experiment. The results show that the proposed method provides a significant improvement in unweighted average recall in all four emotion classification tasks.


2015 ◽  
Vol 781 ◽  
pp. 551-554 ◽  
Author(s):  
Chaidiaw Thiangtham ◽  
Jakkree Srinonchat

Speech emotion recognition has been widely researched and applied in areas such as communication with robots, e-learning systems and emergency calls. Speech emotion feature extraction is an important key to speech emotion recognition, which can be classified for personal identity. Speech emotion features are extracted as several kinds of coefficients, such as Linear Predictive Coefficients (LPCs), Linear Spectral Frequency (LSF), Zero-Crossing (ZC) and Mel-Frequency Cepstrum Coefficients (MFCC) [1-6]. Some research has been done on speech emotion recognition. A study of zero-crossing with peak amplitudes in speech emotion classification is introduced in [4]. The results show that it provides a technique for extracting emotion features in the time domain, which still has a problem with amplitude shifting. Emotion recognition from speech is described in [5]. It used the Gaussian Mixture Model (GMM) as the speech feature extractor. The GMM gives good results in reducing background noise; however, random noise in the GMM still needs attention in the recognition model. Speech emotion recognition using a hidden Markov model and a support vector machine is explained in [6]. The results show that the average performance of the recognition system, according to the speech emotion features, still contains errors. Thus the recognition performance reported in [1-6] still requires more focus on speech features.
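Of the features listed in this abstract, zero-crossing is the simplest to compute directly in the time domain. The sketch below shows the usual per-frame zero-crossing rate; it is a generic illustration and not the method of [4], and the function name and the convention of treating zero as non-negative are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in a frame whose signs differ.

    Samples equal to zero are treated as non-negative (assumed convention).
    """
    if len(frame) < 2:
        raise ValueError("a frame needs at least two samples")
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


if __name__ == "__main__":
    print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))   # alternating signs
    print(zero_crossing_rate([1.0, 2.0, 3.0]))          # no sign change
```

Because the count depends on where the waveform crosses zero, a constant offset in the recording changes the result, which is one way to read the amplitude-shifting problem noted above.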


Author(s):  
Turker Tuncer ◽  
Sengul Dogan ◽  
Abdulhamit Subasi

Electroencephalography (EEG) signals collected from human brains have generally been used to diagnose diseases. Moreover, EEG signals can be used in several areas such as emotion recognition and driving fatigue detection. This work presents a new emotion recognition model using EEG signals. The primary aim of this model is to present a highly accurate emotion recognition framework by using both hand-crafted feature generation and a deep classifier. The presented framework uses a multilevel fused feature generation network. This network has three primary phases: tunable Q-factor wavelet transform (TQWT), statistical feature generation, and nonlinear textural feature generation. TQWT is applied to the EEG data to decompose the signals into different sub-bands and create a multilevel feature generation network. In the nonlinear feature generation, an S-box of the LED block cipher is used to create a pattern, which is named Led-Pattern. Moreover, statistical feature extraction is performed using the widely used statistical moments. The proposed LED pattern and statistical feature extraction functions are applied to 18 TQWT sub-bands and the original EEG signal. Therefore, the proposed hand-crafted learning model is named LEDPatNet19. To select the most informative features, the ReliefF and iterative Chi2 (RFIChi2) feature selector is deployed. The proposed model has been developed on two EEG emotion datasets, GAMEEMO and DREAMER. Our proposed hand-crafted learning network achieved 94.58%, 92.86%, and 94.44% classification accuracies for the arousal, dominance, and valence cases of the DREAMER dataset. Furthermore, the best classification accuracy of the proposed model for the GAMEEMO dataset is 99.29%. These results clearly illustrate the success of the proposed LEDPatNet19.

