Deep Learning Method for Selecting Effective Models and Feature Groups in Emotion Recognition Using an Asian Multimodal Database

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 1988
Author(s):  
Jun-Ho Maeng ◽  
Dong-Hyun Kang ◽  
Deok-Hwan Kim

Emotional awareness is vital for advanced interactions between humans and computer systems. This paper introduces a new multimodal dataset called MERTI-Apps based on Asian physiological signals and proposes a genetic algorithm (GA)-long short-term memory (LSTM) deep learning model to derive the active feature groups for emotion recognition. During dataset creation, an annotation labeling program was developed so that observers could tag the arousal and valence of subjects' emotions. In the learning phase, a GA selected effective LSTM model parameters and determined the active feature group from 37 features and 25 brain lateralization (BL) features extracted from the electroencephalogram (EEG) time, frequency, and time-frequency domains. The proposed model achieved a root-mean-square error (RMSE) of 0.0156 for valence regression on the MAHNOB-HCI dataset. On the in-house MERTI-Apps dataset, which uses Asian-population-specific 12-channel EEG data and adds the BL features, it achieved RMSEs of 0.0579 and 0.0287 for valence and arousal regression, and accuracies of 65.7% and 88.3% for valence and arousal. Owing to the effective model selection of the GA, the results also reached 91.3% and 94.8% accuracy in the valence and arousal domains on the DEAP dataset.
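The GA-driven selection of an active feature group can be pictured as evolving a binary mask over the candidate features. Below is a minimal sketch, assuming a 62-bit chromosome (37 EEG plus 25 BL features, per the abstract) and a placeholder fitness function standing in for training the GA-configured LSTM and scoring its validation RMSE; the population size, mutation rate, and selection scheme are illustrative choices, not the paper's.

```python
# Minimal GA sketch for feature-group selection over a binary chromosome.
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES = 62          # 37 EEG features + 25 BL features (from the abstract)
POP, GENS, P_MUT = 20, 30, 0.05

def fitness(mask):
    # Placeholder: in the paper this would train an LSTM on the masked
    # feature set and return the negative validation RMSE.
    return -rng.random() - 0.01 * mask.sum() / N_FEATURES

pop = rng.integers(0, 2, size=(POP, N_FEATURES))
for _ in range(GENS):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-POP // 2:]]        # truncation selection
    cut = rng.integers(1, N_FEATURES)                    # one-point crossover
    kids = np.concatenate([np.concatenate([a[:cut], b[cut:]])[None]
                           for a, b in zip(parents, parents[::-1])])
    kids ^= (rng.random(kids.shape) < P_MUT).astype(kids.dtype)  # bit-flip mutation
    pop = np.concatenate([parents, kids])

best = pop[np.argmax([fitness(m) for m in pop])]
print("active feature group:", np.flatnonzero(best))
```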

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Jianzhuo Yan ◽  
Shangbin Chen ◽  
Sinuo Deng

As an advanced function of the human brain, emotion has a significant influence on human study, work, and other aspects of life. Artificial intelligence has come to play an important role in recognizing human emotions correctly. EEG-based emotion recognition (ER), one application of brain-computer interfaces (BCI), has become more popular in recent years. However, owing to the ambiguity of human emotions and the complexity of EEG signals, an EEG-ER system that recognizes emotions with high accuracy is not easy to achieve. Taking the time scale as its starting point, this paper chooses the recurrent neural network as the basis of the screening model. Drawing on the rhythmic and temporal-memory characteristics of EEG, this research proposes a Rhythmic Time EEG Emotion Recognition Model (RT-ERM) that uses a Long Short-Term Memory (LSTM) network to predict valence and arousal. Under this model, classification results differ across rhythms and time scales, so the optimal rhythm and time scale of RT-ERM are obtained by comparing classification accuracy across different rhythms and different time scales. Emotional EEG is then classified using the best time scale corresponding to each rhythm. Finally, comparison with other existing emotional EEG classification methods shows that the model's choice of rhythm and time scale contributes to the accuracy of RT-ERM.
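One way to read the RT-ERM search is as a grid of preprocessing choices: band-pass the EEG into one rhythm, slice it at one time scale, and score an LSTM on the result. The sketch below shows just that preprocessing step; the band edges, 128 Hz sampling rate, and window length are assumptions for illustration, not the paper's settings.

```python
# Sketch: filter EEG into one rhythm, then window it at one time scale.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128                                  # assumed sampling rate (Hz)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def rhythm_windows(eeg, band, win_sec):
    lo, hi = BANDS[band]
    b, a = butter(4, [lo / (FS / 2), hi / (FS / 2)], btype="band")
    filtered = filtfilt(b, a, eeg, axis=-1)
    step = int(win_sec * FS)              # one candidate "time scale"
    n = filtered.shape[-1] // step
    return filtered[..., :n * step].reshape(*filtered.shape[:-1], n, step)

eeg = np.random.randn(32, FS * 60)        # 32 channels, 60 s of toy data
windows = rhythm_windows(eeg, "alpha", win_sec=2)  # try each (rhythm, scale) pair
print(windows.shape)                      # (32, 30, 256) -> LSTM input per scale
```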


Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 403
Author(s):  
Yajurv Bhatia ◽  
ASM Hossain Bari ◽  
Gee-Sern Jison Hsu ◽  
Marina Gavrilova

Motion capture sensor-based gait emotion recognition is an emerging sub-domain of human emotion recognition. Its applications span a variety of fields, including smart home design, border security, robotics, virtual reality, and gaming. In recent years, several deep learning-based approaches have been successful in solving the Gait Emotion Recognition (GER) problem. However, a vast majority of such methods rely on Deep Neural Networks (DNNs) with a significant number of model parameters, which leads to model overfitting as well as increased inference time. This paper contributes to the domain by proposing a new lightweight bi-modular architecture with handcrafted features that is trained using an RMSprop optimizer and stratified data shuffling. The method is highly effective at correctly inferring human emotions from gait, achieving a micro-mean average precision of 0.97 on the Edinburgh Locomotive Mocap Dataset. It outperforms all recent deep learning methods while having the lowest inference time, 16.3 milliseconds per gait sample. This research is beneficial to applications spanning various fields, such as emotionally aware assistive robotics, adaptive therapy and rehabilitation, and surveillance.
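As a rough illustration of what a lightweight bi-modular network over handcrafted features might look like, here is a hedged Keras sketch: two small dense branches over hypothetical posture and movement descriptors, merged for a four-class emotion output and compiled with the RMSprop optimizer the abstract names. The branch contents, feature dimensions, and layer sizes are all assumptions, not the paper's architecture.

```python
# Hedged sketch of a lightweight bi-modular classifier (illustrative sizes).
import tensorflow as tf
from tensorflow.keras import layers, Model

posture_in = layers.Input(shape=(40,), name="posture_features")
motion_in = layers.Input(shape=(60,), name="movement_features")
p = layers.Dense(32, activation="relu")(posture_in)    # branch 1
m = layers.Dense(32, activation="relu")(motion_in)     # branch 2
x = layers.Concatenate()([p, m])                       # bi-modular fusion
x = layers.Dropout(0.3)(x)
out = layers.Dense(4, activation="softmax")(x)         # e.g. happy/sad/angry/neutral

model = Model([posture_in, motion_in], out)
model.compile(optimizer=tf.keras.optimizers.RMSprop(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

Stratified data shuffling, as named in the abstract, could then be supplied by something like scikit-learn's StratifiedShuffleSplit when building the train/test folds.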


2020 ◽  
Vol 14 ◽  
Author(s):  
Yaqing Zhang ◽  
Jinling Chen ◽  
Jen Hong Tan ◽  
Yuxuan Chen ◽  
Yunyi Chen ◽  
...  

Emotion is the human brain's reaction to objective things. In real life, human emotions are complex and changeable, so research into emotion recognition has great significance for real-life applications. Recently, many deep learning and machine learning methods have been widely applied to emotion recognition based on EEG signals. However, traditional machine learning methods have a major disadvantage: the feature extraction process is usually cumbersome and relies heavily on human experts. End-to-end deep learning methods then emerged as an effective way to address this disadvantage with the help of raw signal features and time-frequency spectra. Here, we investigated the application of several deep learning models to EEG-based emotion recognition, including deep neural networks (DNN), convolutional neural networks (CNN), long short-term memory (LSTM), and a hybrid of CNN and LSTM (CNN-LSTM). The experiments were carried out on the well-known DEAP dataset. Experimental results show that the CNN and CNN-LSTM models achieved high classification performance in EEG-based emotion recognition, with accuracies on raw data of 90.12% and 94.17%, respectively. The DNN model was not as accurate as the other models, but it trained quickly. The LSTM model was not as stable as the CNN and CNN-LSTM models; moreover, with the same number of parameters, the LSTM trained much more slowly and had difficulty converging. Additional comparison experiments on parameters such as epochs, learning rate, and dropout probability were also conducted. The results show that the DNN model converged to its optimum with fewer epochs and a higher learning rate, whereas the CNN model needed more epochs to learn. As for dropout, reducing the retained parameters by roughly 50% each time was appropriate.
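For reference, a minimal CNN-LSTM hybrid of the kind compared in this study can be written in a few lines of Keras: 1-D convolutions extract local patterns from raw EEG windows, and an LSTM summarizes their temporal order. The input shape (512 samples by 32 channels) and layer sizes are assumptions, not the paper's configuration.

```python
# Minimal CNN-LSTM hybrid sketch for EEG windows (illustrative shapes).
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(512, 32)),            # (time steps, EEG channels)
    layers.Conv1D(64, 5, activation="relu"),  # local pattern extraction
    layers.MaxPooling1D(2),
    layers.Conv1D(64, 5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.LSTM(64),                          # temporal summary of CNN features
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),    # e.g. high/low valence
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```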


2021 ◽  
Vol 12 ◽  
Author(s):  
Xiang Chen ◽  
Rubing Huang ◽  
Xin Li ◽  
Lei Xiao ◽  
Ming Zhou ◽  
...  

Emotional design is an important development trend in interaction design: emotional design in products plays a key role in enhancing user experience and inducing emotional resonance. In recent years, strengthening the emotional design of products based on the user's emotional experience has become a new direction for many designers seeking to improve their design thinking. In emotional interaction design, the machine needs to capture the user's key information in real time, recognize the user's emotional state, and use a variety of cues to determine the appropriate user model. Against this background, this research uses a deep learning mechanism for more accurate and effective emotion recognition, thereby optimizing the design of the interactive system and improving the user experience. First, this research discusses how user characteristics such as speech, facial expression, video, and heartbeat can help machines recognize human emotions more accurately; after analyzing these characteristics, speech is selected as the experimental material. Second, a speech-based emotion recognition method is proposed. The Mel-frequency cepstral coefficients (MFCC) of the speech signal are used as the input of an improved long short-term memory network (ILSTM). To preserve the integrity of the information and the accuracy of the output at the next time step, ILSTM adds peephole connections to the forget and input gates of the LSTM and feeds the cell state into the gate layer as additional input. The emotional features produced by ILSTM are passed to an attention layer, where a self-attention mechanism computes a weight for each frame of the speech signal; frames with higher weights are used to distinguish different emotions and complete emotion recognition of the speech signal. Experiments on the EMO-DB and CASIA datasets verify the effectiveness of the model for emotion recognition. Finally, the feasibility of emotional interaction system design is discussed.
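The described front end, MFCC frames feeding a recurrent model with self-attention over frames, can be sketched as follows. This is a toy dot-product self-attention over librosa MFCCs on synthetic audio, not the paper's ILSTM; the 16 kHz rate and 40 coefficients are assumptions.

```python
# Sketch: MFCC frames plus a toy self-attention weighting over frames.
import numpy as np
import librosa

sr = 16000
y = np.random.randn(sr * 2).astype(np.float32)       # stand-in for an EMO-DB clip
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).T  # (frames, 40)

# Toy dot-product self-attention: weight each frame by its similarity to all
# frames, so salient frames dominate the pooled utterance vector.
scores = mfcc @ mfcc.T / np.sqrt(mfcc.shape[1])
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
pooled = (weights @ mfcc).mean(axis=0)               # utterance-level feature
print(pooled.shape)                                  # (40,)
```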


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Peng Lu ◽  
Yabin Zhang ◽  
Bing Zhou ◽  
Hongpo Zhang ◽  
Liwei Chen ◽  
...  

In recent years, deep learning (DNN)-based methods have made breakthrough advances in detecting cardiac arrhythmias as the cost-effectiveness of computing power and the size of available data have passed a tipping point. However, the inability of these methods to provide a basis for their modeling decisions limits clinicians' confidence in them. In this paper, a fusion of a Gated Recurrent Unit (GRU) and a decision tree, referred to as T-GRU, was designed to explore arrhythmia recognition and improve the credibility of deep learning methods. The fusion model processes time- and frequency-domain features along multiple paths: a decision tree performs probability analysis on the frequency-domain features, while regularization of the GRU model parameters and weight control improve the output weights of the decision tree. The MIT-BIH arrhythmia database was used for validation. Results showed that low-frequency band features dominated the model's predictions. The fusion model achieved an accuracy of 98.31%, sensitivity of 96.85%, specificity of 98.81%, and precision of 96.73%, indicating high reliability and clinical significance.
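The probability-level fusion of a decision tree over frequency-domain features with a recurrent model over the beat signal can be sketched as a weighted blend of two class-probability vectors. In the sketch below the GRU's softmax output is faked with random probabilities and the fusion weight alpha is a fixed assumption; the paper instead controls the weighting through regularization and weight control.

```python
# Hedged sketch of probability-level tree/GRU fusion on toy data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
freq_feats = rng.random((200, 12))                # e.g. band powers per beat
labels = rng.integers(0, 2, 200)                  # normal vs. arrhythmic (toy)

tree = DecisionTreeClassifier(max_depth=4).fit(freq_feats, labels)
p_tree = tree.predict_proba(freq_feats)           # frequency-domain opinion

p_gru = rng.dirichlet((1, 1), size=200)           # stand-in for GRU softmax
alpha = 0.6                                       # fusion weight (assumed)
p_fused = alpha * p_gru + (1 - alpha) * p_tree
pred = p_fused.argmax(axis=1)
print("fused accuracy on toy data:", (pred == labels).mean())
```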


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6719
Author(s):  
Longbin Jin ◽  
Eun Yi Kim

Electroencephalogram (EEG)-based emotion recognition is receiving significant attention in research on brain-computer interfaces (BCI) and health care. To recognize cross-subject emotion from EEG data accurately, a technique capable of finding an effective representation robust to the subject-specific variability associated with EEG data collection is necessary. In this paper, a new method to predict cross-subject emotion using time-series analysis and spatial correlation is proposed. To represent the spatial connectivity between brain regions, a channel-wise feature is proposed that effectively handles the correlation between all channels. The channel-wise feature is defined by a symmetric matrix whose elements are the Pearson correlation coefficients between pairs of channels, which complementarily handle subject-specific variability. The channel-wise features are then fed to a two-layer stacked long short-term memory (LSTM) network, which extracts temporal features and learns an emotional model. Extensive experiments on two publicly available datasets, the Dataset for Emotion Analysis using Physiological Signals (DEAP) and the SJTU (Shanghai Jiao Tong University) Emotion EEG Dataset (SEED), demonstrate the effectiveness of combining channel-wise features and LSTM. The experiments achieve state-of-the-art classification rates of 98.93% and 99.10% for the two-class classification of valence and arousal in DEAP, respectively, and an accuracy of 99.63% for three-class classification in SEED.
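The channel-wise feature itself is straightforward to compute: one Pearson correlation matrix over all channel pairs per time window, stacked into a sequence for the LSTM. A minimal sketch, assuming a 1 s window and DEAP's 128 Hz preprocessed sampling rate:

```python
# Sketch: per-window Pearson correlation matrices as channel-wise features.
import numpy as np

def channelwise_features(eeg, fs=128, win_sec=1):
    # eeg: (channels, samples); returns (windows, channels, channels)
    step = int(fs * win_sec)
    n = eeg.shape[1] // step
    mats = [np.corrcoef(eeg[:, i * step:(i + 1) * step]) for i in range(n)]
    return np.stack(mats)

eeg = np.random.randn(32, 128 * 60)      # 32-channel, 60 s DEAP-like toy clip
feats = channelwise_features(eeg)
print(feats.shape)                        # (60, 32, 32): flatten each matrix
                                          # per time step for the 2-layer LSTM
```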


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yu Wang

To implement a mature music composition model for Chinese users, this paper analyzes music composition and the emotion recognition of composed content through big data technology and neural network (NN) algorithms. First, after a brief analysis of current music composition styles, a new Music Composition Neural Network (MCNN) structure is proposed, which adjusts the probability distribution of a Long Short-Term Memory (LSTM) generation network by constructing a reasonable reward function; the rules of music theory are used to constrain the style of the generated music and realize the intelligent generation of music in a specific style. Afterward, the generated music signal is analyzed in the time-frequency, frequency, nonlinear, and time domains. Finally, emotion feature recognition and extraction are performed on the composed content. Experiments show that as the number of training iterations increases, the number of weight-parameter adjustments and the learning ability increase, greatly improving the accuracy of the model for music composition, while the loss function decreases slowly. Moreover, the music generated by the proposed model covers four emotional qualities: sadness, joy, loneliness, and relaxation. The research results can promote the intellectualization of music composition and influence traditional modes of music composition.
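The reward-shaped generation the abstract describes can be pictured as re-weighting the LSTM's next-note distribution by a rule-based reward before sampling, so that notes violating a music-theory constraint become unlikely. The sketch below uses an invented in-scale/out-of-scale rule and a random stand-in for the LSTM softmax; the paper's actual reward function and note vocabulary are not specified here.

```python
# Toy sketch of reward-shaped sampling from a generation network's softmax.
import numpy as np

rng = np.random.default_rng(2)
N_NOTES = 12
p_lstm = rng.dirichlet(np.ones(N_NOTES))          # stand-in for LSTM softmax

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}                  # illustrative style rule

def reward(note):
    return 1.0 if note in C_MAJOR else 0.1        # penalize out-of-scale notes

shaped = p_lstm * np.array([reward(n) for n in range(N_NOTES)])
shaped /= shaped.sum()                            # renormalize the distribution
next_note = rng.choice(N_NOTES, p=shaped)
print("sampled pitch class:", next_note)
```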


2022 ◽  
Vol 12 ◽  
Author(s):  
Xiaofeng Lu

This study examines the emotion recognition of speech and the graphic visualization of learners' expressions in an intelligent Internet-based learning environment. After comparing the performance of several deep learning neural network algorithms, an improved Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) algorithm is proposed, and a simulation experiment is conducted to verify its performance. The experimental results indicate that the accuracy of the CNN-BiLSTM algorithm reported here reaches 98.75%, at least 3.15% higher than that of the other algorithms; its recall is at least 7.13% higher than that of the other algorithms, and its recognition rate is not less than 90%. Evidently, the improved CNN-BiLSTM algorithm achieves good recognition results and provides a significant experimental reference for research on learners' emotion recognition and the graphic visualization of expressions in intelligent learning environments.
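Architecturally, the named CNN-BiLSTM differs from a plain CNN-LSTM mainly in wrapping the recurrent layer bidirectionally so each frame sees both past and future context. A minimal Keras sketch, with illustrative input shape, layer sizes, and class count rather than the paper's configuration:

```python
# Minimal CNN-BiLSTM sketch (illustrative shapes and sizes).
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 40)),                  # (frames, features)
    layers.Conv1D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling1D(2),
    layers.Bidirectional(layers.LSTM(64)),          # context in both directions
    layers.Dense(7, activation="softmax"),          # e.g. 7 emotion classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```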


2019 ◽  
Vol 3 (2) ◽  
pp. 41 ◽  
Author(s):  
Sirwan Tofiq Jaafar ◽  
Mokhtar Mohammadi

An epileptic seizure is a sign of abnormal activity in the human brain. The electroencephalogram (EEG) is a standard tool that has been used widely for the detection of seizure activity. Many methods have been developed to help neurophysiologists detect seizure activity with high accuracy; most rely on features extracted in the time, frequency, or time-frequency domains, so their performance depends on the quality of the features extracted from EEG recordings. Deep neural networks enable learning directly from the data, without the domain knowledge needed to construct a feature set, an approach that has been hugely successful across machine learning applications. We propose a new framework that likewise learns directly from the data without extracting a feature set: an original deep learning method for classifying EEG recordings. The EEG signal is segmented into 4 s segments, which are used to train a long short-term memory (LSTM) network; the trained model then discriminates EEG seizures from background activity. The Freiburg EEG dataset is used to assess the performance of the classifier, with 5-fold cross-validation selected for evaluation. An accuracy of about 97.75% is achieved.
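The data pipeline the abstract outlines, 4 s segmentation followed by 5-fold cross-validation, is easy to make concrete. The sketch below uses toy labels and an assumed 256 Hz sampling rate; the LSTM training itself is left as a comment.

```python
# Sketch: 4 s segmentation of a recording plus a 5-fold evaluation split.
import numpy as np
from sklearn.model_selection import KFold

FS, SEG_SEC = 256, 4
eeg = np.random.randn(6, FS * 3600)               # 6 channels, 1 h toy record
step = FS * SEG_SEC
n = eeg.shape[1] // step
segments = eeg[:, :n * step].reshape(6, n, step).transpose(1, 0, 2)  # (n, ch, t)
labels = np.random.randint(0, 2, n)               # seizure vs. background (toy)

for fold, (tr, te) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=0).split(segments)):
    # here: train the LSTM on segments[tr], evaluate on segments[te]
    print(f"fold {fold}: {len(tr)} train / {len(te)} test,"
          f" seizure fraction {labels[te].mean():.2f}")
```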

