Time-Frequency Representation Learning with Graph Convolutional Network for Dialogue-Level Speech Emotion Recognition

Time-Frequency Deep Representation Learning for Speech Emotion Recognition Integrating Self-attention

Communications in Computer and Information Science - Neural Information Processing ◽ 10.1007/978-3-030-36808-1_74 ◽ 2019 ◽ pp. 681-689

Author(s): Jiaxing Liu ◽ Zhilei Liu ◽ Longbiao Wang ◽ Lili Guo ◽ Jianwu Dang

Keyword(s): Emotion Recognition ◽ Representation Learning ◽ Speech Emotion Recognition ◽ Time Frequency


Adieu recurrence? End-to-end speech emotion recognition using a context stacking dilated convolutional network

2020 28th European Signal Processing Conference (EUSIPCO) ◽ 10.23919/eusipco47968.2020.9287667 ◽ 2021

Author(s): Duowei Tang ◽ Peter Kuppens ◽ Luc Geurts ◽ Toon van Waterschoot

Keyword(s): Emotion Recognition ◽ Speech Emotion Recognition ◽ Convolutional Network ◽ End To End


Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9414006 ◽ 2021

Author(s): Lili Guo ◽ Longbiao Wang ◽ Chenglin Xu ◽ Jianwu Dang ◽ Eng Siong Chng ◽ ...

Keyword(s): Emotion Recognition ◽ Representation Learning ◽ Speech Emotion Recognition


Speech emotion recognition based on data enhancement in time-frequency domain

International Symposium on Artificial Intelligence and Robotics 2020 ◽ 10.1117/12.2579205 ◽ 2020

Author(s): Qianqian Li ◽ Fuji Ren ◽ Xiaoyan Shen ◽ Xin Kang

Keyword(s): Emotion Recognition ◽ Frequency Domain ◽ Speech Emotion Recognition ◽ Time Frequency


An Attention Pooling Based Representation Learning Method for Speech Emotion Recognition

10.21437/interspeech.2018-1242 ◽ 2018 ◽ Cited By ~ 25

Author(s): Pengcheng Li ◽ Yan Song ◽ Ian McLoughlin ◽ Wu Guo ◽ Lirong Dai

Keyword(s): Emotion Recognition ◽ Representation Learning ◽ Speech Emotion Recognition ◽ Learning Method


EEG-Based Emotion Recognition Using Quadratic Time-Frequency Distribution

Sensors ◽ 10.3390/s18082739 ◽ 2018 ◽ Vol 18 (8) ◽ pp. 2739 ◽ Cited By ~ 22

Author(s): Rami Alazrai ◽ Rasha Homoud ◽ Hisham Alwanni ◽ Mohammad Daoud

Keyword(s): Emotion Recognition ◽ Frequency Domain ◽ Frequency Distribution ◽ Support Vector ◽ EEG Signals ◽ Time Frequency ◽ Labeling Schemes ◽ Frequency Representation ◽ Frequency Features ◽ Time Frequency Distribution

Accurate recognition and understanding of human emotions is an essential skill that can improve the collaboration between humans and machines. In this vein, electroencephalogram (EEG)-based emotion recognition is considered an active research field with challenging issues regarding the analysis of nonstationary EEG signals and the extraction of salient features that can be used to achieve accurate emotion recognition. In this paper, an EEG-based emotion recognition approach with a novel time-frequency feature extraction technique is presented. In particular, a quadratic time-frequency distribution (QTFD) is employed to construct a high-resolution time-frequency representation of the EEG signals and capture the spectral variations of the EEG signals over time. To reduce the dimensionality of the constructed QTFD-based representation, a set of 13 time- and frequency-domain features is extended to the joint time-frequency domain and employed to quantify the QTFD-based time-frequency representation of the EEG signals. Moreover, to describe different emotion classes, we have utilized the 2D arousal-valence plane to develop four emotion labeling schemes of the EEG signals, such that each emotion labeling scheme defines a set of emotion classes. The extracted time-frequency features are used to construct a set of subject-specific support vector machine classifiers to classify the EEG signals of each subject into the different emotion classes that are defined using each of the four emotion labeling schemes. The performance of the proposed approach is evaluated using a publicly available EEG dataset, namely the DEAP dataset. Moreover, we design three performance evaluation analyses, namely the channel-based analysis, feature-based analysis and neutral class exclusion analysis. These quantify how the capability of the proposed approach to discriminate between different emotion classes is affected by utilizing different groups of EEG channels that cover various regions in the brain, by reducing the dimensionality of the extracted time-frequency features, and by excluding the EEG signals that correspond to the neutral class. The results reported in the current study demonstrate the efficacy of the proposed QTFD-based approach in recognizing different emotion classes. In particular, the average classification accuracies obtained in differentiating between the various emotion classes defined using each of the four emotion labeling schemes are within the range of 73.8%–86.2%. Moreover, the emotion classification accuracies achieved by our proposed approach are higher than the results reported in several existing state-of-the-art EEG-based emotion recognition studies.
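The abstract above mentions labeling schemes defined on the 2D arousal-valence plane and a neutral class that can be excluded, but it does not spell out the four schemes. The following is only a minimal sketch of one plausible quadrant scheme, not the authors' definition. The function name `quadrant_label`, the class names, the midpoint of 5.0 (assuming ratings on a 1 to 9 scale, as in DEAP) and the `neutral_band` parameter are all illustrative assumptions.

```python
from typing import Optional


def quadrant_label(
    valence: float,
    arousal: float,
    midpoint: float = 5.0,
    neutral_band: Optional[float] = None,
) -> str:
    """Map one (valence, arousal) rating pair to a class label.

    Hypothetical scheme for illustration; not one of the paper's four schemes.
    """
    # Optionally carve out a neutral class around the centre of the plane.
    if neutral_band is not None:
        near_v = abs(valence - midpoint) <= neutral_band
        near_a = abs(arousal - midpoint) <= neutral_band
        if near_v and near_a:
            return "neutral"
    # Otherwise assign one of the four quadrants (high/low valence, high/low arousal).
    v = "HV" if valence >= midpoint else "LV"
    a = "HA" if arousal >= midpoint else "LA"
    return v + a


# Made-up example ratings as (valence, arousal) pairs, assumed to lie on a 1-9 scale.
ratings = [(7.5, 8.0), (2.0, 6.5), (3.0, 2.5), (5.2, 4.9)]
labels = [quadrant_label(v, a, neutral_band=0.5) for v, a in ratings]
print(labels)
```

Dropping trials labeled "neutral" before training would correspond loosely to a neutral class exclusion analysis; the band width used here is arbitrary.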


Adaptive Domain-Aware Representation Learning for Speech Emotion Recognition

10.21437/interspeech.2020-2572 ◽ 2020

Author(s): Weiquan Fan ◽ Xiangmin Xu ◽ Xiaofen Xing ◽ Dongyan Huang

Keyword(s): Emotion Recognition ◽ Representation Learning ◽ Speech Emotion Recognition


Temporal Attention Convolutional Network for Speech Emotion Recognition with Latent Representation

10.21437/interspeech.2020-1520 ◽ 2020

Author(s): Jiaxing Liu ◽ Zhilei Liu ◽ Longbiao Wang ◽ Yuan Gao ◽ Lili Guo ◽ ...

Keyword(s): Emotion Recognition ◽ Speech Emotion Recognition ◽ Temporal Attention ◽ Convolutional Network


Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning

10.1109/mlsp52302.2021.9596430 ◽ 2021

Author(s): Xubo Liu ◽ Turab Iqbal ◽ Jinzheng Zhao ◽ Qiushi Huang ◽ Mark D. Plumbley ◽ ...

Keyword(s): Discrete Time ◽ Representation Learning ◽ Sound Generation ◽ Time Frequency ◽ Frequency Representation


Survey of Deep Representation Learning for Speech Emotion Recognition

10.36227/techrxiv.16689484 ◽ 2021

Author(s): Siddique Latif ◽ Rajib Rana ◽ Sara Khalifa ◽ Raja Jurdak ◽ Junaid Qadir ◽ ...

Keyword(s): Emotion Recognition ◽ General Setting ◽ Representation Learning ◽ Data Driven ◽ Speech Emotion Recognition ◽ Feature Engineering ◽ Acoustic Features ◽ Learning Techniques ◽ Comprehensive Survey ◽ Hierarchical Representations

Traditionally, speech emotion recognition (SER) research has relied on manually handcrafted acoustic features using feature engineering. However, the design of handcrafted features for complex SER tasks requires significant manual effort, which impedes generalisability and slows the pace of innovation. This has motivated the adoption of representation learning techniques that can automatically learn an intermediate representation of the input signal without any manual feature engineering. Representation learning has led to improved SER performance and enabled rapid innovation. Its effectiveness has further increased with advances in deep learning (DL), which has facilitated deep representation learning where hierarchical representations are automatically learned in a data-driven manner. This paper presents the first comprehensive survey on the important topic of deep representation learning for SER. We highlight various techniques and related challenges, and identify important future areas of research. Our survey bridges the gap in the literature since existing surveys either focus on SER with hand-engineered features or representation learning in the general setting without focusing on SER.
