Automatic emotion recognition of speech signal in Mandarin

Author(s):  
Sheng Zhang ◽  
P. C. Ching ◽  
Fanrang Kong
2013 ◽  
Vol 25 (12) ◽  
pp. 3294-3317 ◽  
Author(s):  
Lijiang Chen ◽  
Xia Mao ◽  
Pengfei Wei ◽  
Angelo Compare

This study proposes two classes of speech emotion features extracted from electroglottography (EGG) and the speech signal. The power-law distribution coefficients (PLDC) of voiced-segment duration, pitch-rise duration, and pitch-fall duration are obtained to reflect information about vocal-fold excitation. The real discrete cosine transform coefficients of the normalized spectra of the EGG and speech signals are calculated to reflect information about vocal-tract modulation. Two experiments are carried out: one compares the proposed features with traditional features using sequential forward floating search and sequential backward floating search; the other performs comparative emotion recognition with a support vector machine. The results show that the proposed features outperform commonly used features in speaker-independent, content-independent speech emotion recognition.
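The real-DCT spectral feature described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy spectrum, the coefficient count, and the helper names are assumptions made for the example.

```python
import math

def normalize(spectrum):
    # Scale the magnitude spectrum so its components sum to 1,
    # so coefficients are comparable across frames and speakers.
    total = sum(spectrum)
    return [s / total for s in spectrum]

def dct_coefficients(x, n_coeffs):
    # Real DCT-II of the normalized spectrum; the first few
    # coefficients summarize the overall spectral envelope shape.
    N = len(x)
    coeffs = []
    for k in range(n_coeffs):
        c = sum(x[n] * math.cos(math.pi * k * (n + 0.5) / N)
                for n in range(N))
        coeffs.append(c)
    return coeffs

# Toy magnitude spectrum standing in for one EGG or speech frame.
spectrum = [4.0, 3.0, 2.0, 1.0, 0.5, 0.25]
features = dct_coefficients(normalize(spectrum), 4)
```

Because the spectrum is normalized to unit sum, the zeroth coefficient is always 1, so only the higher-order coefficients carry frame-specific shape information.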


2021 ◽  
pp. 1397-1405
Author(s):  
A. V. Mohan Kumar ◽  
H. V. Chaitra ◽  
S. Shalini ◽  
D. Shruthi

Author(s):  
Liqin Fu ◽  
Haiguang Zhai ◽  
Yongmei Zhang ◽  
Dan Yu

Author(s):  
Esther Ramdinmawii ◽  
Abhijit Mohanta ◽  
Vinay Kumar Mittal

2011 ◽  
Vol 121-126 ◽  
pp. 815-819 ◽  
Author(s):  
Yu Qiang Qin ◽  
Xue Ying Zhang

Ensemble empirical mode decomposition (EEMD) is a newly developed method aimed at eliminating the mode mixing present in the original empirical mode decomposition (EMD). To evaluate the performance of this new method, this paper investigates the effect of two parameters pertinent to EEMD: the emotional envelope and the number of emotional ensemble trials. The proposed technique is applied to four kinds of emotional speech signals (angry, happy, sad, and neutral), and the number of ensemble trials is computed for each emotion. An emotional envelope is obtained by transforming the IMFs of the emotional speech signals, yielding a new emotion recognition method based on the different emotional envelopes and ensemble trials.
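The noise-assisted averaging at the heart of EEMD can be sketched as below. This is only a structural sketch: the per-trial decomposition is a deliberately simple moving-average split standing in for true EMD sifting, and the signal, trial count, and noise level are illustrative assumptions.

```python
import math
import random

def toy_decompose(x, window=5):
    # Placeholder for EMD sifting: split the signal into a "fast"
    # component (residual after smoothing) and a "slow" component
    # (moving average). Real EEMD extracts IMFs by iterative sifting.
    half = window // 2
    slow = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        slow.append(sum(x[lo:hi]) / (hi - lo))
    fast = [a - b for a, b in zip(x, slow)]
    return [fast, slow]

def eemd(signal, n_trials=50, noise_std=0.1, seed=0):
    # Core EEMD idea: add a fresh white-noise realization each trial,
    # decompose the noisy copy, then average each mode across trials
    # so the added noise cancels while mode mixing is alleviated.
    rng = random.Random(seed)
    n_modes = 2  # fixed by the toy decomposition above
    sums = [[0.0] * len(signal) for _ in range(n_modes)]
    for _ in range(n_trials):
        noisy = [s + rng.gauss(0.0, noise_std) for s in signal]
        for m, mode in enumerate(toy_decompose(noisy)):
            for i, v in enumerate(mode):
                sums[m][i] += v
    return [[v / n_trials for v in row] for row in sums]

# Toy "speech" signal: one sinusoidal component.
signal = [math.sin(2 * math.pi * i / 16) for i in range(64)]
modes = eemd(signal)
```

Averaging over many noisy trials shrinks the residual noise in each mode roughly by a factor of the square root of the trial count, which is why the number of ensemble trials is a key EEMD parameter.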


2004 ◽  
Vol 14 (2) ◽  
pp. 150-155 ◽  
Author(s):  
Hyoun-Joo Go ◽  
Dae-Jong Lee ◽  
Jang-Hwan Park ◽  
Myung-Geun Chun

2021 ◽  
Vol 12 ◽  
Author(s):  
Xiang Chen ◽  
Rubing Huang ◽  
Xin Li ◽  
Lei Xiao ◽  
Ming Zhou ◽  
...  

Emotional design is an important trend in interaction design. Emotional design in products plays a key role in enhancing user experience and inducing emotional resonance in users. In recent years, based on the user's emotional experience, strengthening product emotional design has become a new direction for most designers to improve their design thinking. In emotional interaction design, the machine needs to capture the user's key information in real time, recognize the user's emotional state, and use a variety of cues to determine the appropriate user model. Against this background, this research uses a deep learning mechanism for more accurate and effective emotion recognition, thereby optimizing the design of the interactive system and improving the user experience. First, this research discusses how user characteristics such as speech, facial expression, video, and heartbeat can help machines recognize human emotions more accurately. After analyzing these characteristics, speech is selected as the experimental material. Second, a speech-based emotion recognition method is proposed. The mel-frequency cepstral coefficients (MFCC) of the speech signal are used as the input of an improved long short-term memory network (ILSTM). To ensure the integrity of the information and the accuracy of the output at the next moment, ILSTM adds peephole connections to the forget gate and input gate of the LSTM and feeds the cell state into the gating layer as additional input. The emotional features obtained by ILSTM are passed to an attention layer, where a self-attention mechanism computes a weight for each frame of the speech signal. The speech features with higher weights are used to distinguish different emotions and complete emotion recognition of the speech signal. Experiments on the EMO-DB and CASIA datasets verify the effectiveness of the model for emotion recognition. Finally, the feasibility of emotional interaction system design is discussed.
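The frame-weighting step of the attention layer can be sketched as follows. This is a hedged illustration only: the frame vectors and the query weights are made-up stand-ins for learned ILSTM features and attention parameters, not values from the paper.

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, query):
    # Score each frame feature vector against a query vector, softmax
    # the scores into per-frame weights, and return the weighted sum
    # as the utterance-level emotional feature.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return weights, pooled

# Toy 4-frame sequence of 3-dim "ILSTM" features (made-up numbers).
frames = [[0.1, 0.2, 0.0],
          [0.9, 0.1, 0.3],
          [0.4, 0.4, 0.2],
          [0.0, 0.1, 0.1]]
query = [1.0, 0.5, 0.5]  # stands in for learned attention parameters
weights, utterance_feature = attention_pool(frames, query)
```

Frames whose features score highest against the query receive the largest weights, which matches the abstract's point that higher-weighted speech frames dominate the final emotion decision.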

