Study on emotional speech features in Korean with its application to voice color conversion

Author(s):  
Sang-Jin Kim ◽  
Kwang-Ki Kim ◽  
Minsoo Hahn

Author(s):  
Hasrul Mohd Nazid ◽  
Hariharan Muthusamy ◽  
Vikneswaran Vijean ◽  
Sazali Yaacob

In recent years, researchers have focused on improving the accuracy of speech emotion recognition. Generally, high recognition accuracies have been obtained for two-class emotion recognition, but multi-class emotion recognition is still a challenging task. The main aim of this work is to propose a two-stage feature reduction using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to improve the accuracy of the speech emotion recognition (ER) system. Short-term speech features were extracted from the emotional speech signals. Experiments were carried out using four different supervised classifiers with two different emotional speech databases. From the experimental results, it can be inferred that the proposed method provides better accuracies: 87.48% for the speaker-dependent (SD) and gender-dependent (GD) ER experiment, 85.15% for the speaker-independent (SI) ER experiment, and 87.09% for the gender-independent (GI) experiment.
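As a rough illustration of the two-stage reduction described above, the sketch below chains PCA and LDA ahead of a supervised classifier using scikit-learn. The random feature matrix, the SVM back end, and the 95% variance threshold are illustrative assumptions; the paper's own short-term speech features and its four classifiers are not reproduced here.

```python
# Minimal sketch of a two-stage PCA + LDA feature reduction followed by a
# supervised classifier. Feature extraction is out of scope: X stands in for
# short-term speech features (one row per utterance), y for emotion labels.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))    # placeholder feature matrix (200 utterances, 40 features)
y = rng.integers(0, 4, size=200)  # placeholder 4-class emotion labels

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),         # stage 1: keep 95% of variance (illustrative choice)
    ("lda", LinearDiscriminantAnalysis()),   # stage 2: project onto at most n_classes - 1 axes
    ("clf", SVC(kernel="rbf")),              # one possible supervised classifier
])

scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

With real data, the PCA stage mainly decorrelates and compresses the raw features, while the LDA stage seeks the directions that best separate the emotion classes, which is the usual motivation for applying them in this order.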


2021 ◽  
Vol 11 (24) ◽  
pp. 11748
Author(s):  
Jiří Přibil ◽  
Anna Přibilová ◽  
Ivan Frollo

This paper deals with two modalities for stress detection and evaluation: the vowel phonation speech signal and the photo-plethysmography (PPG) signal. The main measurement is carried out in four phases representing different stress conditions for the tested person. The first and last phases are realized in laboratory conditions. During the middle two phases, the PPG and phonation signals are recorded inside a magnetic resonance imaging scanner working with a weak magnetic field up to 0.2 T, in a silent state and/or with a running scan sequence. From the recorded phonation signal, different speech features are determined for statistical analysis and evaluation by a Gaussian mixture model (GMM) classifier. A database of affective sounds and two databases of emotional speech were used for GMM creation and training. The second part of the developed method gives a comparison of results obtained from the statistical description of the sensed PPG wave together with the determined heart rate and Oliva–Roztocil index values. The fusion of results obtained from both modalities gives the final stress level. The performed experiments confirm the working assumption that a fusion of both types of analysis is usable for this task: the final stress level values give better results than the speech or PPG signals alone.
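The sketch below illustrates, under loose assumptions, the GMM scoring and late-fusion idea described above: one GaussianMixture per condition scores the phonation features by log-likelihood, and the resulting speech score is combined with a PPG-derived score. The data shapes, the log-likelihood ratio, and the weighted-average fusion rule are placeholders, not the authors' exact procedure.

```python
# Hedged sketch of GMM-based stress scoring on speech features plus a simple
# late fusion with a PPG-derived score. All values are synthetic placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
feats_neutral = rng.normal(0.0, 1.0, size=(500, 12))  # placeholder neutral-phase speech features
feats_stress  = rng.normal(0.8, 1.2, size=(500, 12))  # placeholder stress-phase speech features

# One GMM per condition, trained on its own phase's features.
gmm_neutral = GaussianMixture(n_components=8, covariance_type="diag").fit(feats_neutral)
gmm_stress  = GaussianMixture(n_components=8, covariance_type="diag").fit(feats_stress)

def speech_stress_score(frames: np.ndarray) -> float:
    """Average log-likelihood ratio of the stress model over the neutral model."""
    return gmm_stress.score(frames) - gmm_neutral.score(frames)

def fuse(speech_score: float, ppg_score: float, w: float = 0.5) -> float:
    """Illustrative late fusion: a weighted average of the two modality scores."""
    return w * speech_score + (1.0 - w) * ppg_score

test_frames = rng.normal(0.7, 1.1, size=(50, 12))  # placeholder test-phase frames
ppg_score = 0.6  # placeholder score from heart rate / Oliva-Roztocil analysis
print(f"fused stress level: {fuse(speech_stress_score(test_frames), ppg_score):.3f}")
```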


2009 ◽  
Vol 129 (4) ◽  
pp. 686-695 ◽  
Author(s):  
Sho Yokota ◽  
Akinori Sasaki ◽  
Hiroshi Hashimoto ◽  
Yasuhiro Ohyama

2017 ◽  
Vol 5 (3) ◽  
pp. 20
Author(s):  
JEBISHA J ◽  
MONISHA V ◽  
JEMI B. FEMILA ◽  
...  

Author(s):  
A. S. Grigorev ◽  
V. A. Gorodnyi ◽  
O. V. Frolova ◽  
A. M. Kondratenko ◽  
V. D. Dolgaya ◽  
...  

Author(s):  
Brian Stasak ◽  
Julien Epps ◽  
Nicholas Cummins ◽  
Roland Goecke
