Attention gated tensor neural network architectures for speech emotion recognition

2022 ◽ Vol 71 ◽ pp. 103173
Author(s): Sandeep Kumar Pandey ◽ Hanumant Singh Shekhawat ◽ S.R.M Prasanna
2020
Author(s): Ronnypetson Da Silva ◽ Valter M. Filho ◽ Mario Souza

Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use a single dataset, or train and evaluate their models separately on each dataset when several are used. These datasets are constructed under dataset-specific guidelines, and the subjective nature of SER labels makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations across different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from the others under different combinations of datasets and popular neural network architectures. We show that the longstanding belief that more data results in more general models doesn’t always hold for SER, as a different combination of datasets and meta-parameters yields the best result for each of the analysed datasets.

