Emotion Analysis from Human Voice Using Various Prosodic Features and Text Analysis
Emotion Analysis is an active field of research that aims to recognize a person's emotions from their voice alone. It is more widely known as the Speech Emotion Recognition (SER) problem. This problem has been studied for more than a decade, with results coming from either Voice Analysis or Text Analysis. Individually, both methods have shown good accuracy so far, but using the two together has shown better results than either one considered on its own. When people of different age groups are talking, it is important to understand the emotions behind what they say, as this in turn helps us respond better. To this end, the paper implements a model that performs Emotion Analysis based on both Tone and Text Analysis. The prosodic features of the tone are analyzed, and the speech is then converted to text. Once the text has been extracted from the speech, Sentiment Analysis is performed on it to further improve the accuracy of the Emotion Recognition.
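The combination of tone and text described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which prosodic features are used or how the two analyses are combined. The sketch assumes two simple frame-level prosodic measures (RMS energy and zero-crossing rate) and a weighted average of per-emotion scores. The function names, the `tone_weight` parameter, and the emotion labels are all hypothetical.

```python
import math


def rms_energy(samples):
    """Root-mean-square energy of a frame, a common proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def fuse_scores(tone_scores, text_scores, tone_weight=0.5):
    """Weighted average of two per-emotion score dicts, renormalized to sum to 1.

    Assumption: each dict maps an emotion label to a non-negative score.
    A label missing from one dict is treated as scoring 0.0 there.
    """
    labels = set(tone_scores) | set(text_scores)
    fused = {
        label: tone_weight * tone_scores.get(label, 0.0)
        + (1.0 - tone_weight) * text_scores.get(label, 0.0)
        for label in labels
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    return {label: score / total for label, score in fused.items()}


def predict_emotion(tone_scores, text_scores, tone_weight=0.5):
    """Return the label with the highest fused score."""
    fused = fuse_scores(tone_scores, text_scores, tone_weight)
    return max(fused, key=fused.get)


# Hypothetical scores from a tone model and a text sentiment model.
tone = {"angry": 0.6, "neutral": 0.3, "happy": 0.1}
text = {"angry": 0.2, "neutral": 0.2, "happy": 0.6}
label_equal = predict_emotion(tone, text, tone_weight=0.5)
label_text_heavy = predict_emotion(tone, text, tone_weight=0.2)
```

With equal weights the tone scores dominate in this example, while shifting weight toward the text scores changes the predicted label. In practice the weight would have to be tuned on labeled data.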