Modification of Prosody for Emotion Conversion using Gaussian Regression Model

Emotion conversion is an active research area in emotional speech synthesis. This work converts a neutral speech sentence into a target emotional speech sentence using signal processing techniques. The parameters used for emotion conversion are the pitch contour, intensity, and duration of the sentence. A Kannada Emotional Speech (KES) database is created and used for analysis. The database covers four emotions (sadness, happy, anger, and fear) in addition to neutral speech. The pitch contours of different emotional sentences are analyzed, and a Gaussian Regression Model (GRM) is proposed for predicting the target pitch contour. The proposed method is evaluated with an objective test and a subjective test. The objective test uses the mean pitch, the standard deviation of pitch, the mean intensity, and the duration of the sentences. The subjective test computes the Emotion Recognition Rate (ERR) from a confusion matrix and collects a Mean Opinion Score (MOS) rating of the conversion system on a scale of 1 to 5. The subjective test results indicate that the effectiveness and discernibility of the emotion improve when GRM is used for pitch contour modification together with intensity and duration. The most recognized emotion was sadness, with an MOS of 3.52 and an ERR of 83%; the least recognized emotion was anger, with an MOS of 1.74 and an ERR of 66%. The results of the subjective and objective tests show that the converted sadness, happy and fear speech is very close to natural sadness, anger and fear emotion.
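The subjective measures named in this abstract (ERR from a confusion matrix, MOS on a 1-5 scale) can be illustrated with a minimal Python sketch. It assumes ERR is the share of listener responses that match the intended emotion, which the abstract does not spell out. The function names, the matrix layout (rows are intended emotions, columns are perceived emotions), and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean

# Emotion labels as listed in the abstract.
EMOTIONS = ["sadness", "happy", "anger", "fear"]


def emotion_recognition_rate(confusion, labels):
    """Return per-emotion ERR in percent.

    confusion[i][j] counts how often listeners perceived labels[j]
    when the intended (converted) emotion was labels[i].
    Assumption: ERR = correct responses / all responses for that row.
    """
    rates = {}
    for i, label in enumerate(labels):
        total = sum(confusion[i])
        rates[label] = 100.0 * confusion[i][i] / total if total else 0.0
    return rates


def mean_opinion_score(ratings):
    """Average listener ratings after checking they lie on the 1-5 scale."""
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return mean(ratings)


if __name__ == "__main__":
    # Hypothetical listener responses, invented for illustration only.
    example_confusion = [
        [8, 1, 0, 1],  # intended: sadness
        [1, 7, 1, 1],  # intended: happy
        [2, 1, 6, 1],  # intended: anger
        [1, 1, 1, 7],  # intended: fear
    ]
    print(emotion_recognition_rate(example_confusion, EMOTIONS))
    print(mean_opinion_score([4, 3, 4, 3, 4]))
```

Reading the matrix row by row keeps the recognition rate tied to the intended emotion, so the off-diagonal cells show which emotions listeners confused with it.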


Burmese Emotional Speech Synthesis Based on Speech Parameter Adaptation

Computer Science and Application ◽

10.12677/csa.2022.121005 ◽

2022 ◽

Vol 12 (01) ◽

pp. 33-45

Author(s):

奇云刘

Keyword(s):

Speech Synthesis ◽

Emotional Speech ◽

Parameter Adaptation


Emotional Speech Datasets for English Speech Synthesis Purpose: A Review

Advances in Intelligent Systems and Computing - Intelligent Systems and Applications ◽

10.1007/978-3-030-29516-5_6 ◽

2019 ◽

pp. 61-66

Author(s):

Noé Tits ◽

Kevin El Haddad ◽

Thierry Dutoit

Keyword(s):

Speech Synthesis ◽

Emotional Speech


Interpretation of User Evaluation for Emotional Speech Synthesis System

Human-Computer Interaction. New Trends - Lecture Notes in Computer Science ◽

10.1007/978-3-642-02574-7_33 ◽

2009 ◽

pp. 295-303 ◽

Cited By ~ 3

Author(s):

Ho-Joon Lee ◽

Jong C. Park

Keyword(s):

Speech Synthesis ◽

User Evaluation ◽

Emotional Speech ◽

Synthesis System


Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis

10.21437/interspeech.2004-332 ◽

2004 ◽

Author(s):

Keikichi Hirose

Keyword(s):

Speech Synthesis ◽

Process Model ◽

Generation Process ◽

Emotional Speech


Speaker-dependent model interpolation for statistical emotional speech synthesis

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/1687-4722-2012-21 ◽

2012 ◽

Vol 2012 (1) ◽

Cited By ~ 1

Author(s):

Chih-Yu Hsu ◽

Chia-Ping Chen

Keyword(s):

Speech Synthesis ◽

Emotional Speech ◽

Dependent Model


HMM-Based Emotional Speech Synthesis Using Average Emotion Model

Chinese Spoken Language Processing - Lecture Notes in Computer Science ◽

10.1007/11939993_27 ◽

2006 ◽

pp. 233-240 ◽

Cited By ~ 6

Author(s):

Long Qin ◽

Zhen-Hua Ling ◽

Yi-Jian Wu ◽

Bu-Fan Zhang ◽

Ren-Hua Wang

Keyword(s):

Speech Synthesis ◽

Emotional Speech ◽

Emotion Model


Spectrum Modification for Emotional Speech Synthesis

Multimodal Signals: Cognitive and Algorithmic Issues - Lecture Notes in Computer Science ◽

10.1007/978-3-642-00525-1_23 ◽

2009 ◽

pp. 232-241 ◽

Cited By ~ 8

Author(s):

Anna Přibilová ◽

Jiří Přibil

Keyword(s):

Speech Synthesis ◽

Emotional Speech


reports described him as ‘emotionally unstable’ and in a ‘grossly elevated neurotic state’. The judge refused to admit the evidence, and on appeal following conviction it was contended that he was wrong. The primary contention was that the appellant’s pre-existing mental condition made him vulnerable to threats. Held, dismissing the appeal, the duress relied upon was duress by threats, but in some cases a defendant might be able to rely on ‘duress by circumstances’ (see Conway [1989] QB 290; Martin [1989] 1 All ER 652), and although not argued in this way it was proposed to consider whether the medical evidence could have been introduced on the basis that Hegarty might have been able to set up such a defence. Duress by threats provided a defence to a charge of any offence other than murder (see Howe [1987] AC 417), attempted murder (see Gotts [1992] 2 AC 412) and some forms of treason. It was founded on public policy considerations (see AG v Whelan [1934] IR 518). The fact that the defendant’s mind had been ‘overborne’ by the threats did not mean that he lacked the requisite intent to commit the crime (see DPP for Northern Ireland v Lynch [1975] AC 653, 703B). It followed that the law might have developed on the lines that, when considering duress, a purely subjective test should be applied, and it might well develop in this way in the future (see Law Com 218, para 29.14, November 1993, Cmnd 2370 and draft Criminal Law Bill, cl 25(2)). As the law stood, however, the test was not purely subjective but required an objective test to be satisfied (Howe). The jury had to consider the response of a sober person of reasonable firmness ‘sharing the characteristics of the defendant’. They could take account of age, sex and physical health, but it was open to consideration whether the shared characteristics could include a personality disorder of the kind suffered by the appellant.
His counsel argued that the expert evidence was relevant to explain the reaction of a man like him to threats of violence to himself and his family, and admissible because the pathological aspects of his personality and the effect of his disorder on his behaviour were matters which lay outside the knowledge and experience of a judge and jury. Counsel referred to a passage in Emery (1993) 14 Cr App R (S) 394, 398 where Lord Taylor CJ said that: ‘... The question for the doctors was whether a woman of reasonable firmness with the characteristics of [the appellant], if abused in the manner which she said, would have had her will crushed so that she could not have protected her child.’ It was accepted that for the purposes of the subjective test medical evidence was admissible if the mental condition or abnormality was relevant and its effects lay outside the knowledge and experience of laymen. In the present case, the reports before the judge did not go that far, and the judge had to decide on the material before him. There were no grounds for disturbing his decision. As the evidence was not admissible to explain the reaction of the appellant himself, it was clearly not admissible on the objective test. The passage cited could not be read in isolation,

Sourcebook Criminal Law ◽

10.4324/9781843143093-136 ◽

1996 ◽

pp. 568-568

Keyword(s):

Northern Ireland ◽

Criminal Law ◽

Mental Condition ◽

Objective Test ◽

Medical Evidence ◽

Subjective Test ◽

Attempted Murder ◽

The Law ◽

Set Up ◽

A Charge


Speech Synthesis of Emotions Using Vowel Features

International Journal of Software Innovation ◽

10.4018/ijsi.2013010105 ◽

2013 ◽

Vol 1 (1) ◽

pp. 54-67

Author(s):

Kanu Boku ◽

Taro Asada ◽

Yasunari Yoshitomi ◽

Masayoshi Tabuse

Keyword(s):

Fundamental Frequency ◽

Speech Synthesis ◽

Male Subject ◽

Maximum Amplitude ◽

Synthetic Speech ◽

Emotional Speech ◽

Prosodic Features ◽

Initial Investigation ◽

Synthesis Research ◽

Case Based

Recently, methods for adding emotion to synthetic speech have received considerable attention in the field of speech synthesis research. For generating emotional synthetic speech, it is necessary to control the prosodic features of the utterances. The authors propose a case-based method for generating emotional synthetic speech by exploiting the characteristics of the maximum amplitude and the utterance time of vowels, and the fundamental frequency of emotional speech. As an initial investigation, they adopted the utterance of Japanese names, which are semantically neutral. By using the proposed method, emotional synthetic speech made from the emotional speech of one male subject was discriminable with a mean accuracy of 70% when ten subjects listened to the emotional synthetic utterances of “angry,” “happy,” “neutral,” “sad,” or “surprised” when the utterance was the Japanese name “Taro.”


Emotional Speech Synthesis Based on Style Embedded Tacotron2 Framework

2019 34th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC) ◽

10.1109/itc-cscc.2019.8793393 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ohsung Kwon ◽

Inseon Jang ◽

ChungHyun Ahn ◽

Hong-Goo Kang

Keyword(s):

Speech Synthesis ◽

Emotional Speech
