Exploiting Morphological and Phonological Features to Improve Prosodic Phrasing for Mongolian Speech Synthesis

This paper develops a Text-To-Speech (TTS) synthesis system for recitation of the Holy Quran, intended to help reciters and facilitate its use. The unit selection method is adopted and improved to reach good speech quality. The proposed approach consists of two main steps. In the first, an Expert System (ES) module is integrated that employs Arabic, Quranic language, phonetic, and phonological features. This step serves as a preselection that speeds up the synthesis algorithm. The second step is the final selection of units, which minimizes a concatenation cost function using a forward-backward dynamic programming search. The system is evaluated by native and non-native Arabic speakers. The results show that the goal of correct Quran recitation that respects its reading rules was reached, with 97% speech intelligibility and 72.13% naturalness.
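
The final selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate unit is reduced to a single number (a hypothetical boundary pitch in Hz), that the preselection has already produced the candidate lists, and that the concatenation cost is the absolute pitch jump between neighbouring units. The names `select_units` and `pitch_jump` are illustrative only.

```python
from typing import Callable, List, Sequence, Tuple


def select_units(
    candidates: Sequence[Sequence[float]],
    join_cost: Callable[[float, float], float],
) -> Tuple[float, List[int]]:
    """Pick one candidate per position so that the summed concatenation
    cost between neighbouring units is minimal (dynamic programming)."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[j]: cheapest cost of any path ending in candidate j of the
    # current position. No join has happened at the first position.
    best = [0.0] * len(candidates[0])
    backptrs: List[List[int]] = []

    # Forward pass: extend the cheapest paths one position at a time.
    for prev, cur in zip(candidates, candidates[1:]):
        new_best, ptr = [], []
        for unit in cur:
            costs = [best[i] + join_cost(p, unit) for i, p in enumerate(prev)]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min])
            ptr.append(i_min)
        best = new_best
        backptrs.append(ptr)

    # Backward pass: follow the stored pointers from the cheapest end state.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for ptr in reversed(backptrs):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return total, path


def pitch_jump(a: float, b: float) -> float:
    """Hypothetical concatenation cost: absolute pitch difference."""
    return abs(a - b)


# Toy candidate lists, standing in for the output of a preselection step.
candidates = [[100.0, 140.0], [120.0, 150.0, 90.0], [155.0, 98.0]]
total, path = select_units(candidates, pitch_jump)
print(total, path)
```

A real system would replace the single number per unit with spectral and prosodic features and would usually add a target cost, but the forward and backward passes keep the same shape.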

Phonological Features for 0-Shot Multilingual Speech Synthesis

10.21437/interspeech.2020-1821

2020

Author(s): Marlene Staib, Tian Huey Teh, Alexandra Torresquintero, Devang S. Ram Mohan, Lorenzo Foglianti, ...

Keyword(s): Speech Synthesis, Phonological Features

Speech synthesis from natural models by hand and by algorithm

PsycEXTRA Dataset

10.1037/e520562012-289

2009

Author(s): Robert E. Remez, Kathryn R. Dubowski, Morgana L. Davids, Emily F. Thomas, Nina Paddu, ...

Keyword(s): Speech Synthesis

Design of English text-to-speech conversion algorithm based on machine learning

Journal of Intelligent & Fuzzy Systems

10.3233/jifs-189238

2020

pp. 1-12

Author(s): Li Dongmei

Keyword(s): Machine Learning, Speech Synthesis, Feature Recognition, Learning Algorithm, Morphological Structure, English Text, Text To Speech, Part Of Speech, Modern Computer, Conversion Algorithm

English text-to-speech conversion is a key topic in modern computer technology research. Its difficulty is that feature recognition during conversion produces large errors, which makes English text-to-speech conversion algorithms hard to apply in a working system. To improve the efficiency of English text-to-speech conversion with machine learning, this article labels the original voice waveform with pitch marks, modifies the rhythm through PSOLA, and uses the C4.5 algorithm to train a decision tree that judges the pronunciation of polyphones. To evaluate the pronunciation discrimination method based on part-of-speech rules and the HMM-based prosody hierarchy prediction in speech synthesis systems, the study constructs a system model. In addition, the waveform stitching method and PSOLA are used to synthesize the sound. For words whose main stress cannot be determined from morphological structure, the labels can be learned by machine learning methods. Finally, the study evaluates and analyzes the performance of the algorithm through control experiments. The results show that the proposed algorithm performs well and has some practical value.
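
The abstract does not give the features or data used to train the C4.5 tree, so the following is only a sketch of the criterion C4.5 uses to choose a split, the gain ratio (information gain divided by split information). The toy rows, the attribute names `pos` and `position`, and the stress labels are all hypothetical and stand in for a word whose pronunciation depends on its part of speech.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting on `attr`: information gain / split information."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)

    # Expected entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder

    # Split information penalises attributes with many small branches.
    split_info = -sum(
        (len(g) / total) * math.log2(len(g) / total) for g in groups.values()
    )
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: which syllable carries the main stress.
rows = [
    {"pos": "noun", "position": "start"},
    {"pos": "noun", "position": "end"},
    {"pos": "verb", "position": "start"},
    {"pos": "verb", "position": "end"},
]
labels = ["first", "first", "second", "second"]

best_attr = max(["pos", "position"], key=lambda a: gain_ratio(rows, labels, a))
print(best_attr)
```

On this toy data the part-of-speech attribute separates the two pronunciations completely, so it would be chosen as the root split. A full C4.5 implementation would recurse on each branch and also handle continuous attributes and pruning.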

Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis

10.21437/interspeech.2017-1762

2017

Cited By ~ 1

Author(s): Beiming Cao, Myungjong Kim, Jan van Santen, Ted Mau, Jun Wang

Keyword(s): Deep Learning, Speech Synthesis, Text To Speech, Text To Speech Synthesis
