Prosody Generation: Latest Research Papers

This paper describes a speech synthesis system for Isarn, a regional dialect spoken in the Northeast of Thailand. The study focuses on improving the system's prosody generation by using additional context features. To develop the system, the speech parameters (Mel-cepstrum and fundamental frequency of each phoneme within different phonetic contexts) were modelled using Hidden Markov Models (HMMs). Synthetic speech was generated by converting the input text into context-dependent phonemes. Speech parameters were then generated from the trained HMMs according to the context-dependent phonemes and synthesized through a speech vocoder. Systems were trained using three different feature sets: basic contextual features, tonal features, and syllable-context features. Objective and subjective tests were conducted to evaluate the performance of the proposed system. The results indicated that adding the syllable-context features significantly improved the naturalness of the synthesized speech.
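The text-to-label step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Phone` fields, the label syntax (`left-current+right/T:tone/S:position_length`), the `sil` padding symbol, and the assumption that the three feature sets are cumulative are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    symbol: str   # phoneme symbol
    tone: int     # tone index of the enclosing syllable (hypothetical coding)
    syl_pos: int  # 1-based position of the phoneme inside its syllable
    syl_len: int  # number of phonemes in the enclosing syllable


def context_labels(phones, feature_set="syllable"):
    """Build one context-dependent label per phoneme.

    feature_set (assumed cumulative, which the abstract does not state):
      "basic"    -> left/current/right phoneme only
      "tonal"    -> basic + tone of the syllable
      "syllable" -> tonal + position of the phoneme in its syllable
    """
    if feature_set not in ("basic", "tonal", "syllable"):
        raise ValueError(f"unknown feature set: {feature_set}")

    labels = []
    for i, p in enumerate(phones):
        # Pad the utterance edges with a silence symbol.
        left = phones[i - 1].symbol if i > 0 else "sil"
        right = phones[i + 1].symbol if i + 1 < len(phones) else "sil"
        label = f"{left}-{p.symbol}+{right}"
        if feature_set in ("tonal", "syllable"):
            label += f"/T:{p.tone}"
        if feature_set == "syllable":
            label += f"/S:{p.syl_pos}_{p.syl_len}"
        labels.append(label)
    return labels


# A toy two-phoneme syllable with tone index 1.
toy_syllable = [Phone("k", 1, 1, 2), Phone("a", 1, 2, 2)]
basic_labels = context_labels(toy_syllable, "basic")
syllable_labels = context_labels(toy_syllable, "syllable")
```

In an HMM-based system, each such label would select the context-dependent model from which spectral and fundamental-frequency parameters are generated. Richer labels let the model clustering distinguish more prosodic contexts.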

Towards expressive prosody generation in TTS for reading aloud applications

DOI: 10.21437/iberspeech.2018-9
Year: 2018
Author(s): Monica Dominguez, Alicia Burga, Mireia Farrús, Leo Wanner
Keyword(s): Reading Aloud, Prosody Generation

Punctuation Generation Inspired Linguistic Features for Mandarin Prosody Generation

DOI: 10.20944/preprints201802.0108.v1
Year: 2018
Author(s): Chen-Yu Chiang, Yu-Ping Hung, Han-Yun Yeh, I-Bin Liao, Chen-Ming Pan
Keyword(s): Conditional Random Field, Word Boundary, Acoustic Features, Linguistic Features, Text Input, Highly Correlated, The One, Prosody Generation, Better Than, Word String

This paper proposes two fully automatic, machine-extracted linguistic features from unrestricted text input for Mandarin prosody generation. The first is the punctuation confidence (PC), which measures the likelihood of inserting a major punctuation mark (PM) at a word boundary. The second is the quotation confidence (QC), which measures the likelihood that a word string is quoted as a meaningful or emphasized unit in text. Because a major PM in a text is highly correlated with a prosodic break, and a quoted word string plays an important role in human language understanding, the two features could provide useful information for prosody generation. The idea is realized by first employing conditional random field (CRF)-based models to predict major PMs, quoted word string locations, and their associated confidences (the PC and the QC) for each word boundary. The predicted punctuation marks and their confidences are then combined with traditional contextual linguistic features to predict prosodic-acoustic features. Both objective and subjective tests showed that prosody generation with the proposed linguistic features performed better than prosody generation without them. The proposed PC and QC are therefore promising features for Mandarin prosody generation.
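One way to read a per-boundary confidence off a linear-chain model is as the posterior marginal probability of the "major PM" label at each word boundary, computed with the forward-backward algorithm. The sketch below rests on that assumption. The abstract does not give the exact definition of the PC or the CRF feature set, so the label coding (0 = no PM, 1 = major PM) and the toy scores are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def boundary_marginals(emit, trans):
    """Posterior label marginals of a linear-chain model.

    emit[t][y]       : unnormalised log-score of label y at boundary t
    trans[y_prev][y] : log-score of moving from label y_prev to label y
    Returns marginals[t][y] = P(label y at boundary t | whole sequence).
    """
    T, K = len(emit), len(trans)
    alpha = [[0.0] * K for _ in range(T)]  # forward log-scores
    beta = [[0.0] * K for _ in range(T)]   # backward log-scores

    alpha[0] = list(emit[0])
    for t in range(1, T):
        for y in range(K):
            alpha[t][y] = emit[t][y] + logsumexp(
                [alpha[t - 1][yp] + trans[yp][y] for yp in range(K)]
            )

    for t in range(T - 2, -1, -1):
        for y in range(K):
            beta[t][y] = logsumexp(
                [trans[y][yn] + emit[t + 1][yn] + beta[t + 1][yn] for yn in range(K)]
            )

    log_z = logsumexp(alpha[T - 1])  # log partition function
    return [
        [math.exp(alpha[t][y] + beta[t][y] - log_z) for y in range(K)]
        for t in range(T)
    ]


def punctuation_confidence(emit, trans, pm_label=1):
    """Assumed PC: marginal probability of the major-PM label per boundary."""
    return [row[pm_label] for row in boundary_marginals(emit, trans)]


# Toy example: two word boundaries, labels 0 = no PM, 1 = major PM.
toy_emit = [[0.0, 0.0], [0.0, math.log(3)]]
toy_trans = [[0.0, 0.0], [0.0, 0.0]]
pc = punctuation_confidence(toy_emit, toy_trans)
```

With all transition scores set to zero, the marginals reduce to a per-boundary softmax of the emission scores, which makes the toy case easy to check by hand. A soft confidence of this kind can be passed to a downstream prosody predictor as a continuous feature rather than a hard punctuation decision.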

Phonetically conditioned prosody generation for TTS: An unsupervised phonetic-to-prosodic mapping framework

Venue: 2016 IEEE Annual India Conference (INDICON)
DOI: 10.1109/indicon.2016.7838863
Year: 2016
Author(s): D.N. Krishna, M.G. Khanum Noor Fathima, Mythri Thippareddy, A. Sricharan, V. Ramasubramanian
Keyword(s): Prosody Generation, Prosodic Mapping

prosody generation
Recently Published Documents

Dynamic Prosody Generation for Speech Synthesis Using Linguistics-Driven Acoustic Embedding Selection

Prosody Generation Using Back Propagation Neural Networks for Sindhi Speech Processing Applications

Subword tokenization based on DNN-based acoustic model for end-to-end prosody generation

Text-driven Visual Prosody Generation for Embodied Conversational Agents

Punctuation-generation-inspired linguistic features for Mandarin prosody generation

Investigation of Pitch and Duration Range in Speech of Sindhi Adults for Prosody Generation Module

Isarn Dialect Speech Synthesis using HMM with syllable-context features

Towards expressive prosody generation in TTS for reading aloud applications

Punctuation Generation Inspired Linguistic Features for Mandarin Prosody Generation

Phonetically conditioned prosody generation for TTS: An unsupervised phonetic-to-prosodic mapping framework
