Text-To-Speech Synthesis Using Transfer Learning

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-956 ◽

2021 ◽

pp. 139-144

Author(s):

Ishita Satija ◽

Vina Lomte ◽

Yash Wani ◽

Digisha Kaneria ◽

Shubham Yadav

Keyword(s):

Transfer Learning ◽

Speech Synthesis ◽

Text To Speech ◽

Neural Network ◽

Proposed Model ◽

Backward Wave ◽

Text To Speech Synthesis

We describe a neural network-based system for text-to-speech (TTS) synthesis that can generate speech audio in the voice of different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network; (2) a sequence-to-sequence synthesis network based on Tacotron 2; (3) an auto-regressive WaveNet-based vocoder network. We demonstrate that the proposed model can transfer the knowledge of speaker variability learned by the discriminatively trained speaker encoder to the multispeaker TTS task, and can synthesize natural speech from speakers unseen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high-quality speaker representation.
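The data flow through the three components can be illustrated with a minimal sketch. Everything in it is a toy placeholder rather than the authors' implementation: the function names, the sizes `EMBED_DIM`, `N_MELS`, and `HOP`, the unit-length embedding, and the spectrogram-like intermediate representation are all assumptions made for illustration. Each stub only mimics the input and output shape of the network it stands in for.

```python
import math
import random

# Hypothetical sizes, chosen only to keep the example small.
EMBED_DIM = 8   # length of the speaker embedding
N_MELS = 4      # channels per spectrogram-like frame
HOP = 3         # waveform samples produced per frame


def normalize(vec):
    """Scale a vector to unit length (an all-zero vector is left unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def speaker_encoder(reference_audio):
    """Toy stand-in for component (1): audio samples -> fixed-size embedding."""
    chunk = max(1, len(reference_audio) // EMBED_DIM)
    feats = []
    for i in range(EMBED_DIM):
        part = reference_audio[i * chunk:(i + 1) * chunk]
        feats.append(sum(x * x for x in part) / len(part) if part else 0.0)
    return normalize(feats)


def synthesizer(text, speaker_embedding):
    """Toy stand-in for component (2): text + embedding -> list of frames."""
    bias = sum(speaker_embedding) / len(speaker_embedding)
    return [[(ord(ch) % 32) / 32.0 + bias] * N_MELS for ch in text]


def vocoder(frames):
    """Toy stand-in for component (3): frames -> waveform samples."""
    wave = []
    for frame in frames:
        level = sum(frame) / len(frame)
        wave.extend(level * math.sin(2 * math.pi * k / HOP) for k in range(HOP))
    return wave


def synthesize(text, speaker_embedding):
    """Chain the synthesizer and the vocoder for a given speaker embedding."""
    return vocoder(synthesizer(text, speaker_embedding))


def random_speaker_embedding(rng):
    """Sample a random unit-length embedding to stand for a novel voice."""
    return normalize([rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)])


# Embedding taken from reference audio of a speaker.
reference = [math.sin(0.1 * n) for n in range(160)]
embedding = speaker_encoder(reference)
audio = synthesize("hello", embedding)

# Embedding sampled at random instead of computed from audio.
novel = random_speaker_embedding(random.Random(0))
novel_audio = synthesize("hello", novel)
```

The point of the sketch is the interface: because the synthesizer depends on the speaker only through the embedding, an embedding computed from reference audio and one sampled at random can be passed through the same `synthesize` call.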

Generating the Voice of the Interactive Virtual Assistant

10.5772/intechopen.95510 ◽

2021 ◽

Author(s):

Adriana Stan ◽

Beáta Lőrincz

Keyword(s):

Speech Synthesis ◽

Text Processing ◽

Research Field ◽

Text To Speech ◽

Acoustic Modelling ◽

Research Problems ◽

Text To Speech Synthesis ◽

Main Components

This chapter presents an overview of the current approaches for generating spoken content using text-to-speech synthesis (TTS) systems, and thus the voice of an Interactive Virtual Assistant (IVA). The overview builds upon the issues that make spoken content generation a non-trivial task, and introduces the two main components of a TTS system: text processing and acoustic modelling. It then gives the reader the minimum scientific detail on the terminology and methods involved in speech synthesis, yet enough knowledge to make the initial decisions regarding the choice of technology for the vocal identity of the IVA. The description of speech synthesis methodologies begins with basic, easy-to-run, low-requirement rule-based synthesis and ends with the state-of-the-art deep learning landscape. To bring this extremely complex and extensive research field closer to commercial deployment, an extensive index of the readily and freely available resources and tools required to build a TTS system is provided. Quality evaluation methods and open research problems are also highlighted at the end of the chapter.

Transfer Learning of the Expressivity Using FLOW Metric Learning in Multispeaker Text-to-Speech Synthesis

10.21437/interspeech.2020-1297 ◽

2020 ◽

Author(s):

Ajinkya Kulkarni ◽

Vincent Colotte ◽

Denis Jouvet

Keyword(s):

Transfer Learning ◽

Speech Synthesis ◽

Metric Learning ◽

Text To Speech ◽

Text To Speech Synthesis

Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis

10.21437/interspeech.2017-1762 ◽

2017 ◽

Author(s):

Beiming Cao ◽

Myungjong Kim ◽

Jan van Santen ◽

Ted Mau ◽

Jun Wang

Keyword(s):

Deep Learning ◽

Speech Synthesis ◽

Text To Speech ◽

Text To Speech Synthesis

Subset Selection, Adaptation, Gemination and Prosody Prediction for Amharic Text-to-Speech Synthesis

10.21437/ssw.2019-37 ◽

2019 ◽

Author(s):

Elshadai Tesfaye Biru ◽

Yishak Tofik Mohammed ◽

David Tofu ◽

Erica Cooper ◽

Julia Hirschberg

Keyword(s):

Speech Synthesis ◽

Subset Selection ◽

Text To Speech ◽

Text To Speech Synthesis ◽

Prosody Prediction

“I Can’t Talk Now”: Speaking with Voice Output Communication Aid Using Text-to-Speech Synthesis During Multiparty Video Conference

Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems ◽

10.1145/3411763.3451745 ◽

2021 ◽

Author(s):

Wooseok Kim ◽

Sangsu Lee

Keyword(s):

Speech Synthesis ◽

Video Conference ◽

Text To Speech ◽

Voice Output Communication Aid ◽

Communication Aid ◽

Text To Speech Synthesis

Comparative Study on Neural Vocoders for Multispeaker Text-To-Speech Synthesis

2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS) ◽

10.1109/raics51191.2020.9332514 ◽

2020 ◽

Author(s):

Rajeev Rajan ◽

Ashish Roopan ◽

Sachin Prakash ◽

Elisa Jose ◽

Sati P.

Keyword(s):

Comparative Study ◽

Speech Synthesis ◽

Text To Speech ◽

Text To Speech Synthesis

Comparison of Urdu text to speech synthesis using unit selection and HMM based techniques

2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA) ◽

10.1109/icsda.2016.7918988 ◽

2016 ◽

Author(s):

Farah Adeeba ◽

Tania Habib ◽

Sarmad Hussain ◽

Ehsan-ul-haq ◽

Kh. Shahzada Shahid

Keyword(s):

Speech Synthesis ◽

Text To Speech ◽

Unit Selection ◽

Text To Speech Synthesis

Comparative study of text-to-speech synthesis techniques for mobile linguistic translation process

2014 IEEE International Conference on Control System, Computing and Engineering (ICCSCE 2014) ◽

10.1109/iccsce.2014.7072761 ◽

2014 ◽

Author(s):

Phanchita Chomwihoke ◽

Manop Phankokkruad

Keyword(s):

Comparative Study ◽

Speech Synthesis ◽

Text To Speech ◽

Translation Process ◽

Synthesis Techniques ◽

Text To Speech Synthesis

The future role of text to speech synthesis in automated services

10.1049/ic:19970799 ◽

1997 ◽

Author(s):

A.P. Breen

Keyword(s):

Speech Synthesis ◽

Text To Speech ◽

Future Role ◽

Text To Speech Synthesis

An advanced NLP framework for high-quality Text-to-Speech synthesis

2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD) ◽

10.1109/sped.2011.5940733 ◽

2011 ◽

Author(s):

Catalin Ungurean ◽

Dragos Burileanu

Keyword(s):

Speech Synthesis ◽

Text To Speech ◽

High Quality ◽

Text To Speech Synthesis
