Transfer Learning of the Expressivity Using FLOW Metric Learning in Multispeaker Text-to-Speech Synthesis

Author(s):  
Ajinkya Kulkarni ◽  
Vincent Colotte ◽  
Denis Jouvet
Author(s):  
Ishita Satija ◽  
Vina Lomte ◽  
Yash Wani ◽  
Digisha Kaneria ◽  
Shubham Yadav

We describe a neural network-based system for text-to-speech (TTS) synthesis that can generate speech audio in the voice of different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network; (2) a sequence-to-sequence synthesis network based on Tacotron 2; (3) an autoregressive WaveNet-based vocoder network. We demonstrate that the proposed model can transfer the knowledge of speaker variability learned by the discriminatively trained speaker encoder to the multispeaker TTS task, and can synthesize natural speech from speakers unseen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high-quality speaker representation.
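The final claim of this abstract, conditioning the synthesizer on a randomly sampled speaker embedding, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that speaker embeddings are L2-normalized vectors and that conditioning is done by appending the embedding to every text-encoder frame. The function names, the embedding dimension, and the placeholder frames are all hypothetical.

```python
import math
import random


def sample_speaker_embedding(dim, rng):
    """Draw a random unit-length vector (assumed speaker embedding space)."""
    # Normalizing an isotropic Gaussian sample gives a uniformly random direction.
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def condition_on_speaker(encoder_frames, speaker_embedding):
    """Append the same speaker embedding to every text-encoder frame."""
    return [frame + speaker_embedding for frame in encoder_frames]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
embedding = sample_speaker_embedding(dim=8, rng=rng)

# Placeholder text-encoder outputs: 3 frames of 4 features each (hypothetical).
text_frames = [[0.0] * 4 for _ in range(3)]
conditioned = condition_on_speaker(text_frames, embedding)
```

In a full system of the kind the abstract describes, the conditioned frames would feed the synthesis network, whose output the vocoder turns into a waveform. Those stages are omitted here because the abstract gives no details about them.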


Author(s):  
Beiming Cao ◽  
Myungjong Kim ◽  
Jan van Santen ◽  
Ted Mau ◽  
Jun Wang

2019 ◽  
Author(s):  
Elshadai Tesfaye Biru ◽  
Yishak Tofik Mohammed ◽  
David Tofu ◽  
Erica Cooper ◽  
Julia Hirschberg
