Fine-Tuning Pre-Trained Voice Conversion Model for Adding New Target Speakers with Limited Data

10.21437/interspeech.2021-244 ◽

2021 ◽

Author(s):

Takeshi Koshizuka ◽

Hidefumi Ohmura ◽

Kouichi Katsurada

Keyword(s):

Fine Tuning ◽

Voice Conversion ◽

Limited Data ◽

Conversion Model

Factorized WaveNet for voice conversion with limited data

Speech Communication ◽

10.1016/j.specom.2021.03.003 ◽

2021 ◽

Vol 130 ◽

pp. 45-54

Author(s):

Hongqiang Du ◽

Xiaohai Tian ◽

Lei Xie ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Cross-gender Voice Conversion with Constant F0-Ratio and Average Background Conversion Model

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683369 ◽

2019 ◽

Author(s):

Zbigniew Latka ◽

Jakub Galka ◽

Bartosz Ziolko

Keyword(s):

Voice Conversion ◽

Conversion Model

Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data

2020 28th European Signal Processing Conference (EUSIPCO) ◽

10.23919/eusipco47968.2020.9287448 ◽

2021 ◽

Author(s):

Roee Levy-Leshem ◽

Raja Giryes

Keyword(s):

Voice Conversion ◽

Limited Data ◽

Effective Wavenet Adaptation for Voice Conversion with Limited Data

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053315 ◽

2020 ◽

Author(s):

Hongqiang Du ◽

Xiaohai Tian ◽

Lei Xie ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Voice Conversion towards Arbitrary Speakers With Limited Data

Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence ◽

10.1145/3404555.3404627 ◽

2020 ◽

Author(s):

Ying Zhang ◽

Wenjun Zhang ◽

Dandan Song

Keyword(s):

Voice Conversion ◽

Voice Conversion Based Data Augmentation to Improve Children’s Speech Recognition in Limited Data Scenario

10.21437/interspeech.2020-1112 ◽

2020 ◽

Author(s):

S. Shahnawazuddin ◽

Nagaraj Adiga ◽

Kunal Kumar ◽

Aayushi Poddar ◽

Waquar Ahmad

Keyword(s):

Speech Recognition ◽

Data Augmentation ◽

Voice Conversion ◽

Limited Data ◽

Children’s Speech Recognition ◽

Children's Speech

Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams

10.21437/interspeech.2019-1316 ◽

2019 ◽

Author(s):

Songxiang Liu ◽

Yuewen Cao ◽

Xixin Wu ◽

Lifa Sun ◽

Xunying Liu ◽

...

Keyword(s):

Voice Conversion ◽

Conversion Model

Vowels and Prosody Contribution in Neural Network Based Voice Conversion Algorithm with Noisy Training Data

European Journal of Engineering Research and Science ◽

10.24018/ejers.2020.5.3.1802 ◽

2020 ◽

Vol 5 (3) ◽

pp. 229-233

Author(s):

Olaide Ayodeji Agbolade

Keyword(s):

Neural Network ◽

Significant Contribution ◽

Linear Prediction ◽

Feedforward Neural Network ◽

Training Data ◽

Voice Conversion ◽

Conversion Model ◽

The Voice ◽

Average Noise Level

This research presents a neural network based voice conversion model. Voiced sounds and prosody are known to be the most important components of the voice conversion framework, but their objective contributions, particularly in a noisy and uncontrolled environment, are not. The model uses a three-layer feedforward neural network to map the linear prediction analysis coefficients of a source speaker to the acoustic vector space of the target speaker, with a view to objectively determining the contributions of the voiced, unvoiced, and supra-segmental components of sounds to the voice conversion model. Results showed that the vowels “a”, “i”, and “o” contribute most significantly to conversion success. The voiceless sounds were also found to be the most affected by the noisy training data. An average noise level of 40 dB above the noise floor was found to degrade the voice conversion success by 55.14 percent relative to the voiced sounds. The results also show that, for cross-gender voice conversion, prosody conversion is more significant in scenarios where a female is the target speaker.
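The mapping described in this abstract can be illustrated with a minimal sketch. Only the overall shape (a feedforward network with an input, a hidden, and an output layer that maps source linear prediction coefficient vectors to target vectors) comes from the abstract. The LPC order of 12, the hidden width of 16, the tanh activation, the plain gradient-descent training, and the synthetic "speaker" data are all assumptions made for illustration, not the author's configuration.

```python
import numpy as np

ORDER, HIDDEN, FRAMES = 12, 16, 200  # assumed sizes, not taken from the paper


def init_mlp(order, hidden, rng):
    """Input -> hidden -> output: one reading of a 'three-layer' network."""
    return {
        "W1": rng.normal(0.0, 0.1, (order, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, order)),
        "b2": np.zeros(order),
    }


def forward(params, x):
    """Return the hidden activations and the predicted target-speaker vectors."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h, h @ params["W2"] + params["b2"]


def train(params, src, tgt, lr=0.2, epochs=300):
    """Full-batch gradient descent on the mean squared error; returns the loss history."""
    losses = []
    for _ in range(epochs):
        h, pred = forward(params, src)
        err = pred - tgt
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the MSE gradient through the output and hidden layers.
        g_out = 2.0 * err / err.size
        g_h = (g_out @ params["W2"].T) * (1.0 - h ** 2)
        params["W2"] -= lr * (h.T @ g_out)
        params["b2"] -= lr * g_out.sum(axis=0)
        params["W1"] -= lr * (src.T @ g_h)
        params["b1"] -= lr * g_h.sum(axis=0)
    return losses


rng = np.random.default_rng(0)
# Synthetic stand-ins for time-aligned source and target LPC frames (hypothetical data).
src = rng.normal(size=(FRAMES, ORDER))
mix = rng.normal(size=(ORDER, ORDER)) / np.sqrt(ORDER)
tgt = 0.5 * np.tanh(src @ mix)

params = init_mlp(ORDER, HIDDEN, rng)
losses = train(params, src, tgt)
_, converted = forward(params, src)
print(f"MSE before training: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

In a real system the source and target frames would come from LPC analysis of time-aligned recordings; the random arrays here only make the sketch self-contained.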

Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training

10.21437/interspeech.2021-781 ◽

2021 ◽

Author(s):

Kun Zhou ◽

Berrak Sisman ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Limited Data ◽

Text To Speech

Non-Parallel Voice Conversion with Autoregressive Conversion Model and Duration Adjustment

10.21437/vcc_bc.2020-17 ◽

2020 ◽

Author(s):

Li-Juan Liu ◽

Yan-Nian Chen ◽

Jing-Xuan Zhang ◽

Yuan Jiang ◽

Ya-Jun Hu ◽

...

Keyword(s):

Voice Conversion ◽

Conversion Model
