Average Modeling Approach to Voice Conversion with Non-Parallel Data

10.21437/odyssey.2018-32 ◽

2018 ◽

Author(s):

Xiaohai Tian ◽

Junchao Wang ◽

Haihua Xu ◽

Eng-Siong Chng ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Modeling Approach ◽

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-019-0160-1 ◽

2019 ◽

Vol 2019 (1) ◽

Author(s):

Yuki Takashima ◽

Toru Nakashika ◽

Tetsuya Takiguchi ◽

Yasuo Ariki

Keyword(s):

Dictionary Learning ◽

Computational Cost ◽

Tensor Decomposition ◽

Gaussian Mixture ◽

Voice Conversion ◽

Specific Information ◽

Learning Method ◽

Tucker Decomposition ◽

Parallel Data ◽

High Computational Cost

Abstract: Voice conversion (VC) is a technique for converting only the speaker-specific information in source speech while preserving the associated phonemic information. Non-negative matrix factorization (NMF)-based VC has been widely researched because it produces more natural-sounding speech than conventional Gaussian mixture model-based VC. In conventional NMF-VC, models are trained on parallel data, so the speech data require elaborate pre-processing to generate that parallel data. NMF-VC also tends to yield a large model, because the dictionary matrix holds several parallel exemplars, which leads to a high computational cost. In this study, an innovative parallel dictionary-learning method using non-negative Tucker decomposition (NTD) is proposed. The proposed method uses tensor decomposition to decompose an input observation into a set of mode matrices and one core tensor. The proposed NTD-based dictionary-learning method estimates the dictionary matrix for NMF-VC without using parallel data. The experimental results show that the proposed method outperforms other methods in both parallel and non-parallel settings.
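
To make the "mode matrices and one core tensor" phrasing concrete, the following is a minimal sketch of the general Tucker model for a 3-way tensor, not the paper's NTD algorithm. The function name `tucker_reconstruct`, the tensor sizes, and the reading of the three modes as frequency, time, and speaker are all assumptions made for illustration; the abstract does not specify them. The sketch only shows how a non-negative core and non-negative mode matrices combine into a non-negative observation, and it does not fit the factors to data.

```python
import numpy as np


def tucker_reconstruct(core, factors):
    """Rebuild a 3-way tensor from a core tensor and three mode matrices.

    Implements X[i, j, k] = sum_{p, q, r} core[p, q, r] * A[i, p] * B[j, q] * C[k, r].
    """
    A, B, C = factors
    return np.einsum("pqr,ip,jq,kr->ijk", core, A, B, C)


rng = np.random.default_rng(0)

# Hypothetical sizes: 6 frequency bins, 5 frames, 2 speakers,
# with a smaller core of shape (3, 4, 2).
core = rng.random((3, 4, 2))      # non-negative core tensor
factors = [
    rng.random((6, 3)),           # mode-1 matrix (assumed: frequency)
    rng.random((5, 4)),           # mode-2 matrix (assumed: time)
    rng.random((2, 2)),           # mode-3 matrix (assumed: speaker)
]

X = tucker_reconstruct(core, factors)
print(X.shape)
print(bool((X >= 0).all()))
```

Because every factor is non-negative, the reconstruction is non-negative as well, which is the property that lets such factors be used alongside NMF-style dictionaries. A library such as TensorLy could presumably be used to estimate the factors from data, but that step is outside this sketch.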

Sparse representation of phonetic features for voice conversion with and without parallel data

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) ◽

10.1109/asru.2017.8269002 ◽

2017 ◽

Author(s):

Berrak Sisman ◽

Haizhou Li ◽

Kay Chen Tan

Keyword(s):

Sparse Representation ◽

Voice Conversion ◽

Parallel Data ◽

Phonetic Features

Noise-robust voice conversion using a small parallel data based on non-negative matrix factorization

2015 23rd European Signal Processing Conference (EUSIPCO) ◽

10.1109/eusipco.2015.7362396 ◽

2015 ◽

Author(s):

Ryo Aihara ◽

Takao Fujii ◽

Toru Nakashika ◽

Tetsuya Takiguchi ◽

Yasuo Ariki

Keyword(s):

Matrix Factorization ◽

Voice Conversion ◽

Parallel Data ◽

Noise Robust ◽

Non Negative Matrix Factorization

Rhythm-Flexible Voice Conversion Without Parallel Data Using Cycle-GAN Over Phoneme Posteriorgram Sequences

2018 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2018.8639647 ◽

2018 ◽

Author(s):

Cheng-chieh Yeh ◽

Po-chun Hsu ◽

Ju-chieh Chou ◽

Hung-yi Lee ◽

Lin-shan Lee

Keyword(s):

Voice Conversion ◽

Adversarially Trained Autoencoders for Parallel-data-free Voice Conversion

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683204 ◽

2019 ◽

Author(s):

Orhan Ocal ◽

Oguz H. Elibol ◽

Gokce Keskin ◽

Cory Stephenson ◽

Anil Thomas ◽

...

Keyword(s):

Voice Conversion ◽

Parallel-Data-Free Many-to-Many Voice Conversion Based on DNN Integrated with Eigenspace Using a Non-Parallel Speech Corpus

10.21437/interspeech.2017-961 ◽

2017 ◽

Author(s):

Tetsuya Hashimoto ◽

Hidetsugu Uchida ◽

Daisuke Saito ◽

Nobuaki Minematsu

Keyword(s):

Voice Conversion ◽

Speech Corpus ◽

Voice conversion with parallel/non-parallel data and synthetic speech detection

10.32657/10220/47729 ◽

2019 ◽

Author(s):

Xiaohai Tian

Keyword(s):

Voice Conversion ◽

Synthetic Speech ◽

Speech Detection ◽

Text-independent F0 transformation with non-parallel data for voice conversion

10.21437/interspeech.2010-497 ◽

2010 ◽

Author(s):

Zhi-Zheng Wu ◽

Tomi Kinnunen ◽

Eng Siong Chng ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Many-to-many and Completely Parallel-data-free Voice Conversion Based on Eigenspace DNN

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2018.2878949 ◽

2018 ◽

pp. 1-1 ◽

Author(s):

Tetsuya Hashimoto ◽

Nobuaki Minematsu ◽

Daisuke Saito

Keyword(s):

Voice Conversion ◽

Phonetic posteriorgrams for many-to-one voice conversion without parallel data training

2016 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme.2016.7552917 ◽

2016 ◽

Author(s):

Lifa Sun ◽

Kun Li ◽

Hao Wang ◽

Shiyin Kang ◽

Helen Meng

Keyword(s):

Voice Conversion ◽
