Voice conversion based on Gaussian processes by using kernels modeling the spectral density with Gaussian mixture models

Voice conversion (VC) aims to transform the individuality of source speech so that it mimics that of target speech while leaving the message unaltered. In our previous work, the Gaussian process (GP) was introduced into the VC literature for the first time to overcome the “over-fitting” problem inherent in state-of-the-art VC methods, and it gave very promising results. However, a standard GP usually acts more as a smoothing device than as a universal approximator. In this paper, we further attempt to improve the flexibility of GP-based VC by resorting to expressive kernels that are derived by modeling the spectral density with a Gaussian mixture model (GMM). Our new method benefits from the expressiveness of the new kernel, while GP inference remains simple and analytic as usual. Experiments demonstrate, both objectively and subjectively, that the individuality of the converted speech is much closer to that of the target, while the speech quality obtained is comparable to that of the standard GP-based method.
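To make the idea of a kernel whose spectral density is a Gaussian mixture more concrete, here is a minimal sketch. It assumes the commonly used one-dimensional spectral mixture form, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q); the abstract does not state the exact kernel used in the paper, so the formula, the function names, and the parameter values below are illustrative assumptions only.

```python
import math


def spectral_mixture_kernel(tau, weights, means, variances):
    """Stationary kernel whose spectral density is a mixture of Gaussians.

    tau       : difference between two scalar inputs (x - x')
    weights   : mixture weights w_q (assumed non-negative)
    means     : spectral means mu_q (frequencies)
    variances : spectral variances v_q (inverse squared length-scales)
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
        * math.cos(2.0 * math.pi * tau * mu)
        for w, mu, v in zip(weights, means, variances)
    )


def gram_matrix(xs, weights, means, variances):
    """Covariance matrix K[i][j] = k(xs[i] - xs[j]) for scalar inputs."""
    return [
        [spectral_mixture_kernel(a - b, weights, means, variances) for b in xs]
        for a in xs
    ]


# Hypothetical hyperparameters for a two-component spectral mixture.
weights = [1.0, 0.5]
means = [0.0, 2.0]
variances = [0.1, 0.05]
K = gram_matrix([0.0, 0.25, 0.5], weights, means, variances)
```

Because the kernel is still an ordinary covariance function, it can in principle be plugged into the usual closed-form GP predictive equations, which matches the abstract's remark that inference remains analytic.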


Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e93.d.2472 ◽

2010 ◽

Vol E93-D (9) ◽

pp. 2472-2482 ◽

Cited By ~ 8

Author(s):

Hironori DOI ◽

Keigo NAKAMURA ◽

Tomoki TODA ◽

Hiroshi SARUWATARI ◽

Kiyohiro SHIKANO

Keyword(s):

Mixture Models ◽

Speech Enhancement ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Voice Conversion ◽

Esophageal Speech


State of the art discriminative training of subspace constrained Gaussian mixture models in big training corpora

2013 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2013.6639008 ◽

2013 ◽

Author(s):

Jing Huang ◽

Peder A. Olsen ◽

Vaibhava Goel

Keyword(s):

Mixture Models ◽

State Of The Art ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Discriminative Training


Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations

Computer Speech & Language ◽

10.1016/j.csl.2014.03.001 ◽

2015 ◽

Vol 30 (1) ◽

pp. 3-15 ◽

Cited By ~ 5

Author(s):

Daniel Erro ◽

Agustin Alonso ◽

Luis Serrano ◽

Eva Navas ◽

Inma Hernaez

Keyword(s):

Mixture Models ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Voice Conversion


Storytelling Voice Conversion: Evaluation Experiment Using Gaussian Mixture Models

Journal of Electrical Engineering ◽

10.2478/jee-2015-0032 ◽

2015 ◽

Vol 66 (4) ◽

pp. 194-202

Author(s):

Jiří Přibil ◽

Anna Přibilová ◽

Daniela Ďuračková

Keyword(s):

Mixture Models ◽

Statistical Evaluation ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Voice Conversion ◽

Feedback Information ◽

Input Feature ◽

Listening Tests ◽

Optimal Setting ◽

The Voice

In the development of voice conversion and personification of text-to-speech (TTS) systems, it is essential to have feedback information about the users’ opinion of the resulting synthetic speech quality. Therefore, the main aim of the experiments described in this paper was to find out whether a classifier based on Gaussian mixture models (GMM) could be applied to the evaluation of different storytelling voices created by transformation of sentences generated by the Czech and Slovak TTS system. We suppose that this GMM-based statistical evaluation can be combined with the classical evaluation in the form of listening tests, or can replace it. The results obtained in this way correlated well with the results of the conventional listening test, so they confirm the practical usability of the developed GMM classifier. With the help of the performed analysis, the optimal setting of the initial parameters and the structure of the input feature set for recognition of the storytelling voices were finally determined.


Voice conversion using Gaussian Mixture Models

2015 International Conference on Communication, Information & Computing Technology (ICCICT) ◽

10.1109/iccict.2015.7045743 ◽

2015 ◽

Author(s):

Kevin D'souza ◽

K.T.V Talele

Keyword(s):

Mixture Models ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Voice Conversion


Efficient Greedy Learning of Gaussian Mixture Models

Neural Computation ◽

10.1162/089976603762553004 ◽

2003 ◽

Vol 15 (2) ◽

pp. 469-485 ◽

Cited By ~ 205

Author(s):

J. J. Verbeek ◽

N. Vlassis ◽

B. Kröse

Keyword(s):

Mixture Models ◽

Density Estimation ◽

Expectation Maximization ◽

State Of The Art ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Optimal Number ◽

Gaussian Mixtures ◽

Data Points ◽

Time Linear

This article concerns the greedy learning of Gaussian mixtures. In the greedy approach, mixture components are inserted into the mixture one after the other. We propose a heuristic for searching for the optimal component to insert. In a randomized manner, a set of candidate new components is generated. For each of these candidates, we find the locally optimal new component and insert it into the existing mixture. The resulting algorithm resolves the sensitivity to initialization of state-of-the-art methods, like expectation maximization, and has running time linear in the number of data points and quadratic in the (final) number of mixture components. Due to its greedy nature, the algorithm can be particularly useful when the optimal number of mixture components is unknown. Experimental results comparing the proposed algorithm to other methods on density estimation and texture segmentation are provided.
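The greedy insertion loop described in this abstract can be sketched roughly as follows for one-dimensional data. This is a simplified illustration under stated assumptions, not the authors' algorithm: candidates are seeded at randomly chosen data points, each candidate is refined by a partial EM that updates only the new component while the existing mixture stays fixed, and no joint refit of all components is run after an insertion. All function names and default values are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, components):
    # components is a list of (weight, mean, variance) tuples.
    return sum(w * normal_pdf(x, m, v) for w, m, v in components)


def log_likelihood(data, components):
    return sum(math.log(mixture_pdf(x, components) + 1e-300) for x in data)


def refine_candidate(data, old_density, mean, var, alpha=0.5, steps=20, min_var=1e-3):
    """Partial EM: update only the new component (alpha, mean, var)."""
    for _ in range(steps):
        resp = []
        for x, f_old in zip(data, old_density):
            new = alpha * normal_pdf(x, mean, var)
            resp.append(new / (new + (1.0 - alpha) * f_old + 1e-300))
        total = sum(resp)
        if total < 1e-8:
            break  # the candidate explains no data; stop refining it
        alpha = min(max(total / len(data), 1e-6), 1.0 - 1e-6)
        mean = sum(r * x for r, x in zip(resp, data)) / total
        var = max(sum(r * (x - mean) ** 2 for r, x in zip(resp, data)) / total, min_var)
    return alpha, mean, var


def greedy_gmm(data, max_components, n_candidates=10, seed=0):
    rng = random.Random(seed)
    n = len(data)
    mean = sum(data) / n
    var = max(sum((x - mean) ** 2 for x in data) / n, 1e-3)
    components = [(1.0, mean, var)]  # start from a single Gaussian
    while len(components) < max_components:
        old_density = [mixture_pdf(x, components) for x in data]
        best, best_ll = None, log_likelihood(data, components)
        for _ in range(n_candidates):
            start = rng.choice(data)  # randomized candidate location
            alpha, m, v = refine_candidate(data, old_density, start, var / 4.0)
            trial = [(w * (1.0 - alpha), cm, cv) for w, cm, cv in components]
            trial.append((alpha, m, v))
            ll = log_likelihood(data, trial)
            if ll > best_ll:
                best, best_ll = trial, ll
        if best is None:
            break  # no candidate improved the likelihood
        components = best
    return components


# Synthetic two-cluster data, used only to exercise the sketch.
_rng = random.Random(1)
data = [_rng.gauss(-5.0, 1.0) for _ in range(100)] + [_rng.gauss(5.0, 1.0) for _ in range(100)]
components = greedy_gmm(data, max_components=2)
```

Since the existing mixture is frozen while a candidate is refined, its density at each data point is computed once per insertion, which is one way to read the abstract's claim of running time linear in the number of data points.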
