Storytelling Voice Conversion: Evaluation Experiment Using Gaussian Mixture Models

2015 ◽  
Vol 66 (4) ◽  
pp. 194-202
Author(s):  
Jiří Přibil ◽  
Anna Přibilová ◽  
Daniela Ďuračková

Abstract In the development of voice conversion and personification of text-to-speech (TTS) systems, feedback on users’ opinions of the resulting synthetic speech quality is essential. The main aim of the experiments described in this paper was therefore to find out whether a classifier based on Gaussian mixture models (GMM) could be applied to the evaluation of different storytelling voices created by transforming sentences generated by the Czech and Slovak TTS system. We suppose that this GMM-based statistical evaluation can be combined with the classical evaluation by listening tests, or can replace it. The results obtained in this way correlated well with the results of the conventional listening test, which confirms the practical usability of the developed GMM classifier. The analysis performed also determined the optimal setting of the initial parameters and the structure of the input feature set for recognition of the storytelling voices.
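As a rough illustration of the kind of GMM-based evaluation the abstract describes, the sketch below scores a set of feature frames against one GMM per voice class and picks the class with the highest average log-likelihood. This is a minimal sketch under stated assumptions, not the authors' classifier: the class names `voice_a` and `voice_b`, the two-dimensional features, the diagonal covariances, and all parameter values are hypothetical, since the abstract does not specify them.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # Log-density of one diagonal Gaussian component.
            log_pdf = 0.0
            for xd, md, vd in zip(x, mu, var):
                log_pdf += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
            comp_logs.append(math.log(w) + log_pdf)
        # Log-sum-exp over components for numerical stability.
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def classify(frames, models):
    """Return the best-scoring class label and the score of every class."""
    scores = {name: gmm_log_likelihood(frames, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical models: (weights, means, variances) for each voice class.
MODELS = {
    "voice_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "voice_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

label, scores = classify([[0.2, -0.1], [0.9, 1.1]], MODELS)
```

In practice the per-class parameters would be trained on labelled speech features (for example with the EM algorithm) rather than written by hand.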

2010 ◽  
Vol E93-D (9) ◽  
pp. 2472-2482 ◽  
Author(s):  
Hironori DOI ◽  
Keigo NAKAMURA ◽  
Tomoki TODA ◽  
Hiroshi SARUWATARI ◽  
Kiyohiro SHIKANO

2015 ◽  
Vol 30 (1) ◽  
pp. 3-15 ◽  
Author(s):  
Daniel Erro ◽  
Agustin Alonso ◽  
Luis Serrano ◽  
Eva Navas ◽  
Inma Hernaez

2018 ◽  
Vol 32 (34n36) ◽  
pp. 1840096
Author(s):  
Jingyi Bao ◽  
Ning Xu

Voice conversion (VC) is a technique that aims to transform the individuality of a source speech so as to mimic that of a target speech while keeping the message unaltered. In our previous work, the Gaussian process (GP) was introduced into the VC literature for the first time, in order to overcome the “over-fitting” problem inherent in state-of-the-art VC methods, and it gave very promising results. However, a standard GP usually acts more as a smoothing device than as a universal approximator. In this paper, we further attempt to improve the flexibility of GP-based VC by resorting to expressive kernels that are derived by modeling the spectral density with a Gaussian mixture model (GMM). Our new method benefits from the expressiveness of the new kernel while the inference of the GP remains simple and analytic as usual. Experiments demonstrate both objectively and subjectively that the individualities of the converted speech are much closer to those of the target, while the speech quality obtained is comparable to that of the standard GP-based method.
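The abstract does not give the kernel's formula. Assuming it has the commonly used spectral mixture form, in which a Gaussian mixture over frequencies yields a stationary kernel in closed form, a one-dimensional version might be sketched as follows. The function name, parameter names, and example values are illustrative assumptions and are not taken from the paper.

```python
import math


def spectral_mixture_kernel(tau, weights, freq_means, freq_variances):
    """Stationary 1-D kernel whose spectral density is a Gaussian mixture.

    tau is the input difference x - x'. Each mixture component q contributes
    w_q * exp(-2 * pi^2 * tau^2 * v_q) * cos(2 * pi * tau * mu_q).
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v) * math.cos(2.0 * math.pi * tau * mu)
        for w, mu, v in zip(weights, freq_means, freq_variances)
    )


# Hypothetical two-component mixture (weights, frequency means, frequency variances).
k_at_lag = spectral_mixture_kernel(0.25, [0.7, 0.3], [1.0, 3.0], [0.1, 0.2])
```

With a single component centred at zero frequency, this reduces to a squared-exponential kernel, which is consistent with the abstract's remark that a standard GP behaves mostly as a smoother.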


2017 ◽  
Vol 34 (10) ◽  
pp. 1399-1414 ◽  
Author(s):  
Wanxia Deng ◽  
Huanxin Zou ◽  
Fang Guo ◽  
Lin Lei ◽  
Shilin Zhou ◽  
...  

2013 ◽  
Vol 141 (6) ◽  
pp. 1737-1760 ◽  
Author(s):  
Thomas Sondergaard ◽  
Pierre F. J. Lermusiaux

Abstract This work introduces and derives an efficient, data-driven assimilation scheme, focused on a time-dependent stochastic subspace that respects nonlinear dynamics and captures non-Gaussian statistics as it occurs. The motivation is to obtain a filter that is applicable to realistic geophysical applications, but that also rigorously utilizes the governing dynamical equations with information theory and learning theory for efficient Bayesian data assimilation. Building on the foundations of classical filters, the underlying theory and algorithmic implementation of the new filter are developed and derived. The stochastic Dynamically Orthogonal (DO) field equations and their adaptive stochastic subspace are employed to predict prior probabilities for the full dynamical state, effectively approximating the Fokker–Planck equation. At assimilation times, the DO realizations are fit to semiparametric Gaussian Mixture Models (GMMs) using the Expectation-Maximization algorithm and the Bayesian Information Criterion. Bayes’s law is then efficiently carried out analytically within the evolving stochastic subspace. The resulting GMM-DO filter is illustrated in a very simple example. Variations of the GMM-DO filter are also provided along with comparisons with related schemes.
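The analytical Bayesian step mentioned in the abstract can be illustrated with the standard update for a Gaussian mixture prior under a linear observation with Gaussian noise. This is a sketch in assumed notation, not the paper's own derivation: the symbols $H$, $R$, $K_j$, and the superscript $a$ (for the updated quantities) are introduced here for illustration, and the projection onto the stochastic subspace is omitted.

```latex
% Assumed setup: mixture prior and linear-Gaussian observation model
p(x) = \sum_{j=1}^{M} \pi_j \, \mathcal{N}(x;\, \mu_j, P_j),
\qquad
y = H x + v, \quad v \sim \mathcal{N}(0, R)

% Per-component update (same form as a Kalman update)
K_j = P_j H^{\top} \left( H P_j H^{\top} + R \right)^{-1}
\mu_j^{a} = \mu_j + K_j \left( y - H \mu_j \right)
P_j^{a} = \left( I - K_j H \right) P_j

% Re-weighting of the mixture components by their evidence
\pi_j^{a} = \frac{\pi_j \, \mathcal{N}\!\left(y;\, H \mu_j,\, H P_j H^{\top} + R\right)}
                 {\sum_{m=1}^{M} \pi_m \, \mathcal{N}\!\left(y;\, H \mu_m,\, H P_m H^{\top} + R\right)}
```

The posterior is again a Gaussian mixture with weights $\pi_j^{a}$, means $\mu_j^{a}$, and covariances $P_j^{a}$, which is why no sampling is needed for this step.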

