Application of differential evolution optimization based Gaussian Mixture Models to speaker recognition

Author(s):  
Hong Zhou ◽  
JianHua Zhang
2011 ◽  
Vol 474-476 ◽  
pp. 442-447
Author(s):  
Zhi Gao Zeng ◽  
Li Xin Ding ◽  
Sheng Qiu Yi ◽  
San You Zeng ◽  
Zi Hua Qiu

To improve the accuracy of image segmentation in video surveillance sequences, and to overcome the limitation of traditional clustering algorithms, which cannot accurately model image data sets that contain noise, the paper presents an automatic and accurate video image segmentation algorithm that uses Gaussian mixture models and the spatial properties of the image to segment it. Because the expectation-maximization algorithm is very sensitive to initial values and easily falls into local optima, the paper also presents a differential evolution-based parameter estimation method for Gaussian mixture models. The experimental results show that segmentation accuracy is greatly improved over traditional segmentation algorithms.
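As a rough illustration of the idea of replacing expectation-maximization with differential evolution, the sketch below fits a one-dimensional, two-component Gaussian mixture by minimizing the negative log-likelihood with a basic DE/rand/1/bin loop. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the parameter encoding, the bounds, the control settings `F` and `CR`, and the synthetic data.

```python
import math
import random

# Hypothetical search bounds: [logit(weight), mu1, mu2, log(sigma1), log(sigma2)]
BOUNDS = [(-5.0, 5.0), (-10.0, 10.0), (-10.0, 10.0), (-3.0, 3.0), (-3.0, 3.0)]


def gmm_neg_log_likelihood(params, data):
    """Negative log-likelihood of a 1-D, two-component Gaussian mixture."""
    logit_w, mu1, mu2, log_s1, log_s2 = params
    w = 1.0 / (1.0 + math.exp(-logit_w))        # mixture weight in (0, 1)
    s1, s2 = math.exp(log_s1), math.exp(log_s2)  # standard deviations > 0
    norm = math.sqrt(2.0 * math.pi)
    total = 0.0
    for x in data:
        p1 = w * math.exp(-0.5 * ((x - mu1) / s1) ** 2) / (s1 * norm)
        p2 = (1.0 - w) * math.exp(-0.5 * ((x - mu2) / s2) ** 2) / (s2 * norm)
        total -= math.log(p1 + p2 + 1e-300)      # small floor avoids log(0)
    return total


def differential_evolution(objective, bounds, pop_size=30, generations=100,
                           F=0.7, CR=0.9, seed=0):
    """Basic DE/rand/1/bin minimizer with greedy one-to-one selection."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    history = [min(cost)]                        # best cost per generation
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)          # guarantees one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)      # clip to the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:            # keep the trial only if not worse
                pop[i], cost[i] = trial, trial_cost
        history.append(min(cost))
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best], history


def make_two_cluster_data(n_per_cluster=100, seed=1):
    """Synthetic data: two Gaussian clusters centred at -3 and +3."""
    rng = random.Random(seed)
    return ([rng.gauss(-3.0, 1.0) for _ in range(n_per_cluster)]
            + [rng.gauss(3.0, 1.0) for _ in range(n_per_cluster)])


data = make_two_cluster_data()
best, best_cost, history = differential_evolution(
    lambda p: gmm_neg_log_likelihood(p, data), BOUNDS)
print("estimated means:", sorted([best[1], best[2]]))
print("negative log-likelihood:", best_cost)
```

Because selection is greedy, the best cost recorded in `history` never increases from one generation to the next. A population-based search of this kind does not depend on a single starting point the way an EM run does, which is the motivation the abstract gives.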


Speaker recognition is the identification of a person’s voice from among different voices; it selects the speech signal of an individual from those of other speakers. In this work, an efficient method for speaker recognition is presented that uses Discrete Wavelet Transform (DWT) features and Gaussian Mixture Models (GMM) for classification. The input speech signal is decomposed by the DWT into subband coefficients, which serve as the input features for classification. Classification is performed by a GMM classifier with 4, 8, 16 and 32 Gaussian components. Results show an accuracy of 96.18% on speaker signals using DWT features and the GMM classifier.
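The decomposition step can be sketched as follows. The abstract does not say which wavelet, how many levels, or which statistics of the coefficients are used, so the Haar wavelet, the three-level depth, and the log-energy feature per subband are all assumptions chosen to keep the example short; the function names are hypothetical as well.

```python
import math


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input with its last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def subband_log_energies(subbands):
    """One feature per subband: log of the mean squared coefficient."""
    return [math.log(sum(c * c for c in band) / len(band) + 1e-12)
            for band in subbands]


# Usage on a synthetic 64-sample frame (a stand-in for a real speech frame).
frame = [math.sin(2.0 * math.pi * 5.0 * t / 64.0) for t in range(64)]
features = subband_log_energies(haar_wavedec(frame, levels=3))
print(features)
```

In a pipeline like the one described, one such feature vector per frame would be the input to the per-speaker GMMs.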


Author(s):  
AMITA PAL ◽  
SMARAJIT BOSE ◽  
GOPAL K. BASAK ◽  
AMITAVA MUKHOPADHYAY

For solving speaker identification problems, the approach proposed by Reynolds [IEEE Signal Process. Lett. 2 (1995) 46–48], using Gaussian Mixture Models (GMMs) based on Mel Frequency Cepstral Coefficients (MFCCs) as features, is one of the most effective available in the literature. The use of GMMs for modeling speaker identity is motivated by the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and also by the capability of Gaussian mixtures to model arbitrary densities. In this work, we first illustrate, with the help of a new bilingual speech corpus, how the well-known principal component transformation, in conjunction with the principle of classifier combination, can be used to enhance the performance of MFCC-GMM speaker recognition systems significantly. Subsequently, we establish the same result rigorously using the benchmark speech corpus NTIMIT. A significant outcome of this work is that the proposed approach has the potential to enhance the performance of any speaker recognition system based on correlated features.
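For reference, the standard GMM formulation of speaker identification that this line of work builds on can be written as below. The notation ($\lambda_s$ for the model of speaker $s$, $M$ components, $T$ feature vectors $\mathbf{x}_t$, $S$ enrolled speakers) is chosen here for illustration and is not taken from the abstract; the decision rule assumes equal speaker priors and independent frames.

```latex
p(\mathbf{x} \mid \lambda_s)
  = \sum_{i=1}^{M} w_i^{(s)} \,
    \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_i^{(s)},\, \boldsymbol{\Sigma}_i^{(s)}\right),
  \qquad \sum_{i=1}^{M} w_i^{(s)} = 1,

\hat{s} = \arg\max_{1 \le s \le S} \; \sum_{t=1}^{T} \log p(\mathbf{x}_t \mid \lambda_s).
```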

