Application of differential evolution optimization based Gaussian Mixture Models to speaker recognition

To improve the accuracy of image segmentation in video surveillance sequences, and to overcome the limits of traditional clustering algorithms, which cannot accurately model image data sets that contain noisy data, the paper presents an automatic and accurate video image segmentation algorithm that segments the image with Gaussian mixture models according to its spatial properties. Because the expectation-maximization algorithm is very sensitive to initial values and easily falls into local optima, the paper also presents a differential evolution-based parameter estimation method for Gaussian mixture models. The experimental results show that segmentation accuracy is greatly improved over traditional segmentation algorithms.
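
The idea of replacing EM initialization-sensitive updates with a population-based search can be sketched on toy data. The following is a minimal illustration, not the paper's implementation: it assumes a two-component, one-dimensional mixture, synthetic data, and the common DE/rand/1/bin variant of differential evolution. The function names, bounds, and control parameters (`f`, `cr`, population size) are all hypothetical choices made for the example.

```python
import math
import random

def gmm_log_likelihood(params, data):
    """Log-likelihood of 1-D data under a two-component GMM.

    params = [w_raw, mu1, mu2, log_sigma1, log_sigma2]. The weight goes
    through a sigmoid and the sigmas through exp, so every candidate
    vector maps to a valid mixture (an assumption of this sketch).
    """
    w = 1.0 / (1.0 + math.exp(-params[0]))
    weights = (w, 1.0 - w)
    mus = (params[1], params[2])
    sigmas = (math.exp(params[3]), math.exp(params[4]))
    total = 0.0
    for x in data:
        p = 0.0
        for wk, mu, s in zip(weights, mus, sigmas):
            z = (x - mu) / s
            p += wk * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

def differential_evolution(objective, bounds, rng,
                           pop_size=20, generations=60, f=0.7, cr=0.9):
    """Maximize `objective` with DE/rand/1/bin and greedy selection."""
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    initial_best = max(fitness)
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # force at least one mutated gene
            trial = list(pop[i])
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    trial[j] = min(max(v, lo), hi)  # clip to the search box
            trial_fit = objective(trial)
            if trial_fit >= fitness[i]:  # keep the trial only if not worse
                pop[i], fitness[i] = trial, trial_fit
    best = max(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best], initial_best

# Synthetic data from two Gaussians (illustrative values only).
rng = random.Random(0)
data = ([rng.gauss(-2.0, 0.5) for _ in range(100)]
        + [rng.gauss(3.0, 1.0) for _ in range(100)])
bounds = [(-5.0, 5.0),
          (min(data), max(data)), (min(data), max(data)),
          (-3.0, 3.0), (-3.0, 3.0)]
best_params, best_ll, initial_best_ll = differential_evolution(
    lambda p: gmm_log_likelihood(p, data), bounds, rng)
print("best log-likelihood:", best_ll)
print("estimated means:", sorted(best_params[1:3]))
```

Because selection is greedy, the best log-likelihood in the population never decreases across generations. In practice the DE result could also be used as a starting point for EM rather than as a replacement for it.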

Automatic speaker recognition using Gaussian mixture models

1999 Information, Decision and Control. Data and Information Fusion Symposium, Signal Processing and Communications Symposium and Decision and Control Symposium. Proceedings (Cat. No.99EX251) ◽

10.1109/idc.1999.754201 ◽

1999 ◽

Cited By ~ 8

Author(s):

W.J.J. Roberts ◽

J.P. Willmore

Keyword(s):

Mixture Models ◽

Speaker Recognition ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Automatic Speaker Recognition

Speaker Recognition System Based on Wavelet Features and Gaussian Mixture Models

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a3069.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 5363-5367

Keyword(s):

Mixture Models ◽

Speaker Recognition ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Recognition System ◽

Gaussian Component ◽

Discrete Wavelet ◽

Signal Features ◽

Wavelet Features ◽

Gmm Classifier

Speaker recognition is the identification of a person’s voice among different voices; it selects the speech signals of individual speakers. This work presents an efficient method for speaker recognition that uses Discrete Wavelet Transform (DWT) features and Gaussian Mixture Models (GMM) for classification. The input speech signals are decomposed by the DWT into subband coefficients, and the DWT subband coefficient features are the input for classification. Classification is performed by a GMM classifier with 4, 8, 16 and 32 Gaussian components. The results show an accuracy of 96.18% on speaker signals using DWT features and the GMM classifier.
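
The feature extraction step can be illustrated with a small sketch. The abstract does not state which wavelet, how many decomposition levels, or which statistics of the subband coefficients are used, so the example below assumes an orthonormal Haar wavelet, three levels, and the log mean energy of each subband. All function names are hypothetical, and the synthetic frame stands in for real speech.

```python
import math

def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT on an even-length signal."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def haar_subbands(signal, levels):
    """Return [detail_1, ..., detail_levels, final_approximation]."""
    subbands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        subbands.append(detail)
    subbands.append(approx)
    return subbands

def subband_log_energies(subbands):
    """One feature per subband: log of the mean squared coefficient."""
    return [math.log(sum(c * c for c in band) / len(band) + 1e-12)
            for band in subbands]

# A synthetic 64-sample frame (two sinusoids) in place of a speech frame.
frame = [math.sin(2 * math.pi * 5 * n / 64)
         + 0.5 * math.sin(2 * math.pi * 20 * n / 64) for n in range(64)]
subbands = haar_subbands(frame, levels=3)
features = subband_log_energies(subbands)
print("subband lengths:", [len(b) for b in subbands])
print("feature vector:", features)
```

Feature vectors of this kind, computed per frame, would then be modeled with one GMM per speaker; a test utterance is assigned to the speaker whose model gives the highest likelihood. The number of Gaussian components (4, 8, 16 or 32 in the abstract) is a model-size setting chosen at training time.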

Employment of Subspace Gaussian Mixture Models in speaker recognition

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2015.7178811 ◽

2015 ◽

Cited By ~ 14

Author(s):

Petr Motlicek ◽

Subhadeep Dey ◽

Srikanth Madikeri ◽

Lukas Burget

Keyword(s):

Mixture Models ◽

Speaker Recognition ◽

Gaussian Mixture Models ◽

Gaussian Mixture

SPEAKER IDENTIFICATION BY AGGREGATING GAUSSIAN MIXTURE MODELS (GMMs) BASED ON UNCORRELATED MFCC-DERIVED FEATURES

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001414560060 ◽

2014 ◽

Vol 28 (04) ◽

pp. 1456006 ◽

Cited By ~ 2

Author(s):

AMITA PAL ◽

SMARAJIT BOSE ◽

GOPAL K. BASAK ◽

AMITAVA MUKHOPADHYAY

Keyword(s):

Mixture Models ◽

Speaker Recognition ◽

Speaker Identification ◽

Gaussian Mixture Models ◽

Principal Component ◽

Gaussian Mixture ◽

Recognition System ◽

Mel Frequency Cepstral Coefficients ◽

Speech Corpus ◽

Signal Process

For solving speaker identification problems, the approach proposed by Reynolds [IEEE Signal Process. Lett. 2 (1995) 46–48], using Gaussian Mixture Models (GMMs) based on Mel Frequency Cepstral Coefficients (MFCCs) as features, is one of the most effective available in the literature. The use of GMMs for modeling speaker identity is motivated by the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and also by the capability of Gaussian mixtures to model arbitrary densities. In this work, we first illustrate, with the help of a new bilingual speech corpus, how the well-known principal component transformation, in conjunction with the principle of classifier combination, can be used to significantly enhance the performance of MFCC-GMM speaker recognition systems. We then rigorously establish the same result using the benchmark speech corpus NTIMIT. A significant outcome of this work is that the proposed approach has the potential to enhance the performance of any speaker recognition system based on correlated features.
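
The decorrelating effect of the principal component transformation can be shown on a toy example. The sketch below is an assumption-laden illustration, not the authors' pipeline: it uses two-dimensional synthetic features in place of MFCC vectors, for which the principal axes follow from a closed-form rotation angle, and it omits the classifier combination step. Decorrelation is relevant because speaker GMMs are commonly trained with diagonal covariance matrices, which cannot represent correlation between feature dimensions.

```python
import math
import random

def covariance_2d(points):
    """Mean and population covariance terms of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), cxx, cyy, cxy

def pca_rotate_2d(points):
    """Center the data and rotate it onto its principal axes.

    For a 2x2 covariance matrix the rotation angle has the closed form
    theta = 0.5 * atan2(2 * cxy, cxx - cyy).
    """
    (mx, my), cxx, cyy, cxy = covariance_2d(points)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [((p[0] - mx) * c + (p[1] - my) * s,
             -(p[0] - mx) * s + (p[1] - my) * c) for p in points]

# Strongly correlated synthetic features (illustrative values only).
rng = random.Random(1)
raw = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    raw.append((x, 0.8 * x + 0.3 * rng.gauss(0.0, 1.0)))

rotated = pca_rotate_2d(raw)
raw_cxy = covariance_2d(raw)[3]
rotated_cxy = covariance_2d(rotated)[3]
print("covariance before rotation:", raw_cxy)
print("covariance after rotation:", rotated_cxy)
```

For real MFCC vectors with more dimensions, the same transformation would be obtained from an eigendecomposition of the full covariance matrix rather than a single rotation angle.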
