Speech Based Arithmetic Calculator Using Mel-Frequency Cepstral Coefficients and Gaussian Mixture Models

Author(s):  
Moula Husain ◽  
S. M. Meena ◽  
Manjunath K. Gonal
Author(s):  
AMITA PAL ◽  
SMARAJIT BOSE ◽  
GOPAL K. BASAK ◽  
AMITAVA MUKHOPADHYAY

For solving speaker identification problems, the approach proposed by Reynolds [IEEE Signal Process. Lett.2 (1995) 46–48], using Gaussian Mixture Models (GMMs) based on Mel Frequency Cepstral Coefficients (MFCCs) as features, is one of the most effective available in the literature. The use of GMMs for modeling speaker identity is motivated by the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and also by the capability of Gaussian mixtures to model arbitrary densities. In this work, we have initially illustrated, with the help of a new bilingual speech corpus, how the well-known principal component transformation, in conjunction with the principle of classifier combination can be used to enhance the performance of the MFCC-GMM speaker recognition systems significantly. Subsequently, we have emphatically and rigorously established the same using the benchmark speech corpus NTIMIT. A significant outcome of this work is that the proposed approach has the potential to enhance the performance of any speaker recognition system based on correlated features.


2012 ◽  
Vol 2012 ◽  
pp. 1-10 ◽  
Author(s):  
Hesam Farsaie Alaie ◽  
Chakib Tadj

We make use of information inside infant’s cry signal in order to identify the infant’s psychological condition. Gaussian mixture models (GMMs) are applied to distinguish between healthy full-term and premature infants, and those with specific medical problems available in our cry database. Cry pattern for each pathological condition is created by using adapted boosting mixture learning (BML) method to estimate mixture model parameters. In the first experiment, test results demonstrate that the introduced adapted BML method for learning of GMMs has a better performance than conventional EM-based reestimation algorithm as a reference system in multipathological classification task. This newborn cry-based diagnostic system (NCDS) extracted Mel-frequency cepstral coefficients (MFCCs) as a feature vector for cry patterns of newborn infants. In binary classification experiment, the system discriminated a test infant’s cry signal into one of two groups, namely, healthy and pathological based on MFCCs. The binary classifier achieved a true positive rate of 80.77% and a true negative rate of 86.96% which show the ability of the system to correctly identify healthy and diseased infants, respectively.


Sign in / Sign up

Export Citation Format

Share Document