Formant frequency estimation of high-pitched vowels using weighted linear prediction

2013 ◽  
Vol 134 (2) ◽  
pp. 1295-1313 ◽  
Author(s):  
Paavo Alku ◽  
Jouni Pohjalainen ◽  
Martti Vainio ◽  
Anne-Maria Laukkanen ◽  
Brad H. Story



2002 ◽  
Vol 16 (2) ◽  
pp. 147-171 ◽  
Author(s):  
Molly L. Erickson ◽  
Amy E. D'Alfonso


2020 ◽  
Vol 17 (1) ◽  
pp. 303-307
Author(s):  
S. Lalitha ◽  
Deepa Gupta

Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Prediction Coefficients (PLPCs) are widely used nonlinear speech features in most speaker identification, speaker recognition, and speech recognition techniques, as well as in emotion recognition. Since the 1980s, significant effort has gone into improving these features. Considerations such as the choice of frequency estimation approach, the design of filter banks, and the selection of features play a vital part in the robustness of models that employ them. This article presents an overview of MFCC and PLPC features for different speech applications. Aspects such as accuracy metrics, background environment, type of data, and feature size are examined and summarized with the corresponding key references. In addition, the advantages and shortcomings of these features are discussed. This background work will hopefully contribute a first step toward enhancing MFCC and PLPC features with respect to novelty, higher accuracy, and lower complexity.



2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Mousmita Sarma ◽  
Kandarpa Kumar Sarma

In spoken word recognition, one of the crucial steps is identifying the vowel phonemes. This paper describes an Artificial Neural Network (ANN) based algorithm developed for the segmentation and recognition of the vowel phonemes of the Assamese language from words containing those vowels. A Self-Organizing Map (SOM) trained with varying numbers of iterations is used to segment each word into its constituent phonemes. A Probabilistic Neural Network (PNN) trained with clean vowel phonemes is then used to recognize the vowel segment among the six different SOM-segmented phonemes. An important aspect of the proposed algorithm is that it validates the recognized vowel by checking its first formant frequency. The first formant frequency of each Assamese vowel is determined in advance by estimating the pole or formant locations from the linear prediction (LP) model of the vocal tract. The proposed algorithm shows high recognition performance in comparison to conventional Discrete Wavelet Transform (DWT) based segmentation.
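As an illustration of the formant step mentioned in this abstract, the following is a minimal sketch of estimating a formant from the poles of an LP model. It is not the authors' implementation: the function names, the autocorrelation method with the Levinson-Durbin recursion, the Durand-Kerner root finder, and the synthetic 700 Hz test resonance are all assumptions chosen to keep the example dependency-free.

```python
import cmath
import math


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns [1, a1, ..., ap] of A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a


def poly_roots(coeffs, iterations=200):
    """Durand-Kerner roots of a monic polynomial [1, c1, ..., cn]."""
    n = len(coeffs) - 1
    roots = [complex(0.4, 0.9) ** k for k in range(n)]
    for _ in range(iterations):
        for i in range(n):
            value = 0j
            for c in coeffs:                # Horner evaluation
                value = value * roots[i] + c
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots


def formants_from_lp(a, fs):
    """Map upper-half-plane poles of 1/A(z) to (frequency, bandwidth) in Hz."""
    out = []
    for z in poly_roots(a):
        if z.imag > 1e-9:                   # keep one pole per conjugate pair
            freq = cmath.phase(z) * fs / (2 * math.pi)
            bw = -math.log(abs(z)) * fs / math.pi
            out.append((freq, bw))
    return sorted(out)


def resonator_impulse_response(freq, bw, fs, n):
    """Impulse response of a two-pole resonator (hypothetical test signal)."""
    r = math.exp(-math.pi * bw / fs)
    theta = 2 * math.pi * freq / fs
    b1, b2 = 2 * r * math.cos(theta), -r * r
    y = []
    for i in range(n):
        x = 1.0 if i == 0 else 0.0
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y.append(x + b1 * y1 + b2 * y2)
    return y


fs = 8000                                    # assumed sampling rate in Hz
signal = resonator_impulse_response(700.0, 100.0, fs, 400)
lp = levinson_durbin(autocorr(signal, 2), 2)
print(formants_from_lp(lp, fs))
```

On this synthetic signal, the lowest pole frequency should come out close to the 700 Hz resonance that was put in. For real speech, one would typically window and pre-emphasize each frame and use a higher LP order, then compare the lowest plausible pole frequency with the expected first formant of the recognized vowel.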


