Comparative Study on the Performance of Mel-Frequency Cepstral Coefficients and Linear Prediction Cepstral Coefficients under different Speaker's Conditions

Birds are excellent environmental indicators and may reflect the sustainability of an ecosystem; they may also provide provisioning, regulating, and supporting services. Research related to birdlife conservation therefore receives considerable attention. Because birds are airborne and tropical forest is dense, identifying birds from audio may be a better solution than visual identification. The goal of this study is to find the cepstral features best suited to classifying bird sounds accurately. Sounds of fifteen (15) endemic Bornean birds were selected and segmented using an automated energy-based algorithm. Three (3) types of cepstral features were extracted: linear prediction cepstral coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and gammatone frequency cepstral coefficients (GTCC). Each feature type was used separately for classification with a support vector machine (SVM). A comparison of the prediction results shows that the model using GTCC features, with 93.3% accuracy, outperforms the models using MFCC and LPCC features. This demonstrates the robustness of GTCC for bird sound classification. The result is significant for the advancement of bird sound classification research, which has applications in areas such as eco-tourism and wildlife management.
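The abstract does not specify the energy-based segmentation algorithm, so the following is only a minimal sketch of one common approach: compute short-time energy per frame and keep runs of frames whose energy exceeds a fraction of the maximum. The function names, frame length, hop size, and relative threshold are all illustrative assumptions, not values taken from the study.

```python
import math

def frame_energies(signal, frame_len, hop):
    """Short-time energy (mean squared amplitude) of each full frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def segment_by_energy(signal, frame_len=400, hop=200, rel_threshold=0.1):
    """Return (start_sample, end_sample) pairs covering runs of frames whose
    energy exceeds rel_threshold * (maximum frame energy).

    All default values are hypothetical; the source does not state them.
    """
    energies = frame_energies(signal, frame_len, hop)
    if not energies:
        return []
    threshold = rel_threshold * max(energies)
    segments = []
    seg_start = None  # index of the first frame in the current active run
    for i, energy in enumerate(energies):
        active = energy > threshold
        if active and seg_start is None:
            seg_start = i
        elif not active and seg_start is not None:
            # The run ended at frame i - 1; convert frame indices to samples.
            segments.append((seg_start * hop, (i - 1) * hop + frame_len))
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start * hop, (len(energies) - 1) * hop + frame_len))
    return segments

# Toy input: silence, a 1 kHz tone, then silence, sampled at 8 kHz.
sample_rate = 8000
silence = [0.0] * 2000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(2000)]
toy_signal = silence + tone + silence
segments = segment_by_energy(toy_signal)
```

On this toy input the sketch returns a single segment that brackets the tone, slightly widened because frames overlapping the tone boundary still exceed the threshold. Real field recordings would likely need a noise-adaptive threshold and a minimum segment duration.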


Limited Data Speaker Verification: Fusion of Features

International Journal of Electrical and Computer Engineering (IJECE), 2017, Vol 7 (6), pp. 3344. DOI: 10.11591/ijece.v7i6.pp3344-3357

Author(s): T. R. Jayanthi Kumari, H. S. Jayanna

Keyword(s): Experimental Evaluation, Linear Prediction, Speaker Verification, Gaussian Mixture, Limited Data, Mel Frequency Cepstral Coefficients, Prediction Residual, Individual Features, Cepstral Coefficients, Verification Techniques

<p>The present work presents an experimental evaluation of speaker verification using different speech feature extraction techniques under the constraint of limited data (less than 15 seconds). State-of-the-art speaker verification techniques perform well when sufficient data (more than 1 minute) is available, but developing techniques that perform well under limited data conditions is challenging. In this work, several features are considered: Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC), Delta (Δ), Delta-Delta (ΔΔ), Linear Prediction Residual (LPR), and Linear Prediction Residual Phase (LPRP). The performance of the individual features is studied, and combinations of these features are then tried to improve verification performance. The Gaussian mixture model (GMM) and the GMM-universal background model (GMM-UBM) are compared through experimental evaluation. The experiments are conducted on the NIST-2003 database. The results show that combining features gives better performance than the individual features. Further, GMM-UBM modeling gives a lower equal error rate (EER) than GMM.</p>
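The equal error rate used above as the comparison metric is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it can be approximated from two lists of verification scores by sweeping a threshold. The function name and the example scores are hypothetical and are not results from the NIST-2003 experiments.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping the threshold over all observed scores.

    A trial is accepted when score >= threshold. The EER is reported as the
    mean of FAR and FRR at the threshold where the two rates are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # False acceptance rate: impostor trials that are wrongly accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials that are wrongly rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer

# Hypothetical scores for illustration only (higher means "more likely genuine").
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
example_eer = equal_error_rate(genuine, impostor)  # 0.25 for these toy scores
```

With only a handful of trials the estimate is coarse. Evaluations on larger trial lists typically interpolate between adjacent thresholds instead of picking the nearest one.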
