On the sensitivity of acoustic distance measures to different parameterizations of mel-frequency cepstral coefficients and temporal alignment algorithms

This paper investigates different feature extraction and feature normalization methods for speaker recognition. To represent acoustic speech signals well, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. To mitigate linear channel effects, Cepstral Mean-Variance Normalization (CMVN) and feature warping are then applied. A text-independent speaker identification system is evaluated using 16 coefficients from each of the MFCC and PNCC feature sets. Eight speakers (two female, six male) are selected from the GRID-Audiovisual database. The speakers are modeled by coupling a Universal Background Model with Gaussian Mixture Models (GMM-UBM) to obtain fast scoring and better performance. The system achieves 100% speaker identification accuracy. The results indicate that PNCC features perform better than MFCC features at identifying female speakers as compared with male speakers. Furthermore, feature warping performs better than CMVN.
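The CMVN step named in the abstract can be illustrated with a minimal sketch. It assumes per-utterance normalization (each coefficient is shifted to zero mean and scaled to unit variance across frames); the abstract does not state which variant the authors used, and the function name `cmvn` and the toy feature values are hypothetical.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean-variance normalization (assumed variant).

    frames: list of feature vectors, one list of floats per frame.
    Returns new vectors in which each coefficient has zero mean and
    (approximately) unit variance across the frames.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over all frames
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    # Population standard deviation of each coefficient over all frames
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n_frames)
        for k in range(n_coeffs)
    ]
    # eps guards against division by zero for constant coefficients
    return [
        [(f[k] - means[k]) / (stds[k] + eps) for k in range(n_coeffs)]
        for f in frames
    ]


# Toy input: 4 frames of 2 coefficients (hypothetical values, not from the paper)
features = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
normalized = cmvn(features)
```

Subtracting the mean removes a constant offset in the cepstral domain, which is how a fixed linear channel appears after the log operation. The variance scaling puts all coefficients on a comparable range.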


Emotion recognition in speech (Reconocimiento de emociones en el habla)

TecnoLógicas ◽

10.22430/22565337.256 ◽

2008 ◽

pp. 113

Author(s):

Julián D. Echeverry-Correa ◽

Mauricio Morales-Pérez

Keyword(s):

Emotional Speech ◽

Mel Frequency Cepstral Coefficients ◽

Cepstral Coefficients

This work presents a methodology for characterizing the speech signal, applied to the recognition of emotional states. Four primary emotions (joy, anger, surprise, and sadness) and a neutral state are studied. A time-domain analysis and an acoustic analysis using MFCCs (Mel Frequency Cepstral Coefficients) were carried out. The tests confirm the effectiveness of the methodology for emotion recognition, which surpasses the recognition achieved by a group of people. A recognition accuracy of 94.00% is obtained on the SES (Spanish Emotional Speech) database.
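As background for the MFCC analysis mentioned in this abstract, the following sketch shows one widely used form of the Hz-to-mel mapping that defines the mel filterbank spacing. The abstract does not say which mel formula the authors used, so the 2595/700 constants and the function names are assumptions made for illustration.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(low_hz, high_hz, n_points):
    """Return n_points frequencies (Hz) equally spaced on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_points - 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_points)]


# Hypothetical band edges: 10 points between 0 Hz and 8000 Hz
edges = mel_spaced_frequencies(0.0, 8000.0, 10)
```

The resulting frequencies are dense at low frequencies and sparse at high frequencies, which is the perceptually motivated warping that distinguishes MFCCs from plain cepstral coefficients.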


Audio Detection using Mel-frequency Cepstral Coefficients

10.1109/icrito51393.2021.9596443 ◽

2021 ◽

Author(s):

Uppu Jithendra ◽

Usha Mittal ◽

Priyanka Chawla

Keyword(s):

Mel Frequency Cepstral Coefficients ◽

Cepstral Coefficients


The Teager-Kaiser Energy Cepstral Coefficients as an Effective Structural Health Monitoring Tool

Applied Sciences ◽

10.3390/app9235064 ◽

2019 ◽

Vol 9 (23) ◽

pp. 5064 ◽

Cited By ~ 5

Author(s):

Marco Civera ◽

Matteo Ferraris ◽

Rosario Ceravolo ◽

Cecilia Surace ◽

Raimondo Betti

Keyword(s):

Experimental Data ◽

Structural Health Monitoring ◽

Health Monitoring ◽

Speech Processing ◽

Speaker Recognition ◽

Vibration Analysis ◽

Monitoring Tool ◽

Mel Frequency Cepstral Coefficients ◽

Structural Health ◽

Cepstral Coefficients

Recently, features and techniques from speech processing have gained increasing attention in the Structural Health Monitoring (SHM) community, in the context of vibration analysis. In particular, Cepstral Coefficients (CCs) have proved apt at distinguishing the response of a damaged structure from a given undamaged baseline. Previous works relied on the Mel-Frequency Cepstral Coefficients (MFCCs). This approach, while efficient and still very common in applications such as speech and speaker recognition, has been followed by other, more advanced and competitive techniques for the same aims. The Teager-Kaiser Energy Cepstral Coefficients (TECCs) are one of these alternatives. These features are very closely related to MFCCs but offer useful additional properties, such as improved robustness to noise. The goal of this paper is to introduce the use of TECCs for damage detection, highlighting their competitiveness with closely related features. Promising results were obtained from both numerical and experimental data.
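The energy measure that gives TECCs their name can be sketched with the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. This is only the operator itself, not the authors' full TECC pipeline, which the abstract does not describe; the function name and the test signal are hypothetical.

```python
import math


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy operator.

    psi[n] = x[n]**2 - x[n-1] * x[n+1], evaluated for interior samples,
    so the output has len(x) - 2 values.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal: a pure sinusoid A * cos(omega * n)
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_kaiser_energy(signal)

# For a pure sinusoid the operator is constant: A**2 * sin(omega)**2
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure sinusoid the output is constant and depends on both amplitude and frequency, unlike the squared amplitude alone. That property is the usual motivation for using it in place of a conventional power estimate.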


Modified mel-frequency cepstral coefficients (MMFCC) in robust text-dependent speaker identification

2017 4th International Conference on Advances in Electrical Engineering (ICAEE) ◽

10.1109/icaee.2017.8255408 ◽

2017 ◽

Cited By ~ 1

Author(s):

Md. Atiqul Islam

Keyword(s):

Speaker Identification ◽

Mel Frequency Cepstral Coefficients ◽

Cepstral Coefficients
