Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition

Author(s):  
Chunyan Liang ◽  
Xiang Zhang ◽  
Lin Yang ◽  
Jianping Zhang ◽  
Yonghong Yan
Author(s):  
Musab T. S. Al-Kaltakchi ◽  
Haithem Abd Al-Raheem Taha ◽  
Mohanad Abd Shehab ◽  
Mohamed A.M. Abdullah

In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. To give a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate the effect of the linear channel, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The paper investigates a text-independent speaker identification system using 16 coefficients from both the MFCC and PNCC features. Eight speakers, two female and six male, are selected from the GRID-Audiovisual database. The speakers are modeled by coupling the Universal Background Model with Gaussian Mixture Models (GMM-UBM) to obtain fast scoring and better performance. The system achieves 100% speaker identification accuracy. The results show that PNCC features outperform MFCC features when identifying female speakers as compared with male speakers. Furthermore, feature warping performs better than the CMVN method.
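The CMVN step mentioned in this abstract can be illustrated with a minimal sketch. A linear channel becomes an additive offset in the cepstral domain, so subtracting the per-utterance mean removes it, and dividing by the standard deviation equalizes the scale of each coefficient. The function name `cmvn`, the `eps` guard, and the toy feature values are assumptions for illustration and are not taken from the paper.

```python
import math

def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean-variance normalization.

    frames: list of frames, each a list of cepstral coefficients.
    Returns frames where every coefficient has zero mean and unit
    variance across the utterance. eps (an assumed guard) avoids
    division by zero for constant coefficients.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[j] for f in frames) / n_frames for j in range(n_coeffs)]
    stds = [
        math.sqrt(sum((f[j] - means[j]) ** 2 for f in frames) / n_frames)
        for j in range(n_coeffs)
    ]
    return [
        [(f[j] - means[j]) / (stds[j] + eps) for j in range(n_coeffs)]
        for f in frames
    ]

# Toy utterance (hypothetical values): 4 frames of 2 coefficients.
features = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
normalized = cmvn(features)

# The same utterance seen through a channel that adds a constant
# cepstral offset; CMVN should map it to the same normalized features.
shifted = [[a + 5.0, b - 3.0] for a, b in features]
normalized_shifted = cmvn(shifted)
```

Feature warping, the alternative the abstract reports as performing better, instead maps each coefficient's distribution over a sliding window to a target distribution; it is not shown here.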


2019 ◽  
Vol 9 (23) ◽  
pp. 5064 ◽  
Author(s):  
Marco Civera ◽  
Matteo Ferraris ◽  
Rosario Ceravolo ◽  
Cecilia Surace ◽  
Raimondo Betti

Recently, features and techniques from speech processing have started to gain increasing attention in the Structural Health Monitoring (SHM) community, in the context of vibration analysis. In particular, Cepstral Coefficients (CCs) have proved apt at distinguishing the response of a damaged structure from a given undamaged baseline. Previous works relied on the Mel-Frequency Cepstral Coefficients (MFCCs). This approach, while efficient and still very common in applications such as speech and speaker recognition, has been followed by other more advanced and competitive techniques for the same aims. The Teager-Kaiser Energy Cepstral Coefficients (TECCs) are one of these alternatives. These features are very closely related to MFCCs, but provide useful additional properties, such as improved robustness with respect to noise. The goal of this paper is to introduce the use of TECCs for damage detection purposes, by highlighting their competitiveness with closely related features. Promising results from both numerical and experimental data were obtained.
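As a rough illustration of the energy measure that gives TECCs their name, the discrete Teager-Kaiser energy operator is commonly defined as Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below shows only this operator, not the paper's full TECC pipeline, whose details are not given in the abstract; the function name and test signal are assumptions. For a pure sinusoid A·cos(ωn) the operator returns the constant A²·sin²(ω), so it tracks both amplitude and frequency.

```python
import math

def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1] * x[n+1].

    The output has two fewer samples than the input because the
    operator needs one neighbor on each side.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Hypothetical test signal: a sinusoid with amplitude 2 and
# angular frequency 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_kaiser_energy(signal)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = amplitude ** 2 * math.sin(omega) ** 2
```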


2019 ◽  
Vol 17 (2) ◽  
pp. 170-177
Author(s):  
Lei Deng ◽  
Yong Gao

In this paper, the authors propose an auditory feature extraction algorithm to improve the performance of speaker recognition systems in noisy environments. In this algorithm, the Gammachirp filter bank is adopted to simulate the auditory model of the human cochlea. In addition, the following three techniques are applied: the cube-root compression method, the Relative Spectral Filtering Technique (RASTA), and the Cepstral Mean and Variance Normalization algorithm (CMVN). Subsequently, a simulated experiment was conducted based on the Gaussian Mixture Model-Universal Background Model (GMM-UBM) framework. The experimental results indicate that speaker recognition systems using the new auditory feature have better robustness and recognition performance than those using Mel-Frequency Cepstral Coefficients (MFCC), Relative Spectral-Perceptual Linear Predictive (RASTA-PLP), Cochlear Filter Cepstral Coefficients (CFCC), and Gammatone Frequency Cepstral Coefficients (GFCC).
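The cube-root compression named in this abstract can be contrasted with the logarithmic compression used in conventional MFCCs through a small sketch. The helper names, the `floor` value, and the band energies are hypothetical; the abstract does not say where in the pipeline the compression is applied. The point illustrated is that the cube root stays bounded as band energy approaches zero, whereas the logarithm grows without bound and needs a floor.

```python
import math

def cube_root_compress(energies):
    """Power-law compression with exponent 1/3."""
    return [e ** (1.0 / 3.0) for e in energies]

def log_compress(energies, floor=1e-12):
    """Log compression; floor (an assumed value) prevents log(0)."""
    return [math.log(max(e, floor)) for e in energies]

# Hypothetical filter-bank energies, including a silent band.
band_energies = [0.0, 1e-6, 1.0, 8.0, 1000.0]
cube = cube_root_compress(band_energies)
logc = log_compress(band_energies)

# The silent band maps to 0 under the cube root, but to a large
# negative value under the log that depends on the chosen floor.
```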


2016 ◽  
Author(s):  
Víctor Poblete ◽  
Juan Pablo Escudero ◽  
Josué Fredes ◽  
José Novoa ◽  
Richard M. Stern ◽  
...  

2015 ◽  
Vol 2015 ◽  
pp. 1-21 ◽  
Author(s):  
Surendra Thakur ◽  
Emmanuel Adetiba ◽  
Oludayo O. Olugbara ◽  
Richard Millham

We propose a secure mobile Internet voting architecture based on the Sensus reference architecture and report experiments carried out using short-term spectral features to realize the voice-biometric authentication module of the proposed architecture. The short-term spectral features investigated are Mel-Frequency Cepstral Coefficients (MFCCs), Mel-Frequency Discrete Wavelet Coefficients (MFDWC), Linear Predictive Cepstral Coefficients (LPCC), and Spectral Histogram of Oriented Gradients (SHOGs). The MFCC, MFDWC, and LPCC usually have higher dimensions, which often leads to high computational complexity in the pattern matching algorithms of automatic speaker recognition systems. In this study, the higher dimensions of each of the short-term features were reduced to an 81-element feature vector per speaker using the Histogram of Oriented Gradients (HOG) algorithm, while a neural network ensemble was utilized as the pattern matching algorithm. Out of the four short-term spectral features investigated, the LPCC-HOG gave the best statistical results, with an R statistic of 0.9127 and a mean square error of 0.0407. These compact LPCC-HOG features are highly promising for implementing the authentication module of the secure mobile Internet voting architecture we are proposing in this paper.

