An Empirical Study of Speaker Identification System for Mono and Traverse Linguistic Background Using EM and SMEM

Author(s):  
Amita Dev ◽  
Shweta A. Bansal ◽  
Shyam S. Agrawal
2020 ◽  
pp. 65-72
Author(s):  
V. V. Savchenko ◽  
A. V. Savchenko

This paper is devoted to the presence of distortions in a speech signal transmitted over a communication channel to a biometric system during voice-based remote identification. We propose to preliminary correct the frequency spectrum of the received signal based on the pre-distortion principle. Taking into account a priori uncertainty, a new information indicator of speech signal distortions and a method for measuring it in conditions of small samples of observations are proposed. An example of fast practical implementation of the method based on a parametric spectral analysis algorithm is considered. Experimental results of our approach are provided for three different versions of communication channel. It is shown that the usage of the proposed method makes it possible to transform the initially distorted speech signal into compliance on the registered voice template by using acceptable information discrimination criterion. It is demonstrated that our approach may be used in existing biometric systems and technologies of speaker identification.


Author(s):  
A. Nagesh

The feature vectors of speaker identification system plays a crucial role in the overall performance of the system. There are many new feature vectors extraction methods based on MFCC, but ultimately we want to maximize the performance of SID system.  The objective of this paper to derive Gammatone Frequency Cepstral Coefficients (GFCC) based a new set of feature vectors using Gaussian Mixer model (GMM) for speaker identification. The MFCC are the default feature vectors for speaker recognition, but they are not very robust at the presence of additive noise. The GFCC features in recent studies have shown very good robustness against noise and acoustic change. The main idea is  GFCC features based on GMM feature extraction is to improve the overall speaker identification performance in low signal to noise ratio (SNR) conditions.


Author(s):  
Suliman S. Al-Dahri ◽  
Youssaf H. Al-Jassar ◽  
Yousef A. Alotaibi ◽  
Mansour M. Alsulaiman ◽  
Khondaker Abdullah-Al-Mamun

The performance of Mel scale and Bark scale is evaluated for text-independent speaker identification system. Mel scale and Bark scale are designed according to human auditory system. The filter bank structure is defined using Mel and Bark scales for speech and speaker recognition systems to extract speaker specific speech features. In this work, performance of Mel scale and Bark scale is evaluated for text-independent speaker identification system. It is found that Bark scale centre frequencies are more effective than Mel scale centre frequencies in case of Indian dialect speaker databases. Mel scale is defined as per interpretation of pitch by human ear and Bark scale is based on critical band selectivity at which loudness becomes significantly different. The recognition rate achieved using Bark scale filter bank is 96% for AISSMSIOIT database and 95% for Marathi database.


Author(s):  
Musab T. S. Al-Kaltakchi ◽  
Haithem Abd Al-Raheem Taha ◽  
Mohanad Abd Shehab ◽  
Mohamed A.M. Abdullah

<p><span lang="EN-GB">In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. With a view to give a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate the effect of linear channel, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The current paper investigates Text-independent speaker identification system by using 16 coefficients from both the MFCCs and PNCCs features. Eight different speakers are selected from the GRID-Audiovisual database with two females and six males. The speakers are modeled using the coupling between the Universal Background Model and Gaussian Mixture Models (GMM-UBM) in order to get a fast scoring technique and better performance. The system shows 100% in terms of speaker identification accuracy. The results illustrated that PNCCs features have better performance compared to the MFCCs features to identify females compared to male speakers. Furthermore, feature wrapping reported better performance compared to the CMVN method. </span></p>


Sign in / Sign up

Export Citation Format

Share Document