An Empirical Study of Speaker Identification System for Mono and Traverse Linguistic Background Using EM and SMEM

This paper addresses distortions in a speech signal transmitted over a communication channel to a biometric system during voice-based remote identification. We propose to first correct the frequency spectrum of the received signal based on the pre-distortion principle. To account for a priori uncertainty, we propose a new information indicator of speech signal distortions and a method for measuring it from small samples of observations. An example of a fast practical implementation of the method, based on a parametric spectral analysis algorithm, is considered. Experimental results are provided for three different versions of the communication channel. The proposed method makes it possible to bring the initially distorted speech signal into compliance with the registered voice template under an acceptable information discrimination criterion. The approach may be used in existing biometric systems and speaker identification technologies.

New Feature Vectors using GFCC for Speaker Identification

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i8.146 ◽

2018 ◽

Vol 6 (8) ◽

pp. 243

Author(s):

A. Nagesh

Keyword(s):

Speaker Recognition ◽

Speaker Identification ◽

Signal To Noise Ratio ◽

Main Idea ◽

Extraction Methods ◽

Identification System ◽

Identification Performance ◽

Feature Vectors ◽

Overall Performance ◽

New Feature

The feature vectors of a speaker identification (SID) system play a crucial role in the overall performance of the system. There are many new feature-vector extraction methods based on MFCC, but the ultimate goal is to maximize the performance of the SID system. The objective of this paper is to derive a new set of feature vectors based on Gammatone Frequency Cepstral Coefficients (GFCC), using a Gaussian Mixture Model (GMM), for speaker identification. MFCC are the default feature vectors for speaker recognition, but they are not very robust in the presence of additive noise. In recent studies, GFCC features have shown very good robustness against noise and acoustic change. The main idea is to use GMM-based GFCC feature extraction to improve overall speaker identification performance in low signal-to-noise ratio (SNR) conditions.
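As a rough illustration of how a GMM-based identification step can work, the sketch below scores an utterance against one diagonal-covariance GMM per speaker and picks the speaker with the highest average log-likelihood. It is a minimal sketch, not the paper's implementation: the model parameters, speaker names, and two-dimensional toy features are hypothetical stand-ins for trained models over GFCC vectors, and the GFCC extraction itself is not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def identify(frames, speaker_models):
    """Closed-set identification: return the best-scoring speaker name."""
    return max(
        speaker_models,
        key=lambda name: gmm_avg_log_likelihood(frames, speaker_models[name]),
    )

# Hypothetical toy models (assumption: real models would be trained with EM).
speaker_models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [7.0, 7.0], [1.0, 1.0])],
}
test_frames = [[0.1, -0.2], [1.9, 2.1], [0.3, 0.4]]
best_speaker = identify(test_frames, speaker_models)
```

Averaging over frames keeps scores comparable across utterances of different lengths; it does not change which speaker wins for a single utterance.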

An End-to-End Text-Independent Speaker Identification System on Short Utterances

10.21437/interspeech.2018-1058 ◽

2018 ◽

Cited By ~ 2

Author(s):

Ruifang Ji ◽

Xinyuan Cai ◽

Xu Bo

Keyword(s):

Speaker Identification ◽

Identification System ◽

End To End ◽

Short Utterances

Short Utterance Based Speaker Identification System For Resource Constrained Devices

2018 2nd International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE) ◽

10.1109/icmete.2018.00061 ◽

2018 ◽

Author(s):

Sanghamitra V. Arora ◽

Rekha Vig

Keyword(s):

Speaker Identification ◽

Identification System ◽

Resource Constrained ◽

Resource Constrained Devices ◽

Short Utterance ◽

Constrained Devices

Robust speaker identification system using multi-band dominant features with empirical mode decomposition

2007 10th International Conference on Computer and Information Technology ◽

10.1109/iccitechn.2007.4579395 ◽

2007 ◽

Author(s):

Md. Khademul Islam Molla ◽

Keikichi Hirose ◽

Nobuaki Minematsu

Keyword(s):

Empirical Mode Decomposition ◽

Speaker Identification ◽

Identification System ◽

Mode Decomposition ◽

Robust Speaker Identification ◽

Multi Band

A Word-Dependent Automatic Arabic Speaker Identification System

2008 IEEE International Symposium on Signal Processing and Information Technology ◽

10.1109/isspit.2008.4775669 ◽

2008 ◽

Cited By ~ 9

Author(s):

Suliman S. Al-Dahri ◽

Youssaf H. Al-Jassar ◽

Yousef A. Alotaibi ◽

Mansour M. Alsulaiman ◽

Khondaker Abdullah-Al-Mamun

Keyword(s):

Speaker Identification ◽

Identification System ◽

Arabic Speaker

Improving speaker identification system using discrete wavelet transform and AWGN

2014 IEEE 5th International Conference on Software Engineering and Service Science ◽

10.1109/icsess.2014.6933775 ◽

2014 ◽

Cited By ~ 3

Author(s):

Heba Maged ◽

Ahmed AbouEl-Farag ◽

Saleh Mesbah

Keyword(s):

Wavelet Transform ◽

Discrete Wavelet Transform ◽

Speaker Identification ◽

Identification System ◽

Discrete Wavelet

Multimodal biometric identification system for mobile robots combining human metrology to face recognition and speaker identification

The 23rd IEEE International Symposium on Robot and Human Interactive Communication ◽

10.1109/roman.2014.6926273 ◽

2014 ◽

Cited By ~ 1

Author(s):

Simon Ouellet ◽

Francois Grondin ◽

Francis Leconte ◽

Francois Michaud

Keyword(s):

Face Recognition ◽

Mobile Robots ◽

Speaker Identification ◽

Biometric Identification ◽

Identification System

Performance Evaluation of Mel and Bark Scale based Features for Text-Independent Speaker Identification

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1999.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 3734-3738

Keyword(s):

Speaker Recognition ◽

Filter Bank ◽

Work Performance ◽

Speaker Identification ◽

Recognition Rate ◽

Identification System ◽

Bank Structure ◽

Speech Features ◽

Recognition Systems ◽

Human Ear

The performance of the Mel scale and the Bark scale is evaluated for a text-independent speaker identification system. Both scales are designed according to the human auditory system: the Mel scale is defined by how the human ear interprets pitch, and the Bark scale is based on critical-band selectivity, at which loudness becomes significantly different. In speech and speaker recognition systems, the filter bank structure is defined using the Mel and Bark scales to extract speaker-specific speech features. Bark scale centre frequencies are found to be more effective than Mel scale centre frequencies for Indian dialect speaker databases. The recognition rate achieved using the Bark scale filter bank is 96% for the AISSMSIOIT database and 95% for the Marathi database.
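To make the difference between the two scales concrete, the sketch below computes filter-bank centre frequencies spaced evenly on each scale. The abstract does not say which formulas were used, so this assumes the common 2595·log10(1 + f/700) Mel formula and Traunmüller's Bark approximation (without end corrections), chosen here because it has a closed-form inverse. The filter count and frequency range are illustrative only.

```python
import math

def hz_to_mel(f):
    """Common Mel-scale formula (assumed, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_bark(f):
    """Traunmueller's Bark approximation, without end corrections (assumed)."""
    return 26.81 * f / (1960.0 + f) - 0.53

def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark.
    return 1960.0 * (z + 0.53) / (26.28 - z)

def centre_frequencies(n_filters, f_low, f_high, to_scale, from_scale):
    """Centre frequencies (Hz) spaced evenly on the given perceptual scale."""
    lo, hi = to_scale(f_low), to_scale(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + step * (k + 1)) for k in range(n_filters)]

# Illustrative settings: 20 filters between 0 Hz and 8 kHz.
mel_centres = centre_frequencies(20, 0.0, 8000.0, hz_to_mel, mel_to_hz)
bark_centres = centre_frequencies(20, 0.0, 8000.0, hz_to_bark, bark_to_hz)
```

Printing the two lists side by side shows where each scale concentrates its filters, which is the property a filter-bank comparison of this kind depends on.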

Comparison of feature extraction and normalization methods for speaker recognition using grid-audiovisual database

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i2.pp782-789 ◽

2020 ◽

Vol 18 (2) ◽

pp. 782

Author(s):

Musab T. S. Al-Kaltakchi ◽

Haithem Abd Al-Raheem Taha ◽

Mohanad Abd Shehab ◽

Mohamed A.M. Abdullah

Keyword(s):

Feature Extraction ◽

Speaker Recognition ◽

Speaker Identification ◽

Gaussian Mixture ◽

Identification Accuracy ◽

Identification System ◽

Good Representation ◽

Mel Frequency Cepstral Coefficients ◽

Normalization Methods ◽

Cepstral Coefficients

In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. To give a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate the effect of the linear channel, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The paper investigates a text-independent speaker identification system using 16 coefficients from both the MFCC and PNCC features. Eight speakers, two female and six male, are selected from the GRID-Audiovisual database. The speakers are modeled by coupling the Universal Background Model with Gaussian Mixture Models (GMM-UBM) in order to obtain a fast scoring technique and better performance. The system achieves 100% speaker identification accuracy. The results show that PNCC features perform better than MFCC features at identifying female speakers compared to male speakers. Furthermore, feature warping performed better than the CMVN method.
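The two normalization methods can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: CMVN is applied over the whole utterance, and the warping function ranks each coefficient over the whole utterance, whereas feature warping is usually applied over a sliding window of a few seconds. The function names and the toy frames are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev

def cmvn(frames, eps=1e-8):
    """Per-coefficient mean and variance normalization over an utterance.

    frames is a list of frames, each a list of cepstral coefficients.
    """
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]

def warp_features(frames):
    """Simplified feature warping: map each coefficient's rank within the
    utterance onto the matching quantile of a standard normal distribution."""
    n = len(frames)
    std_normal = NormalDist()
    warped = [[0.0] * len(frames[0]) for _ in range(n)]
    for j, col in enumerate(zip(*frames)):
        order = sorted(range(n), key=lambda i: col[i])
        for rank, i in enumerate(order):
            warped[i][j] = std_normal.inv_cdf((rank + 0.5) / n)
    return warped

# Toy input: four frames of two coefficients each (illustrative values).
toy_frames = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [4.0, 40.0]]
normalized = cmvn(toy_frames)
warped = warp_features(toy_frames)
```

CMVN only fixes the first two moments of each coefficient, while warping reshapes the whole distribution toward a Gaussian; that difference is one reason the two methods can behave differently under channel mismatch.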
