Automatic Speaker Recognition by Speech Signal

This paper considers an extension of the feature vector for automatic speaker recognition. The baseline feature vector consisted of 18 mel-frequency cepstral coefficients (MFCCs). It was extended with two additional features derived from the spectrum of the speech signal. The motivating idea is that the efficiency of automatic speaker recognition can be increased by constructing a feature vector that tracks the real perceived spectrum in the observed speech. The additional features are based on the energy maxima in appropriate frequency ranges of the observed speech frames. In the experiments, accuracy and equal error rate (EER) are compared between feature vectors containing only the 18 MFCCs and feature vectors that also include the additional features. Recognition accuracy increased by around 3%. The EER values differ less, but the results show that adding the proposed features produced a lower decision threshold. These results indicate that tracking real occurrences in the spectrum of the speech signal leads to a more efficient automatic speaker recognizer. Determining features that track real occurrences in the speech spectrum will improve automatic speaker recognition and make it possible to avoid complex models.
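The abstract compares systems by equal error rate and by the decision threshold at which it occurs. As a minimal sketch of how those two quantities are usually obtained from verification scores, the function below sweeps candidate thresholds and returns the point where the false acceptance and false rejection rates are closest. The function name, the score convention (higher means more likely the target speaker) and the sample scores are assumptions made for illustration; they are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for similarity scores.

    Assumption: a higher score means the trial is more likely to be
    the target speaker, and a trial is accepted when score >= threshold.
    """
    best = None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With only a handful of scores the two error curves may not cross exactly, so the sketch reports the average of the two rates at the closest point. Evaluations on real data often interpolate between thresholds instead.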


Performance of speaker recognition system using shifted MFCC, delta spectral cepstral coefficient (DSCC) and fuzzy techniques

International Journal of Engineering & Technology, 2018, Vol 7 (2.8), pp. 278. DOI: 10.14419/ijet.v7i2.8.10424

Author(s): Priyanka Bansal, Syed Akhtar Imam

Keyword(s): Real World, Speaker Recognition, Speech Signal, Background Noise, Recognition System, Fuzzy Modeling, Crosstalk Noise, Noise Interference, Automatic Speaker Recognition, Fuzzy Techniques

Speech and speaker recognition systems are biometrics-inspired systems with scope in various online and offline applications. In the biometric setting, we consider the variability of the speech signal caused by noise, which greatly degrades the efficiency of Automatic Speaker Recognition (ASR) in real-world conditions. Real-world speech is degraded by different types of noise, such as background noise, interference noise and crosstalk noise. In this paper, we use Delta Spectrum Cepstrum Coefficient (DSCC) and shifted MFCC features with fuzzy modeling techniques to improve the performance of ASR even in noisy surroundings, with the help of enhanced speech information present at high frequencies in the spectral domain. The combination of fuzzy modeling and DSCC yields a combined algorithm with reasonably high robustness to noise. Experimental results show that accuracy improved by 10-20% even at 5-8 dB SNR in the presence of background noise, turbulent environmental conditions or white noise. Thus, the proposed model offers an improved maturity level compared with older methods.
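The reported results are stated at 5-8 dB SNR with white noise. As a rough sketch of what such a test condition means, the code below scales Gaussian white noise so that the mixture has a chosen signal-to-noise ratio and then measures the result. The abstract does not say how the noisy test data were produced, so the function names, the synthetic sine "signal" and the fixed random seed are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(signal, target_snr_db, seed=0):
    """Return signal plus Gaussian white noise at the requested SNR (dB)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # SNR(dB) = 10 * log10(signal_power / noise_power), so solve for the
    # noise power that gives the target and rescale the noise to match it.
    wanted_noise_power = signal_power / (10 ** (target_snr_db / 10))
    scale = math.sqrt(wanted_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measure_snr_db(clean, noisy):
    """Measure the SNR in dB, treating (noisy - clean) as the noise."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a speech frame: a 200 Hz tone sampled at 8 kHz.
clean = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1000)]
noisy = add_white_noise(clean, target_snr_db=5.0)
print(f"measured SNR: {measure_snr_db(clean, noisy):.2f} dB")
```

Scaling the noise against its own measured power, rather than its nominal variance, makes the achieved SNR match the target for the specific samples drawn.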
