Automatic Speaker Recognition System

1984 ◽  
Author(s):  
Alan Higgins ◽  
Joe Naylor
Author(s):  
Tumisho Billson Mokgonyane ◽  
Tshephisho Joseph Sefara ◽  
Thipe Isaiah Modipa ◽  
Madimetja Jonas Manamela

Author(s):  
Satyanand Singh

This paper proposes a state-of-the-art automatic speaker recognition (ASR) system that uses Bayesian distance metric learning as a feature extractor. In this model, I explore constraints on the distance between modified and simplified i-vector pairs from the same speaker and from different speakers. The distance metric is approximated by a weighted covariance matrix built from the top eigenvectors of the covariance matrix, and this approximation is used to estimate the posterior distribution of the distance metric. Given a speaker label, I select the data pair from different speakers with the highest cosine score to form a set of speaker constraints. This set captures the most discriminative variability between speakers in the training data. The Bayesian distance learning approach achieves better performance than the most advanced methods. Furthermore, unlike cosine scoring, it is insensitive to normalization, and it is very effective when training data is limited. The modified supervised i-vector based ASR system is evaluated on the NIST SRE 2008 database. The best combined cosine score performance, an EER of 1.767%, is obtained using LDA200 + NCA200 + LDA200, and the best Bayes_dml performance, an EER of 1.775%, is obtained using LDA200 + NCA200 + LDA100. Bayes_dml outperforms the combined norm of cosine scores and gives the best reported result for the short2-short3 condition on NIST SRE 2008 data.
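The constraint-selection step described in this abstract (for each speaker label, pick the different-speaker pair with the highest cosine score) can be illustrated with a minimal sketch. This is not the author's implementation: the function names `cosine_scores` and `hardest_impostor_pairs`, the use of NumPy, and the toy two-dimensional vectors standing in for i-vectors are all assumptions made for illustration.

```python
import numpy as np

def cosine_scores(ivectors):
    """Pairwise cosine similarity between the rows of `ivectors`."""
    norms = np.linalg.norm(ivectors, axis=1, keepdims=True)
    unit = ivectors / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def hardest_impostor_pairs(ivectors, speaker_ids):
    """For each speaker label, find the (own_index, other_index, score) pair
    with the highest cosine score, where other_index belongs to a different
    speaker. Hypothetical helper, not taken from the paper."""
    speaker_ids = np.asarray(speaker_ids)
    scores = cosine_scores(np.asarray(ivectors, dtype=float))
    pairs = {}
    for spk in np.unique(speaker_ids):
        own = np.flatnonzero(speaker_ids == spk)
        other = np.flatnonzero(speaker_ids != spk)
        if other.size == 0:
            continue  # only one speaker present, no impostor pair exists
        block = scores[np.ix_(own, other)]
        r, c = np.unravel_index(np.argmax(block), block.shape)
        pairs[spk.item()] = (int(own[r]), int(other[c]), float(block[r, c]))
    return pairs

# Toy data: two utterances each for speakers "a" and "b".
toy_ivectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["a", "a", "b", "b"]
constraints = hardest_impostor_pairs(toy_ivectors, toy_labels)
```

Each selected pair is the most confusable different-speaker pair for that label, which is the kind of pair the abstract says carries the most discriminative variability.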


Author(s):  
P. Chakraborty ◽  
F. Ahmed ◽  
Md. Monirul Kabir ◽  
Md. Shahjahan ◽  
Kazuyuki Murase

Author(s):  
Fadwa Abakarim ◽  
Abdenbi Abenaou

In this research, we present an automatic speaker recognition system based on adaptive orthogonal transformations. To obtain informative features of minimum dimension from the input signals, we created an adaptive operator that identifies the speaker's voice quickly and efficiently. We test the efficiency and performance of our method by comparing it with another approach, mel-frequency cepstral coefficients (MFCCs), which is widely used for feature extraction. The experimental results show the importance of creating the adaptive operator, which gives added value to the proposed approach. The system achieved 96.8% accuracy using the Fourier transform as a compression method and 98.1% using correlation as a compression method.
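The abstract does not specify the adaptive operator, so the following is only a rough sketch of the general idea of using a Fourier transform to compress a signal into a low-dimensional feature vector. It keeps the first few magnitude coefficients of each windowed frame and averages them over frames. The function name `fourier_compressed_features`, the frame length, hop size, and number of retained coefficients are all assumptions for illustration, not values from the paper.

```python
import numpy as np

def fourier_compressed_features(signal, frame_len=256, hop=128, n_keep=32):
    """Average low-order DFT magnitudes over frames (illustrative only;
    this is not the authors' adaptive orthogonal operator)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    # Keep only the first n_keep coefficients as the compressed feature.
    return spectrum[:, :n_keep].mean(axis=0)

# Toy input: a 500 Hz tone sampled at 8 kHz for one second.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500 * t)
features = fourier_compressed_features(tone)
```

With a 256-sample frame at 8 kHz each bin spans 31.25 Hz, so the 500 Hz tone falls in bin 16 of the 32 retained coefficients.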


2002 ◽  
Author(s):  
Viresh Moonasar

The use of speaker recognition technology in interactive voice response and electronic commerce systems has been limited. This is due to a lack of research attention and published results compared with other areas of speech recognition technology.

