POLYCOST: A telephone-speech database for speaker recognition

2000 ◽  
Vol 31 (2-3) ◽  
pp. 265-270 ◽  
Author(s):  
J Hennebert ◽  
H Melin ◽  
D Petrovska ◽  
D Genoud
2017 ◽  
Vol 9 (3) ◽  
pp. 53 ◽  
Author(s):  
Pardeep Sangwan ◽  
Saurabh Bhardwaj

Speaker recognition systems are classified according to their database, feature extraction techniques, and classification methods. The analysis indicates that every stage of a forensic speaker recognition system needs further work, from initial database collection through to the recognition phase. The present work provides a structured approach to collecting a robust speech database for an efficient speaker recognition system. Biometric and forensic systems require entirely different databases: databases for biometric systems are readily available, while databases for forensic speaker recognition systems are scarce. The paper also presents several databases available for speaker recognition systems.


2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Lei Lei ◽  
She Kun

Today, more and more people benefit from speaker recognition. However, its accuracy often drops off rapidly in the presence of low-quality speech and noise. This paper proposes a new speaker recognition model based on wavelet packet entropy (WPE), i-vectors, and cosine distance scoring (CDS). In the proposed model, WPE transforms speech into short-term spectrum feature vectors (short vectors) and resists noise. An i-vector is generated from those short vectors and characterizes the speech to improve recognition accuracy. CDS quickly compares two i-vectors to produce the recognition result. The proposed model is evaluated on the TIMIT speech database. The experimental results show that the proposed model performs well in both clean and noisy environments and is insensitive to low-quality speech, but its time cost is high. To reduce the time cost, parallel computation is used.
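
The cosine distance scoring step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the i-vectors are already available as plain lists of floats; the function names and the threshold value are hypothetical and are not taken from the paper.

```python
import math


def cosine_score(target_ivector, test_ivector):
    """Return the cosine similarity between two i-vectors (range -1 to 1)."""
    if len(target_ivector) != len(test_ivector):
        raise ValueError("i-vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(target_ivector, test_ivector))
    norm_target = math.sqrt(sum(a * a for a in target_ivector))
    norm_test = math.sqrt(sum(b * b for b in test_ivector))
    if norm_target == 0.0 or norm_test == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / (norm_target * norm_test)


def same_speaker(target_ivector, test_ivector, threshold=0.5):
    """Accept the trial when the score reaches the threshold.

    The default threshold is an illustrative assumption; in practice it
    would be tuned on held-out data.
    """
    return cosine_score(target_ivector, test_ivector) >= threshold
```

Because the score depends only on the angle between the two vectors, it needs a single dot product and two norms per trial, which is consistent with the abstract's description of CDS as a fast comparison.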


2017 ◽  
Vol 2017 ◽  
pp. 1-6 ◽  
Author(s):  
Mohammed Algabri ◽  
Hassan Mathkour ◽  
Mohamed A. Bencherif ◽  
Mansour Alsulaiman ◽  
Mohamed A. Mekhtiche

Presently, lawyers, law enforcement agencies, and judges use speech and other biometric features to recognize suspects. In general, speaker recognition is used to discriminate between people based on their voices. The process of determining whether a suspected speaker is the source of a trace is called forensic speaker recognition. In such applications, the voice samples are most likely noisy, the recording sessions may be mismatched, the sessions may not contain enough speech for recognition, and the suspects' voices are recorded through a mobile channel. Identifying a person by voice in a forensic-quality context is therefore challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used to obtain experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its use in forensic experiments. Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction, and the Gaussian mixture model-universal background model (GMM-UBM) is used for speaker modeling. Our approach achieves low equal error rates (EER) in noisy environments and with very short test samples.
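
The equal error rate reported in this abstract is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it with a simple threshold sweep over genuine (same-speaker) and impostor (different-speaker) trial scores. The function name and the sample scores are hypothetical, and this is not the authors' evaluation code.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the mean of the two rates.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap = None
    best_eer = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # Genuine trials scoring below the threshold are wrongly rejected.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # Impostor trials scoring at or above the threshold are wrongly accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2.0
    return best_eer


# Illustrative scores only (not data from the paper).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine, impostor)
```

With few trials the sweep gives only a coarse estimate; evaluation toolkits typically interpolate between neighbouring thresholds instead.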

