Subband Analysis for Performance Improvement of Replay Attack Detection in Speaker Verification Systems

Author(s):  
Sachin Garg ◽  
Shruti Bhilare ◽  
Vivek Kanhangad
IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 36080-36088 ◽  
Author(s):  
Sung-Hyun Yoon ◽  
Min-Sung Koh ◽  
Jae-Han Park ◽  
Ha-Jin Yu

2018 ◽  
Vol 179 (41) ◽  
pp. 44-48
Author(s):  
Pooja Anjee ◽  
Shubham Ghosh ◽  
Shrirag Kodoor ◽  
Rajashree Shettar

2020 ◽  
Vol 10 (18) ◽  
pp. 6292
Author(s):  
Hye-jin Shim ◽  
Jee-weon Jung ◽  
Ju-ho Kim ◽  
Ha-jin Yu

A number of studies have successfully developed speaker verification or presentation attack detection systems. However, studies integrating the two tasks remain in the preliminary stages. In this paper, we propose two approaches for building an integrated system of speaker verification and presentation attack detection: an end-to-end monolithic approach and a back-end modular approach. The first approach simultaneously trains speaker identification, presentation attack detection, and the integrated system through multi-task learning with a common feature. However, based on our experiments, we hypothesize that the information required for speaker verification and for presentation attack detection might differ, because speaker verification systems try to remove device-specific information from speaker embeddings, while presentation attack detection systems exploit such information. Therefore, we propose a back-end modular approach that uses separate deep neural networks (DNNs) for speaker verification and presentation attack detection. This approach has three input components: two speaker embeddings (one for enrollment and one for test) and a prediction of presentation attacks. Experiments are conducted using the ASVspoof 2017-v2 dataset, which includes official trials on the integration of speaker verification and presentation attack detection. The proposed back-end approach demonstrates a relative improvement of 21.77% in terms of the equal error rate for integrated trials compared to a conventional speaker verification system.
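The three input components named in the abstract can be pictured as a single vector handed to the back-end network. The sketch below is only an illustration: the abstract does not say how the inputs are combined, so plain concatenation is an assumption, and the function name `build_backend_input` and the example values are hypothetical.

```python
def build_backend_input(enroll_embedding, test_embedding, pad_score):
    """Combine the three back-end inputs into one flat feature vector.

    Assumption: the components are simply concatenated. The abstract only
    lists the inputs; it does not describe how they are merged.
    """
    if len(enroll_embedding) != len(test_embedding):
        raise ValueError("enrollment and test embeddings must have equal length")
    # Order: enrollment embedding, test embedding, presentation attack prediction.
    return list(enroll_embedding) + list(test_embedding) + [float(pad_score)]


# Hypothetical 2-dimensional embeddings and a hypothetical spoofing score.
example_input = build_backend_input([0.1, 0.2], [0.3, 0.4], 0.9)
```

A back-end classifier would then take a vector of this shape and output a single accept or reject score for the integrated trial.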


Author(s):  
Choon Beng Tan ◽  
Mohd Hanafi Ahmad Hijazi ◽  
Norazlina Khamis ◽  
Puteri Nor Ellyza binti Nohuddin ◽  
Zuraini Zainol ◽  
...  

The emergence of biometric technology provides enhanced security compared to traditional identification and authentication techniques, which were less efficient and secure. Despite the advantages brought by biometric technology, existing biometric systems such as Automatic Speaker Verification (ASV) systems are weak against presentation attacks. A presentation attack is a spoofing attack launched to subvert an ASV system in order to gain access to the system. Though numerous Presentation Attack Detection (PAD) systems have been reported in the literature, a systematic survey that describes the current state of research and application is unavailable. This paper presents a systematic analysis of the state-of-the-art voice PAD systems to promote further advancement in this area. The objectives of this paper are twofold: (i) to understand the nature of recent work on PAD systems, and (ii) to identify areas that require additional research. From the survey, a taxonomy of voice PAD and a trend analysis of recent work on PAD systems were built and presented, considering recent and relevant articles published between 2015 and 2021, including articles from the Interspeech and ICASSP conferences, mostly indexed by Scopus. A total of 172 articles were surveyed in this work. The findings of this survey present the limitations of recent works, which include spoof-type-dependent PAD. Consequently, the future direction of work on voice PAD for interested researchers is established.


Author(s):  
Khomdet Phapatanaburi ◽  
Prawit Buayai ◽  
Watcharaphon Naktong ◽  
Jakkree Srinonchat

Magnitude and phase aware deep neural networks (MP aware DNN), based on Fast Fourier Transform information, have recently received increasing attention in many speech applications. However, little attention has been paid to their use in replay attack detection as developed for the automatic speaker verification spoofing and countermeasures challenge (ASVspoof 2017). This paper aims to investigate the MP aware DNN as a speech classifier for distinguishing non-replayed (genuine) from replayed speech. Also, to exploit the complementarity between classifiers and make detection decisions more reliable, we propose a novel method that combines the MP aware DNN with a standard replay attack detector (that is, Gaussian mixture model classification based on constant Q transform cepstral coefficients: CQCC-based GMM). Experiments are conducted on ASVspoof 2017 and evaluated with a standard measure of detection performance, the equal error rate (EER). The results showed that MP aware DNN-based detection outperformed the conventional DNN method using only the magnitude or phase features. Moreover, we found that score combination of the CQCC-based GMM with the MP aware DNN achieved additional improvement, indicating that the MP aware DNN is very useful, especially when combined with the CQCC-based GMM for replay attack detection.
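To make the evaluation measure and the score combination concrete, the following is a minimal sketch of how an EER can be estimated from two score lists and how two systems' scores can be fused. It is not the authors' implementation: the weighted-sum fusion rule, the function names `compute_eer` and `fuse_scores`, and all score values are assumptions chosen for illustration.

```python
def compute_eer(genuine_scores, spoof_scores):
    """Estimate the equal error rate; higher scores mean 'more likely genuine'.

    Sweeps every observed score as a threshold and returns (eer, threshold)
    at the point where false acceptance and false rejection are closest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(spoof_scores)):
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: replayed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted-sum score fusion (an assumed rule, not taken from the abstract)."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


# Hypothetical scores for four genuine and four replayed utterances.
genuine = [0.9, 0.8, 0.7, 0.4]
replayed = [0.6, 0.3, 0.2, 0.1]
eer, threshold = compute_eer(genuine, replayed)

# Hypothetical fusion of two systems' scores for the same two trials.
fused = fuse_scores([1.0, 0.0], [0.0, 1.0], weight=0.5)
```

In practice the two systems' scores are usually normalized to a comparable range before fusion, and the fusion weight is tuned on a development set rather than fixed at 0.5.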

