Discriminative Training for Multiple Observation Likelihood Ratio Based Voice Activity Detection

2010 ◽ Vol 17 (11) ◽ pp. 897-900 ◽ Author(s): Tao Yu ◽ John H L Hansen
2020 ◽ Vol 10 (15) ◽ pp. 5026 ◽ Author(s): Seon Man Kim

This paper proposes a technique for improving statistical-model-based voice activity detection (VAD) in noisy environments, intended for use in a hearing aid. The proposed method is implemented on a uniform polyphase discrete Fourier transform filter bank that satisfies an auditory-device latency of 8 ms. The technique provides a unified online framework for overcoming the frequent false rejections of the statistical-model-based likelihood-ratio test (LRT) in noisy environments. It is based on the observation that the sparseness of speech and background noise causes high false-rejection error rates in statistical LRT-based VAD: the false-rejection rate increases as the sparseness increases. We demonstrate that the false-rejection error rate can be reduced by incorporating likelihood-ratio order statistics into a conventional LRT VAD. We confirm experimentally that the proposed method reduces the average detection error rate by a relative 15.8% compared with a conventional VAD, with only a minimal change in the false-acceptance probability, for three different noise conditions with signal-to-noise ratios ranging from 0 to 20 dB.
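The false-rejection effect described in the abstract can be illustrated with a minimal sketch. It assumes the widely used complex-Gaussian statistical model, in which the per-bin log likelihood ratio is gamma * xi / (1 + xi) - log(1 + xi), with gamma the a posteriori SNR and xi the a priori SNR. The order-statistic rule shown here (averaging only the `top_k` largest per-bin values) is a hypothetical stand-in chosen for illustration, not the method of the paper. The function names, the threshold, and the example SNR values are likewise assumptions.

```python
import numpy as np


def log_likelihood_ratios(gamma, xi):
    """Per-bin log likelihood ratio under a complex-Gaussian model.

    gamma: a posteriori SNR per frequency bin
    xi:    a priori SNR per frequency bin
    """
    gamma = np.asarray(gamma, dtype=float)
    xi = np.asarray(xi, dtype=float)
    return gamma * xi / (1.0 + xi) - np.log1p(xi)


def conventional_lrt(log_lr, threshold):
    """Conventional decision: mean log LR over all bins vs. a threshold."""
    return float(np.mean(log_lr)) > threshold


def order_statistic_lrt(log_lr, threshold, top_k):
    """Hypothetical order-statistic variant (an assumption, not the
    paper's rule): average only the top_k largest per-bin log LRs."""
    top = np.sort(np.asarray(log_lr, dtype=float))[-top_k:]
    return float(np.mean(top)) > threshold


# Illustrative sparse speech frame: 16 bins, speech energy in only 2.
gamma = np.ones(16)
xi = np.full(16, 0.01)
gamma[[3, 7]] = 20.0
xi[[3, 7]] = 10.0
sparse_llr = log_likelihood_ratios(gamma, xi)

# Illustrative noise-only frame: no bin carries speech energy.
noise_llr = log_likelihood_ratios(np.ones(16), np.full(16, 0.01))

THRESHOLD = 3.0  # assumed value, for illustration only
print(conventional_lrt(sparse_llr, THRESHOLD))        # averaged over all bins
print(order_statistic_lrt(sparse_llr, THRESHOLD, 2))  # averaged over top 2 bins
print(order_statistic_lrt(noise_llr, THRESHOLD, 2))   # noise-only frame
```

In this toy frame the many near-zero noise bins dilute the all-bin average, so the conventional test rejects a frame that clearly contains speech in two bins, while the top-k average does not. In practice the two rules would need separately tuned thresholds, since focusing on the largest values also raises the statistic for noise-only frames.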

