Voice Activity Detection Based on High Order Statistics and Online EM Algorithm

2008 ◽  
Vol E91-D (12) ◽  
pp. 2854-2861 ◽  
Author(s):  
D. COURNAPEAU ◽  
T. KAWAHARA
2014 ◽  
Vol 624 ◽  
pp. 495-499
Author(s):  
Qiang Li ◽  
Hong En Xie ◽  
Qiu Ju Zheng

Voice activity detection is one of the key technologies of variable-rate speech coding, and advances in speech coding demand higher detection performance. Building on an analysis of the basic definitions and properties of spectral entropy and high-order statistics, this article proposes a voice activity detection algorithm that combines spectral entropy with high-order statistics. The algorithm effectively separates speech from non-speech segments and gives reasonable results in complex background noise.
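The two feature families named in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the features are combined or which high-order statistic is used, so excess kurtosis is assumed here as a representative fourth-order statistic, and the function names are hypothetical. A naive DFT keeps the example dependency-free.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, divided by its maximum.

    Assumes len(frame) >= 2. Values near 1 mean a flat (noise-like) spectrum,
    values near 0 mean energy concentrated in a few bins.
    """
    power = power_spectrum(frame)
    total = sum(power)
    if total <= 0.0:
        # All-zero frame: entropy is undefined; returning 0 is an assumption.
        return 0.0
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def excess_kurtosis(frame, eps=1e-12):
    """Fourth-order statistic (assumed choice): about 0 for Gaussian samples."""
    n = len(frame)
    mean = sum(frame) / n
    m2 = sum((x - mean) ** 2 for x in frame) / n
    m4 = sum((x - mean) ** 4 for x in frame) / n
    return m4 / (m2 * m2 + eps) - 3.0
```

A detector built on these would compute both values per frame and compare them against thresholds or a trained model; that decision rule is left out because the abstract does not specify it.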


2020 ◽  
Vol 10 (15) ◽  
pp. 5026
Author(s):  
Seon Man Kim

This paper proposes a technique for improving statistical-model-based voice activity detection (VAD) in noisy environments, intended for use in a hearing aid. The method is implemented on a uniform polyphase discrete Fourier transform filter bank that satisfies an auditory-device latency of 8 ms. It provides an online unified framework to overcome the frequent false rejections of the statistical-model-based likelihood-ratio test (LRT) in noisy environments. The method is based on the observation that the sparseness of speech and background noise causes high false-rejection error rates in statistical LRT-based VAD: the false-rejection rate increases as the sparseness increases. We demonstrate that the false-rejection error rate can be reduced by incorporating likelihood-ratio order statistics into a conventional LRT VAD. Experiments confirm that the proposed method reduces the average detection error rate by a relative 15.8% compared to a conventional VAD, with only a minimal change in the false-acceptance probability, for three different noise conditions with signal-to-noise ratios ranging from 0 to 20 dB.
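To make the sparseness argument concrete, the sketch below contrasts a conventional frame-level LRT score (the mean of per-bin log likelihood ratios under the usual complex Gaussian model) with an order-statistic variant. The abstract does not state which order statistic the paper uses, so averaging the `r` largest per-bin ratios is a hypothetical stand-in, and all function names are illustrative. The a priori SNR values are taken as given; estimating them is outside this sketch.

```python
import math


def bin_log_likelihood_ratios(noisy_power, noise_power, prior_snr):
    """Per-bin log likelihood ratio under a complex Gaussian model.

    gamma = noisy_power / noise_power is the a posteriori SNR and
    xi = prior_snr is the a priori SNR (assumed known here).
    """
    llrs = []
    for y, n, xi in zip(noisy_power, noise_power, prior_snr):
        gamma = y / n
        llrs.append(gamma * xi / (1.0 + xi) - math.log(1.0 + xi))
    return llrs


def mean_llr(llrs):
    """Conventional frame score: average over all frequency bins."""
    return sum(llrs) / len(llrs)


def top_r_mean_llr(llrs, r):
    """Hypothetical order-statistic score: average of the r largest ratios."""
    top = sorted(llrs, reverse=True)[:r]
    return sum(top) / len(top)


# Sparse frame: speech energy in one bin out of eight (made-up numbers).
noisy = [1.0] * 7 + [20.0]
noise = [1.0] * 8
xi = [0.1] * 7 + [10.0]
llrs = bin_log_likelihood_ratios(noisy, noise, xi)
print(mean_llr(llrs), top_r_mean_llr(llrs, 2))
```

When speech occupies only a few bins, the plain mean is diluted by the many noise-only bins, while a score built from the largest ratios is not. That is one plausible reading of why order statistics would lower the false-rejection rate.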

