Voice Activity Detection Algorithm Using Zero Frequency Filter Assisted Peaking Resonator and Empirical Mode Decomposition

2013, Vol 22 (3), pp. 269-282
Author(s): M.S. Rudramurthy, V. Kamakshi Prasad, R. Kumaraswamy

Abstract: In this article, a new adaptive data-driven strategy for voice activity detection (VAD) using empirical mode decomposition (EMD) is proposed. Speech data are decomposed in the time domain using an a posteriori, adaptive, data-driven EMD to yield a set of physically meaningful intrinsic mode functions (IMFs). Each IMF preserves the nonlinear and nonstationary properties of the speech utterance. Among the set of IMFs, the one in which source information dominates, called the characteristic IMF (CIMF), can be identified and extracted by designing a zero-frequency filter-assisted peaking resonator. The detected CIMF is used to compute energy using short-term processing. With a suitably chosen threshold, voiced regions in speech utterances are detected from the frame energy. The proposed framework has been studied on both clean and noisy speech utterances (0-dB white noise), and it shows encouraging results for VAD in the presence of white noise up to 0 dB.
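The final step described above (short-term energy followed by thresholding) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the CIMF has already been extracted, so the EMD and the zero-frequency filter-assisted peaking resonator are not shown. The function names, frame length, hop size, and relative threshold are all hypothetical choices; the abstract does not state any of them.

```python
import numpy as np

def frame_energy(signal, frame_len, hop):
    """Short-term energy of each frame (sum of squared samples)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def detect_voiced_frames(signal, frame_len=320, hop=160, rel_threshold=0.1):
    """Mark frames whose energy exceeds rel_threshold * max frame energy.

    rel_threshold is an assumed value; the abstract only says that a
    proper threshold is chosen.
    """
    energy = frame_energy(np.asarray(signal, dtype=float), frame_len, hop)
    return energy > rel_threshold * energy.max()

# Toy stand-in for a CIMF: silence, a 200 Hz tone burst, silence
# (16 kHz sampling rate assumed for illustration).
fs = 16000
t = np.arange(fs // 2) / fs
toy = np.concatenate([np.zeros(4000), np.sin(2 * np.pi * 200 * t), np.zeros(4000)])
voiced = detect_voiced_frames(toy)
```

Here `voiced` is a boolean array with one entry per frame. A threshold relative to the maximum frame energy is only one possible rule; a fixed or noise-adaptive threshold would fit the same structure.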

2012, Vol 198-199, pp. 1560-1566
Author(s): Wen Lian Zhan, Jing Fang Wang

The Hilbert-Huang transform, developed in recent years, is a complete local time-frequency method for analyzing nonlinear, non-stationary signals, and the recurrence plot method reconstructs the nonlinear dynamic behavior of a time series. This paper combines the empirical mode decomposition (EMD) of the Hilbert-Huang transform with the recurrence plot (RP) method to obtain a new voice activity detection algorithm. First, speech and noise are decomposed by EMD, and the multi-scale character of the different intrinsic mode functions (IMFs) is used for filtering on a time scale. The recurrence plot method then captures the nonlinear dynamic behavior, and a statistical uncertainty measure from quantitative recurrence analysis is used for endpoint detection. Simulation results show that the method has strong capability for analyzing non-stationary dynamics and that, in low-SNR environments, it extracts the start and end points of the speech signal more accurately and more robustly than traditional methods.
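To make the recurrence plot idea concrete, the sketch below builds a recurrence matrix from a time-delay embedding and computes the recurrence rate, one common recurrence quantification statistic. It is an assumption-laden illustration: the abstract does not say which statistics, embedding dimension, delay, or distance threshold the paper uses, so `dim`, `delay`, `eps`, and both function names are hypothetical.

```python
import numpy as np

def recurrence_matrix(x, dim=3, delay=1, eps=0.1):
    """Boolean recurrence plot of a time-delay embedding of x.

    Entry (i, j) is True when embedded points i and j lie within
    Euclidean distance eps of each other. dim, delay, and eps are
    assumed values chosen for illustration.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    # Each row of emb is one embedded point (x[k], x[k+delay], ...).
    emb = np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    return dist <= eps

def recurrence_rate(rp):
    """Fraction of recurrent points in the recurrence plot."""
    return rp.mean()

# A constant frame recurs everywhere; white noise recurs far less often.
rng = np.random.default_rng(0)
rr_constant = recurrence_rate(recurrence_matrix(np.zeros(50)))
rp_noise = recurrence_matrix(rng.standard_normal(50))
rr_noise = recurrence_rate(rp_noise)
```

In a detector of this kind, such a statistic would be computed per frame (possibly per IMF) and compared across frames to locate speech endpoints; how the paper does this is not specified in the abstract.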

