Dual-Mic Speech Enhancement Based on TF-GSC with Leakage Suppression and Signal Recovery

2021 ◽  
Vol 11 (6) ◽  
pp. 2816
Author(s):  
Hansol Kim ◽  
Jong Won Shin

The transfer function-generalized sidelobe canceller (TF-GSC) is one of the most popular structures for the adaptive beamformer used in multi-channel speech enhancement. Although the TF-GSC has shown decent performance, a certain amount of steering error is inevitable, which causes leakage of speech components through the blocking matrix (BM) and distortion in the fixed beamformer (FBF) output. In this paper, we propose to suppress the leaked signal in the output of the BM and restore the desired signal in the FBF output of the TF-GSC. To reduce the risk of attenuating speech in the adaptive noise canceller (ANC), the speech component in the output of the BM is suppressed by applying a gain function similar to the square-root Wiener filter, assuming that a certain portion of the desired speech leaks into the BM output. Additionally, we propose to restore the attenuated desired signal in the FBF output by adding some of the microphone signal components back, depending on how the microphone signals are related to the FBF and BM outputs. The experimental results showed that the proposed TF-GSC outperformed the conventional TF-GSC in terms of perceptual evaluation of speech quality (PESQ) scores under various noise conditions and directions of arrival for the desired and interfering sources.
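The leakage-suppression idea can be illustrated with a minimal sketch. The abstract does not give the exact gain function, so the form below (a square-root Wiener-style gain that treats noise as the component to keep and leaked speech as the component to attenuate) is an assumption, as are the names `leakage_suppression_gain`, `suppress_leakage`, and the `leak_ratio` parameter.

```python
import math


def leakage_suppression_gain(speech_psd, noise_psd, leak_ratio=0.1, floor=1e-12):
    """Square-root Wiener-style gain for one frequency bin of the BM output.

    Assumption (not from the paper): the leaked speech power equals
    leak_ratio * speech_psd, and the gain keeps the noise part of the bin.
    """
    leaked_psd = leak_ratio * speech_psd
    return math.sqrt(noise_psd / max(noise_psd + leaked_psd, floor))


def suppress_leakage(bm_magnitudes, speech_psds, noise_psds, leak_ratio=0.1):
    """Apply the per-bin gain to the BM output magnitudes of one frame."""
    return [
        leakage_suppression_gain(s, n, leak_ratio) * u
        for u, s, n in zip(bm_magnitudes, speech_psds, noise_psds)
    ]
```

With no speech present the gain is 1, so a noise-only BM output passes unchanged to the ANC; as the assumed leaked speech power grows relative to the noise, the gain falls toward 0.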

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1878
Author(s):  
Yi Zhou ◽  
Haiping Wang ◽  
Yijing Chu ◽  
Hongqing Liu

The use of multiple spatially distributed microphones allows spatial filtering to be performed along with conventional temporal filtering, which can better reject interference signals and lead to an overall improvement in speech quality. In this paper, we propose a novel dual-microphone generalized sidelobe canceller (GSC) algorithm assisted by a bone-conduction (BC) sensor for speech enhancement, named the BC-assisted GSC (BCA-GSC) algorithm. The BC sensor is relatively insensitive to ambient noise compared to the conventional air-conduction (AC) microphone. Hence, BC speech can be analyzed to generate very accurate voice activity detection (VAD), even in a high-noise environment. The proposed algorithm incorporates the VAD information obtained from the BC speech into the adaptive blocking matrix (ABM) and adaptive noise canceller (ANC) in the GSC. By using VAD to control the ABM and combining VAD with the signal-to-interference ratio (SIR) to control the ANC, the proposed method can suppress interference and significantly improve the overall performance of the GSC. Experiments verify that the proposed GSC system not only improves speech quality remarkably but also boosts speech intelligibility.
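As a rough illustration of VAD-controlled adaptation, the sketch below runs a normalized LMS (NLMS) noise canceller whose weights are updated only in frames flagged as speech-free. It is not the BCA-GSC algorithm itself: the NLMS update, the function name `vad_gated_nlms`, the tap count, and the step size are assumptions, and the SIR-based control described in the abstract is omitted.

```python
import random


def vad_gated_nlms(primary, reference, speech_active, taps=4, mu=0.5, eps=1e-8):
    """NLMS noise canceller that freezes adaptation while speech is active.

    primary:       samples containing speech plus noise (e.g. beamformer output)
    reference:     noise reference samples (e.g. blocking-matrix output)
    speech_active: per-sample booleans from a VAD (assumed to be given)
    """
    weights = [0.0] * taps
    buffer = [0.0] * taps
    residual = []
    for d, x, active in zip(primary, reference, speech_active):
        buffer = [x] + buffer[:-1]            # newest reference sample first
        noise_estimate = sum(w * b for w, b in zip(weights, buffer))
        e = d - noise_estimate                # enhanced output sample
        if not active:                        # adapt only in noise-only periods
            norm = sum(b * b for b in buffer) + eps
            weights = [w + mu * e * b / norm for w, b in zip(weights, buffer)]
        residual.append(e)
    return residual, weights


# Noise-only toy example: the primary channel is a scaled copy of the reference.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]
primary = [0.8 * x for x in reference]
residual, weights = vad_gated_nlms(primary, reference, [False] * 400)
```

Freezing the weights while speech is present keeps the canceller from adapting toward the desired signal, which is the risk that accurate VAD is meant to reduce.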


Author(s):  
Dima Shaheen ◽  
Oumayma Al Dakkak ◽  
Mohiedin Wainakh

Speech enhancement is one of the many challenging tasks in signal processing, especially in the case of nonstationary speech-like noise. In this paper, a new incoherent discriminative dictionary learning algorithm is proposed to model both speech and noise, where the cost function accounts for both “source confusion” and “source distortion” errors, with a regularization term that penalizes the coherence between the speech and noise sub-dictionaries. At the enhancement stage, we use sparse coding on the learnt dictionary to estimate both the clean speech and the noise amplitude spectra. In the final phase, the Wiener filter is used to refine the clean speech estimate. Experiments on the Noizeus dataset, using two objective speech enhancement measures (frequency-weighted segmental SNR and Perceptual Evaluation of Speech Quality (PESQ)), demonstrate that the proposed algorithm outperforms the other speech enhancement methods tested.


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5050
Author(s):  
Yi Zhou ◽  
Yufan Chen ◽  
Yongbao Ma ◽  
Hongqing Liu

Speech quality and intelligibility are usually impaired by background noise during internet voice calls. To solve this problem in the context of wearable smart devices, this paper introduces a dual-microphone, bone-conduction (BC) sensor assisted beamformer and a simple recurrent unit (SRU)-based neural network postfilter for real-time speech enhancement. Because the BC sensor is insensitive to environmental noise compared to the regular air-conduction (AC) microphone, accurate voice activity detection (VAD) can be obtained from the BC signal and incorporated into the adaptive noise canceller (ANC) and adaptive blocking matrix (ABM). The SRU-based postfilter consists of a recurrent neural network with a small number of parameters, which improves the computational efficiency. Sub-band signal processing is used to compress the input features of the neural network, and the scale-invariant signal-to-distortion ratio (SI-SDR) is used as the loss function to minimize the distortion of the desired speech signal. Experimental results demonstrate that the proposed real-time speech enhancement system provides significant improvements in speech sound quality and intelligibility for all noise types and levels when compared with the AC-only beamformer with a postfiltering algorithm.
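For reference, SI-SDR in its commonly used form can be computed as in the sketch below. The abstract does not state the exact variant used for training, so this is the standard definition under that assumption; mean removal is omitted, and the names `si_sdr` and `si_sdr_loss` are illustrative.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB (standard definition)."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    scale = dot / (target_energy + eps)         # optimal scaling of the target
    projection = [scale * t for t in target]    # part of estimate explained by target
    distortion = [e - p for e, p in zip(estimate, projection)]
    signal_power = sum(p * p for p in projection)
    distortion_power = sum(n * n for n in distortion)
    return 10.0 * math.log10((signal_power + eps) / (distortion_power + eps))


def si_sdr_loss(estimate, target):
    """Negative SI-SDR, so that minimizing the loss maximizes SI-SDR."""
    return -si_sdr(estimate, target)


target = [1.0, 0.0, 1.0, 0.0]
estimate = [1.0, 0.1, 1.0, 0.1]       # target plus a small orthogonal error
score = si_sdr(estimate, target)      # about 20 dB for this toy input
```

Because the target is rescaled before the error is measured, multiplying the estimate by any nonzero constant leaves the score unchanged, which is why the measure is called scale-invariant.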


2015 ◽  
Vol 27 (5) ◽  
pp. 520-527 ◽  
Author(s):  
Ran Xiao ◽  
Yaping Ma ◽  
Boyan Huang ◽  
Yegui Xiao ◽  
...  

[Figure: Denoising nonlinear filter]
Speech recovery in the presence of very harsh noise calls for R&D that takes approaches different from those established to date, as conventional systems and algorithms for speech denoising may suffer from serious performance degradation. In this paper, we propose a hybrid nonlinear adaptive noise canceller (ANC) to perform the speech enhancement task using both bone- and air-conducted measurements. In the proposed ANC, the bone-conducted speech serves as the reference signal, while the air-conducted measurement with very large additive noise is adopted as the primary input. A Volterra filter and a functional link artificial neural network (FLANN) are placed in parallel, forming a hybrid nonlinear ANC. Simulations using real bone- and air-conducted speech measurements are provided to demonstrate that the proposed system outperforms ANCs equipped with an FIR filter, a Volterra filter, or a FLANN alone.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5751
Author(s):  
Seon Man Kim

This paper proposes a novel technique to improve a spectral statistical filter for speech enhancement, to be applied in wearable hearing devices such as hearing aids. The proposed method is implemented considering a 32-channel uniform polyphase discrete Fourier transform filter bank, for which the overall algorithm processing delay is 8 ms in accordance with the hearing device requirements. The proposed speech enhancement technique, which exploits the concepts of both non-negative sparse coding (NNSC) and spectral statistical filtering, provides an online unified framework to overcome the problem of residual noise in spectral statistical filters under noisy environments. First, the spectral gain attenuator of the statistical Wiener filter is obtained using the a priori signal-to-noise ratio (SNR) estimated through a decision-directed approach. Next, the spectrum estimated using the Wiener spectral gain attenuator is decomposed by applying the NNSC technique to the target speech and residual noise components. These components are used to develop an NNSC-based Wiener spectral gain attenuator to achieve enhanced speech. The performance of the proposed NNSC–Wiener filter was evaluated through a perceptual evaluation of the speech quality scores under various noise conditions with SNRs ranging from -5 to 20 dB. The results indicated that the proposed NNSC–Wiener filter can outperform the conventional Wiener filter and NNSC-based speech enhancement methods at all SNRs.
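The first stage described above, a Wiener gain driven by a decision-directed a priori SNR estimate, can be sketched as follows. This covers only the conventional Wiener stage, not the NNSC decomposition; the smoothing factor `alpha`, the floor `xi_min`, and the function name are assumed values and names, and the noise power per bin is taken as known and strictly positive.

```python
def decision_directed_wiener(noisy_power_frames, noise_psd, alpha=0.98, xi_min=1e-3):
    """Per-bin Wiener gains using the decision-directed a priori SNR estimate.

    noisy_power_frames: list of frames, each a list of |Y(k)|^2 values
    noise_psd:          assumed-known noise power per bin (must be > 0)
    """
    prev_clean_power = [0.0] * len(noise_psd)
    gains = []
    for frame in noisy_power_frames:
        frame_gains = []
        for k, (y_pow, n_pow) in enumerate(zip(frame, noise_psd)):
            gamma = y_pow / n_pow                          # a posteriori SNR
            xi = (alpha * prev_clean_power[k] / n_pow
                  + (1.0 - alpha) * max(gamma - 1.0, 0.0))  # a priori SNR
            xi = max(xi, xi_min)                           # floor limits musical noise
            gain = xi / (1.0 + xi)                         # Wiener gain
            prev_clean_power[k] = gain * gain * y_pow      # |S_hat(k)|^2 for next frame
            frame_gains.append(gain)
        gains.append(frame_gains)
    return gains


# Two frames, two bins: bin 0 carries strong speech, bin 1 carries noise only.
gains = decision_directed_wiener([[101.0, 1.0], [101.0, 1.0]], [1.0, 1.0])
```

In this toy input the gain in the speech-dominated bin rises from the first frame to the second as the recursive estimate builds up, while the noise-only bin stays at the floor value `xi_min / (1 + xi_min)`.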

