An integrated MVDR beamformer for speech enhancement using a local microphone array and external microphones

Author(s):  
Randall Ali ◽  
Toon van Waterschoot ◽  
Marc Moonen

Abstract: An integrated version of the minimum variance distortionless response (MVDR) beamformer for speech enhancement using a microphone array has been recently developed, which merges the benefits of imposing constraints defined from both a relative transfer function (RTF) vector based on a priori knowledge and an RTF vector based on a data-dependent estimate. In this paper, the integrated MVDR beamformer is extended for use with a microphone configuration where a microphone array, local to a speech processing device, has access to the signals from multiple external microphones (XMs) randomly located in the acoustic environment. The integrated MVDR beamformer is reformulated as a quadratically constrained quadratic program (QCQP) with two constraints, one of which is related to the maximum tolerable speech distortion for the imposition of the a priori RTF vector and the other related to the maximum tolerable speech distortion for the imposition of the data-dependent RTF vector. An analysis of how these maximum tolerable speech distortions affect the behaviour of the QCQP is presented, followed by the discussion of a general tuning framework. The integrated MVDR beamformer is then evaluated with audio recordings from behind-the-ear hearing aid microphones and three XMs for a single desired speech source in a noisy environment. In comparison to relying solely on an a priori RTF vector or a data-dependent RTF vector, the results demonstrate that the integrated MVDR beamformer can be tuned to yield different enhanced speech signals, which may be more suitable for improving speech intelligibility despite changes in the desired speech source position and imperfectly estimated spatial correlation matrices.
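For orientation, the standard single-constraint MVDR beamformer that the integrated version builds on can be sketched as follows. This is a minimal illustration in pure Python, not the QCQP formulation of the paper: it computes w = R^{-1} h / (h^H R^{-1} h) for one frequency bin. The function names, the 2-microphone matrix `R` and the RTF vector `h` are hypothetical values chosen only for the example.

```python
# Minimal MVDR sketch in pure Python (no external libraries).
# Assumption: R is a Hermitian positive-definite noise correlation matrix and
# h is an RTF vector whose reference-microphone entry equals 1.

def solve(A, b):
    """Solve A x = b for complex A by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] so A and b are not modified in place.
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def mvdr_weights(R, h):
    """Return the MVDR filter w = R^{-1} h / (h^H R^{-1} h)."""
    Rinv_h = solve(R, h)
    denom = sum(hi.conjugate() * v for hi, v in zip(h, Rinv_h))
    return [v / denom for v in Rinv_h]

# Hypothetical 2-microphone example for a single frequency bin.
R = [[2.0 + 0j, 0.5 + 0.1j],
     [0.5 - 0.1j, 1.0 + 0j]]
h = [1.0 + 0j, 0.8 - 0.3j]
w = mvdr_weights(R, h)

# Distortionless check: the response w^H h towards the RTF vector should be 1.
response = sum(wi.conjugate() * hi for wi, hi in zip(w, h))
```

The distortionless constraint is what ties the filter to a particular RTF vector: if `h` is wrong (for example because the source moved), the constraint protects the wrong direction, which is the motivation the abstract gives for combining an a priori and a data-dependent RTF vector.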

2013 ◽  
Vol 321-324 ◽  
pp. 1075-1079
Author(s):  
Peng Liu ◽  
Jian Fen Ma

A subspace-based speech-enhancement algorithm that yields higher intelligibility is proposed. Most existing speech-enhancement algorithms cannot effectively improve the intelligibility of the enhanced speech. One important reason is that they only use the minimum mean square error (MMSE) criterion to constrain speech distortion, ignoring that differences between speech distortion regions have a significant effect on intelligibility. The a priori signal-to-noise ratio (SNR) and the gain matrix were used to determine the distortion region. The gain matrix was then modified to constrain the magnitude spectrum where the amplification distortion exceeds 6.02 dB, which substantially damages intelligibility. Both objective evaluation and subjective listening show that the proposed algorithm does improve the intelligibility of the enhanced speech.
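The 6.02 dB figure corresponds to an amplitude ratio of about 2, since 20 log10(2) ≈ 6.02. The sketch below only illustrates that arithmetic and an oracle-style cap. It assumes the clean magnitude is known, which is not the case in the proposed algorithm (the abstract states that the a priori SNR and the gain matrix are used to determine the distortion region instead). The function name `cap_amplification` is hypothetical.

```python
import math

LIMIT_DB = 6.02
# Convert the dB limit to a linear amplitude ratio (approximately 2.0).
LIMIT_RATIO = 10 ** (LIMIT_DB / 20)

def amplification_db(est_mag, clean_mag):
    """Amplification of the estimated magnitude relative to the clean one, in dB."""
    return 20 * math.log10(est_mag / clean_mag)

def cap_amplification(est_mag, clean_mag):
    """Oracle illustration: limit the estimate to at most 6.02 dB above clean.

    Assumption: clean_mag is available, which only holds in a simulation.
    """
    return min(est_mag, LIMIT_RATIO * clean_mag)

# An estimate three times the clean magnitude (about 9.5 dB too loud) is capped;
# one that is 1.5 times the clean magnitude (about 3.5 dB) is left unchanged.
capped = cap_amplification(3.0, 1.0)
untouched = cap_amplification(1.5, 1.0)
```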


2021 ◽  
pp. 2150022
Author(s):  
Caio Cesar Enside de Abreu ◽  
Marco Aparecido Queiroz Duarte ◽  
Bruno Rodrigues de Oliveira ◽  
Jozue Vieira Filho ◽  
Francisco Villarreal

Speech processing systems are very important in different applications involving speech and voice quality, such as automatic speech recognition, forensic phonetics and speech enhancement, among others. In most of them, acoustic environmental noise is added to the original signal, decreasing the signal-to-noise ratio (SNR) and, as a consequence, the speech quality. Therefore, estimating noise is one of the most important steps in speech processing, whether to reduce it before processing or to design robust algorithms. In this paper, a new approach to estimate noise from speech signals is presented and its effectiveness is tested in the speech enhancement context. For this purpose, partial least squares (PLS) regression is used to model the acoustic environment (AE) and a Wiener filter based on a priori SNR estimation is implemented to evaluate the proposed approach. Six noise types are used to create seven acoustically modeled noises. The basic idea is to use the AE model to identify the noise type and estimate its power for use in a speech processing system. Speech signals processed using the proposed method and classical noise estimators are evaluated through objective measures. Results show that the proposed method produces better speech quality than state-of-the-art noise estimators, enabling it to be used in real-time applications in the fields of robotics, telecommunications and acoustic analysis.
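As a reminder of how a Wiener filter uses the a priori SNR, the textbook spectral gain is G = ξ / (1 + ξ), where ξ is the a priori SNR in linear scale. The sketch below shows only that gain rule. It does not reproduce the PLS-based noise estimator of the paper, and the helper names are hypothetical.

```python
def db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio."""
    return 10 ** (snr_db / 10)

def wiener_gain(xi):
    """Textbook Wiener gain for a linear a priori SNR xi >= 0."""
    return xi / (1.0 + xi)

# At 0 dB a priori SNR the noisy spectral coefficient is halved;
# at high SNR the gain approaches 1, at low SNR it approaches 0.
gain_0db = wiener_gain(db_to_linear(0.0))
gain_20db = wiener_gain(db_to_linear(20.0))
gain_minus20db = wiener_gain(db_to_linear(-20.0))
```

Because the gain depends only on ξ, any error in the noise power estimate feeds directly into the amount of suppression, which is why the quality of the noise estimator matters for the enhanced speech.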


2020 ◽  
Vol 8 (5) ◽  
pp. 1635-1637

In this work, the author introduces a new technique for improving the performance of the minimum variance distortionless response (MVDR) filter under coherent noise conditions. The proposed algorithm exploits a priori information about amplitude differences to balance the power spectral densities of the observed noisy signals. The output signal of the MVDR filter is then processed by an additional post-filter, based on speech presence probability, to suppress more noise interference and improve speech quality. In experiments using two noisy signal recordings in an anechoic room, the results show that the suggested algorithm increases speech quality compared to the conventional MVDR filter.


This paper introduces technology to improve sound quality, which serves the needs of media and entertainment. A major challenge in speech processing applications such as mobile phones, hands-free phones, car communication, teleconference systems, hearing aids, voice coders, automatic speech recognition and forensics is to eliminate background noise. Speech enhancement algorithms are widely used in these applications to remove noise from degraded speech in noisy environments. However, conventional noise reduction methods introduce more residual noise and speech distortion. It has been found that the noise reduction process is effective at improving speech quality, but it affects the intelligibility of the clean speech signal. In this paper, we introduce a new coherence-based noise reduction method for complex noise environments in which the target speech coexists with coherent noise. From the coherence model, speech presence probability information is added to track noise variation more accurately, and the adaptive coherence-based method is adjusted during speech presence and speech absence periods. The performance of the suggested method is evaluated under diffuse noise and real street noise conditions, and it improves speech quality with less speech distortion and residual noise.


Author(s):  
Shifeng Ou ◽  
Peng Song ◽  
Ying Gao

The a priori signal-to-noise ratio (SNR) plays an essential role in many speech enhancement systems. Most of the existing approaches to estimating the a priori SNR only exploit the amplitude spectra while neglecting the phase. Considering the fact that incorporating phase information into a speech processing system can significantly improve the speech quality, this paper proposes a phase-sensitive decision-directed (DD) approach for the a priori SNR estimate. By representing the short-time discrete Fourier transform (STFT) signal spectra geometrically in a complex plane, the proposed approach estimates the a priori SNR using both the magnitude and phase information while making no assumptions about the phase difference between clean speech and noise spectra. Objective evaluations in terms of the spectrograms, segmental SNR, log-spectral distance (LSD) and short-time objective intelligibility (STOI) measures are presented to demonstrate the superiority of the proposed approach compared to several competitive methods at different noise conditions and input SNR levels.
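For context, the classic magnitude-only decision-directed rule (Ephraim and Malah), which the phase-sensitive approach extends, can be sketched as below for a single frequency bin. This is the conventional baseline, not the method proposed in the paper. The smoothing factor `alpha = 0.98`, the floor `xi_min`, the use of a Wiener gain to form the previous clean-speech estimate, and the function name are assumptions made for the illustration.

```python
def decision_directed(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR over frames for one frequency bin.

    noisy_power: sequence of |Y(l)|^2 values, one per frame.
    noise_power: noise power estimate for this bin (assumed constant here).
    """
    xi_track = []
    prev_clean_power = 0.0  # |S_hat(l-1)|^2, unknown before the first frame
    for y2 in noisy_power:
        gamma = y2 / noise_power  # a posteriori SNR
        # Weighted mix of the previous clean estimate and the current ML estimate.
        xi = (alpha * prev_clean_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)  # Wiener gain (assumed suppression rule)
        prev_clean_power = (gain ** 2) * y2
        xi_track.append(xi)
    return xi_track

# Hypothetical bin: a noise-only frame followed by a frame 10 dB above the noise.
track = decision_directed([1.0, 11.0], noise_power=1.0)
```

Only power values enter this rule, so the phase of the noisy STFT coefficient is discarded, which is the limitation the abstract refers to.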


2021 ◽  
Vol 25 ◽  
pp. 233121652110059
Author(s):  
Ayham Zedan ◽  
Tim Jürgens ◽  
Ben Williges ◽  
Birger Kollmeier ◽  
Konstantin Wiebe ◽  
...  

This study investigated the speech intelligibility benefit of using two different spatial noise reduction algorithms in cochlear implant (CI) users who use a hearing aid (HA) on the contralateral side (bimodal CI users). The study controlled for head movements by using head-related impulse responses to simulate a realistic cafeteria scenario and controlled for HA and CI manufacturer differences by using the master hearing aid platform (MHA) to apply both hearing loss compensation and the noise reduction algorithms (beamformers). Ten bimodal CI users with moderate to severe hearing loss contralateral to their CI participated in the study, and data from nine listeners were included in the data analysis. The beamformers evaluated were the adaptive differential microphones (ADM) implemented independently on each side of the listener and the (binaurally implemented) minimum variance distortionless response (MVDR). For frontal speech and stationary noise from either left or right, an improvement (reduction) of the speech reception threshold of 5.4 dB and 5.5 dB was observed using the ADM, and 6.4 dB and 7.0 dB using the MVDR, respectively. As expected, no improvement was observed for either algorithm for colocated speech and noise. In a 20-talker babble noise scenario, the benefit observed was 3.5 dB for ADM and 7.5 dB for MVDR. The binaural MVDR algorithm outperformed the bilaterally applied monaural ADM. These results encourage the use of beamformer algorithms such as the ADM and MVDR by bimodal CI users in everyday life scenarios.


2019 ◽  
Vol 8 (2) ◽  
pp. 4708-4712

Speech enhancement is one of the important features of speech processing and plays a vital role in noisy environments. Much research has been done on speech enhancement methods in recent years, but the goal has still not been attained. It mainly depends on speech intelligibility, which can improve the speech quality. In this research work, signal representation is considered, and various transforms are applied and compared. The analysis is done with the help of two parameters and the results are compared. The enhancement process focuses on the Advanced DCT (ADCT) and the discrete fractional cosine transform. The ADCT has the advantages of energy compaction and flexible window switching. Iterative Wiener filtering is used to filter the coefficients. Pitch Synchronous Analysis (PSA) is combined to find the exact pitch period.


2020 ◽  
Vol 24 ◽  
pp. 233121652091957
Author(s):  
Nico Gößling ◽  
Daniel Marquardt ◽  
Simon Doclo

Besides improving speech intelligibility in background noise, another important objective of noise reduction algorithms for binaural hearing devices is preserving the spatial impression for the listener. In this study, we evaluate the performance of several recently proposed noise reduction algorithms based on the binaural minimum-variance-distortionless-response (MVDR) beamformer, which trade off noise reduction performance against preservation of the interaural coherence (IC) for diffuse noise fields. Aiming at a perceptually optimized result, this trade-off is determined based on the IC discrimination ability of the human auditory system. The algorithms are evaluated with normal-hearing participants for an anechoic scenario and a reverberant cafeteria scenario, in terms of both speech intelligibility using a matrix sentence test and spatial quality using a MUlti Stimulus test with Hidden Reference and Anchor (MUSHRA). The results show that all the binaural noise reduction algorithms are able to improve speech intelligibility compared with the unprocessed microphone signals, where partially preserving the IC of the diffuse noise field leads to a significant improvement in perceived spatial quality compared with the binaural MVDR beamformer while hardly affecting speech intelligibility.

