Speech Intelligibility Enhancement Using Distortion Control

2014 ◽  
Vol 912-914 ◽  
pp. 1391-1394
Author(s):  
Yu Xiang Yang ◽  
Jian Fen Ma

To improve the intelligibility of noisy speech, a novel speech enhancement algorithm using distortion control is proposed. Current speech enhancement algorithms fail to improve intelligibility because they aim to minimize the overall distortion of the enhanced speech, whereas different types of speech distortion contribute differently to intelligibility: amplification distortion in excess of 6.02 dB has the most detrimental effect. In the noise reduction process, the type of speech distortion can be determined from the signal distortion ratio, and distortion in excess of 6.02 dB can be controlled by tuning the gain function of the enhancement algorithm. Experimental results show that the proposed algorithm improves the intelligibility of noisy speech considerably.
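The 6.02 dB limit corresponds to a factor of 2 in amplitude (20·log10(2) ≈ 6.02 dB). A minimal sketch of such a gain constraint, assuming per-bin magnitude spectra and a clean-speech magnitude estimate (the paper's exact gain function and signal-distortion-ratio test are not reproduced here):

```python
import numpy as np

def constrain_amplification(gain, noisy_mag, clean_mag_est, limit_db=6.02):
    """Clamp the spectral gain so the enhanced magnitude never exceeds
    the clean-speech estimate by more than limit_db (6.02 dB = a factor
    of ~2 in amplitude), since amplification distortion beyond this
    level is reported to be the most damaging to intelligibility."""
    limit = 10.0 ** (limit_db / 20.0)           # ~2.0 in linear amplitude
    enhanced = gain * noisy_mag                 # per-bin enhanced magnitude
    over = enhanced > limit * clean_mag_est     # bins with excess amplification
    gain = gain.copy()
    gain[over] = limit * clean_mag_est[over] / noisy_mag[over]
    return gain
```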

2013 ◽  
Vol 321-324 ◽  
pp. 1075-1079
Author(s):  
Peng Liu ◽  
Jian Fen Ma

A subspace-based speech enhancement algorithm with higher intelligibility is proposed. The majority of existing speech enhancement algorithms cannot effectively improve the intelligibility of the enhanced speech. One important reason is that they only use the Minimum Mean Square Error (MMSE) criterion to constrain speech distortion, ignoring that different distortion regions affect intelligibility very differently. The a priori Signal-to-Noise Ratio (SNR) and the gain matrix are used to determine the distortion region; the gain matrix is then modified to constrain the magnitude spectrum of amplification distortion in excess of 6.02 dB, which damages intelligibility the most. Both objective evaluation and subjective listening tests show that the proposed algorithm does improve the intelligibility of the enhanced speech.
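One way the distortion-region decision could look, as a hedged sketch: compare the applied gain against the Wiener gain implied by the a priori SNR and flag bins whose amplification exceeds 6.02 dB (factor 2). The paper's actual subspace gain matrix and decision rule are not reproduced here:

```python
import numpy as np

def amplification_region(gain, a_priori_snr):
    """Flag per-bin amplification distortion > 6.02 dB (hypothetical
    rule for illustration): the Wiener gain snr/(1+snr) serves as the
    distortionless reference, and bins where the applied gain is more
    than twice that reference are treated as harmful amplification."""
    wiener = a_priori_snr / (1.0 + a_priori_snr)
    return gain > 2.0 * wiener      # True where amplification exceeds 6.02 dB
```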


2014 ◽  
Vol 989-994 ◽  
pp. 2565-2568
Author(s):  
Yu Hong Liu ◽  
Dong Mei Zhou ◽  
Jing Di

This paper proposes an improved speech enhancement algorithm based on Wiener filtering that addresses the problems of speech distortion and musical noise. The algorithm incorporates the masking properties of the human auditory system when calculating the gain at each spectral point, so that components of the enhanced speech whose energy falls below the masking threshold are not attenuated further, and the trade-off between noise elimination and speech distortion introduces less distortion into the enhanced speech. Furthermore, to eliminate musical noise, a spectrum-shaping technique that averages across adjacent frames is adopted, and a two-stage moving-average strategy guarantees real-time operation. Computer simulation results show that the proposed algorithm is superior to the traditional Wiener method in CPU cost, real-time performance, and the reduction of speech distortion and residual musical noise.
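The inter-frame averaging idea can be sketched as a two-stage smoothing of the per-frame gain spectrum. This is an illustrative guess at the structure (the smoothing factor `beta` and the 3-frame window are assumptions, not the paper's parameters):

```python
import numpy as np

def smooth_gain(gains, beta=0.6):
    """Two-stage moving average over adjacent frames to suppress the
    frame-to-frame gain fluctuations that produce musical noise.
    gains: array of shape (frames, bins)."""
    out = np.empty_like(gains)
    out[0] = gains[0]
    for t in range(1, len(gains)):              # stage 1: recursive average
        out[t] = beta * out[t - 1] + (1 - beta) * gains[t]
    # stage 2: centred 3-frame average to flatten remaining outliers
    padded = np.pad(out, ((1, 1), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
```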


2013 ◽  
Vol 760-762 ◽  
pp. 536-541
Author(s):  
Yu Hong Liu ◽  
Dong Mei Zhou ◽  
Zhan Jun Jiang

The paper addresses the problems of speech distortion and residual musical noise introduced by the conventional spectral subtraction (SS) method for speech enhancement. We propose a modified SS algorithm based on the masking properties of the human auditory system. The algorithm computes the parameters α and β dynamically according to the masking thresholds of the critical frequency bands of each speech frame. Simulation results show that the proposed algorithm is superior to the conventional SS method, not only in the improvement of output SNR but also in the reduction of speech distortion and residual musical noise.
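For context, the conventional over-subtraction rule that α and β parameterize can be sketched as follows (the masking-threshold-driven adaptation of α and β per critical band is the paper's contribution and is not reproduced; here they are taken as given):

```python
import numpy as np

def spectral_subtract(noisy_power, noise_power, alpha, beta):
    """Over-subtraction with a spectral floor: alpha scales the noise
    estimate removed from each bin, and beta sets the residual floor
    that prevents negative power and limits musical noise."""
    clean = noisy_power - alpha * noise_power   # subtract scaled noise estimate
    floor = beta * noise_power                  # spectral floor
    return np.maximum(clean, floor)
```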


This paper introduces technology to improve sound quality for media and entertainment applications. A major challenge in speech processing applications such as mobile phones, hands-free telephony, car communication, teleconference systems, hearing aids, voice coders, automatic speech recognition, and forensics is eliminating background noise. Speech enhancement algorithms are widely used in these applications to remove noise from speech degraded by noisy environments. However, conventional noise reduction methods introduce residual noise and speech distortion: the noise reduction process improves speech quality but degrades the intelligibility of the clean speech signal. In this paper, we introduce a new coherence-based noise reduction model for complex noise environments in which a target speech signal coexists with coherent surrounding noise. Speech presence probability information is added to the coherence model to track noise variation more accurately, and the coherence-based method is adapted separately during speech-present and speech-absent periods. The performance of the proposed method is evaluated under diffuse and real street noise, where it improves speech quality with less speech distortion and residual noise.
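The core observation behind coherence-based methods is that diffuse noise is weakly coherent between two microphones while a single target source is strongly coherent. A minimal sketch of the magnitude-squared coherence used as a crude gain (not the paper's adaptive, speech-presence-weighted estimator):

```python
import numpy as np

def coherence_gain(X1, X2, eps=1e-12):
    """Magnitude-squared coherence between two microphone spectra
    (arrays of shape frames x bins), averaged over frames. Values near
    1 indicate a coherent (speech-like) component; values near 0
    indicate diffuse noise, so the result can serve as a spectral gain."""
    S11 = np.mean(np.abs(X1) ** 2, axis=0)      # auto-spectrum, mic 1
    S22 = np.mean(np.abs(X2) ** 2, axis=0)      # auto-spectrum, mic 2
    S12 = np.mean(X1 * np.conj(X2), axis=0)     # cross-spectrum
    return np.abs(S12) ** 2 / (S11 * S22 + eps)  # in [0, 1]
```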


2019 ◽  
Vol 9 (12) ◽  
pp. 2520
Author(s):  
Juan M. Martín-Doñas ◽  
Antonio M. Peinado ◽  
Iván López-Espejo ◽  
Angel Gomez

This paper deals with speech enhancement in dual-microphone smartphones using beamforming along with postfiltering techniques. The performance of these algorithms relies on a good estimation of the acoustic channel and speech and noise statistics. In this work we present a speech enhancement system that combines the estimation of the relative transfer function (RTF) between microphones using an extended Kalman filter framework with a novel speech presence probability estimator intended to track the noise statistics’ variability. The available dual-channel information is exploited to obtain more reliable estimates of clean speech statistics. Noise reduction is further improved by means of postfiltering techniques that take advantage of the speech presence estimation. Our proposal is evaluated in different reverberant and noisy environments when the smartphone is used in both close-talk and far-talk positions. The experimental results show that our system achieves improvements in terms of noise reduction, low speech distortion and better speech intelligibility compared to other state-of-the-art approaches.
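The way a speech presence probability (SPP) estimate can steer a postfilter is illustrated below as a hedged sketch: speech-likely bins keep the beamformer's gain, speech-absent bins fall back to an aggressive floor. The blending rule and the floor `g_min` are assumptions for illustration, not the paper's exact postfilter:

```python
def postfilter_gain(beamformed_gain, spp, g_min=0.1):
    """Blend the beamformer's spectral gain with a floor according to
    the speech presence probability: spp=1 preserves speech, spp=0
    suppresses the bin down to g_min."""
    return spp * beamformed_gain + (1.0 - spp) * g_min
```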


2019 ◽  
Vol 9 (16) ◽  
pp. 3396 ◽  
Author(s):  
Jianfeng Wu ◽  
Yongzhu Hua ◽  
Shengying Yang ◽  
Hongshuai Qin ◽  
Huibin Qin

This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates knowledge distilled from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method trains neither a whole ensemble of models nor requires a mass of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned on real noisy speech. Experiments on the CHiME-4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
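Knowledge distillation here means the network is trained against both the ground truth and the output of a statistical enhancer acting as a "teacher". A minimal sketch of such a combined objective (the mixing weight `lam` and the MSE form are assumptions; the paper's adversarial losses are not reproduced):

```python
import numpy as np

def distillation_loss(student_out, teacher_out, target, lam=0.5):
    """Combined loss: MSE to the ground-truth target plus MSE to the
    teacher's output (the distilled knowledge from the statistical
    method), mixed by lam."""
    hard = np.mean((student_out - target) ** 2)       # supervised term
    soft = np.mean((student_out - teacher_out) ** 2)  # distillation term
    return (1.0 - lam) * hard + lam * soft
```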


2021 ◽  
Vol 40 (1) ◽  
pp. 849-864
Author(s):  
Nasir Saleem ◽  
Muhammad Irfan Khattak ◽  
Mu’ath Al-Hasan ◽  
Atif Jan

Speech enhancement is an important problem in various speech processing applications. Recently, supervised speech enhancement using deep learning approaches to estimate a time-frequency mask has yielded remarkable performance gains. In this paper, we propose a time-frequency masking-based supervised speech enhancement method for improving the intelligibility and quality of noisy speech. We believe that a large performance gain can be achieved if deep neural networks (DNNs) are pre-trained layer-wise by stacking Gaussian-Bernoulli Restricted Boltzmann Machines (GB-RBMs). The proposed DNN is called a Gaussian-Bernoulli Deep Belief Network (GB-DBN) and is optimized by minimizing the error between the estimated and pre-defined masks, using a non-linear Mel-scale weighted mean square error (LMW-MSE) loss function as the training criterion. We examined the performance of the proposed pre-training scheme using DNNs trained on three time-frequency masks: the ideal amplitude mask (IAM), the ideal ratio mask (IRM), and the phase-sensitive mask (PSM). Results under different noisy conditions demonstrated that DNNs pre-trained by the proposed scheme provide a consistent gain in perceived speech intelligibility and quality. The proposed pre-training scheme is also effective and robust to noisy training data.
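Of the three training targets mentioned, the ideal ratio mask is the simplest to write down. A sketch, assuming per-bin speech and noise power spectra and the common compression exponent of 0.5:

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Ideal ratio mask (IRM): the ratio of speech energy to total
    energy in each time-frequency bin, compressed by exponent beta."""
    return (speech_power / (speech_power + noise_power)) ** beta
```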


10.14311/1111 ◽  
2009 ◽  
Vol 49 (2) ◽  
Author(s):  
V. Bolom

This paper presents the properties of selected multichannel algorithms for speech enhancement in noisy environments. These methods are suitable for hands-free communication in a car cabin. Criteria for evaluating such systems are also presented, considering both the level of noise suppression and the level of speech distortion. The performance of the multichannel algorithms is investigated on a mixed model of speech signals and car noise and on real signals recorded in a car.
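The two evaluation axes named above (noise suppression vs. speech distortion) can be sketched as simple energy-ratio criteria, assuming time-aligned clean, noisy, and enhanced signals (these are illustrative metrics, not the paper's exact criteria):

```python
import numpy as np

def suppression_and_distortion(clean, noisy, enhanced, eps=1e-12):
    """Noise suppression in dB (input noise energy vs. remaining error
    energy, with residual noise and distortion lumped together) and
    speech distortion in dB (error energy relative to clean energy)."""
    noise_in = noisy - clean
    error = enhanced - clean
    suppression_db = 10 * np.log10(np.sum(noise_in ** 2) / (np.sum(error ** 2) + eps))
    distortion_db = 10 * np.log10(np.sum(error ** 2) / (np.sum(clean ** 2) + eps))
    return suppression_db, distortion_db
```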

