The Effect of Speech Enhancement Techniques on the Quality of Noisy Speech Signals

Author(s):  
Ahmed H. Y. Al-Noori ◽  
Atheel N. AlKhayyat ◽  
Ahmed A. Al-Hammad
Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 897 ◽  
Author(s):  
Hilman Pardede ◽  
Kalamullah Ramli ◽  
Yohan Suryanto ◽  
Nur Hayati ◽  
Alfan Presekal

The encryption process for secure voice communication may degrade speech quality when it is applied to speech signals before they are encoded through a conventional communication system such as GSM or radio trunking. This is because the encryption process usually includes a randomization of the speech signals, and hence, when the speech is decrypted, it may be perceptibly distorted, so satisfactory speech quality for communication is not achieved. To deal with this, a speech enhancement method could be applied to improve the quality of the decrypted speech. However, many speech enhancement methods assume that noise is present all the time, so a voice activity detector (VAD) is applied to detect non-speech periods in which to update the noise estimate. Unfortunately, this assumption is not valid for decrypted speech. Since the encryption process is applied only when speech is detected, distortions from the secure communication system are characteristically different: they exist when speech is present. Therefore, a noise estimator that is able to update the noise estimate even when speech is present is needed. However, most noise estimation techniques adapt only to slow changes in noise to avoid over-estimating it, making them unsuitable for this task. In this paper, we propose a speech enhancement technique to improve the quality of speech from secure communication. We use a combination of the Wiener filter and spectral subtraction for the noise estimator, so our method is better at tracking fast changes in noise without over-estimating them. Our experimental results on various communication channels indicate that our method is better than other popular noise estimators and speech enhancement methods.
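
The abstract above names the Wiener filter and spectral subtraction as building blocks but does not say how the authors combine them in their noise estimator. The following is only a minimal sketch of the two textbook gain rules for a single frequency bin, not the proposed method. The function names, the `floor` parameter, and the a priori SNR estimate `max(gamma - 1, 0)` are assumptions made for illustration.

```python
import math


def spectral_subtraction_gain(noisy_power, noise_power, floor=0.01):
    """Amplitude gain from power spectral subtraction for one frequency bin."""
    if noisy_power <= 0.0:
        return floor
    # Subtract the noise power estimate, then clamp to a spectral floor
    # (an assumed value) so the gain never becomes zero or negative.
    gain_squared = 1.0 - noise_power / noisy_power
    return math.sqrt(max(gain_squared, floor ** 2))


def wiener_gain(noisy_power, noise_power, floor=0.01):
    """Wiener gain xi / (1 + xi), with xi estimated as max(gamma - 1, 0)."""
    if noise_power <= 0.0:
        return 1.0
    # gamma = noisy_power / noise_power is the a posteriori SNR.
    a_priori_snr = max(noisy_power / noise_power - 1.0, 0.0)
    return max(a_priori_snr / (1.0 + a_priori_snr), floor)
```

With this particular SNR estimate, and above the floor, the Wiener gain equals the square of the spectral subtraction gain, so the Wiener rule attenuates low-SNR bins more strongly.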


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Novlene Zoghlami ◽  
Zied Lachiri

This paper describes a new speech enhancement approach using perceptually based noise reduction. The proposed approach is based on the application of two perceptual filtering models to noisy speech signals: the gammatone and the gammachirp filter banks with nonlinear resolution according to the equivalent rectangular bandwidth (ERB) scale. The perceptual filtering gives a number of subbands that are individually spectrally weighted and modified according to two different noise suppression rules. The importance of an accurate noise estimate is related to the reduction of the musical noise artifacts in the processed speech that appear after the classic subtractive process. In this context, we use continuous noise estimation algorithms. The performance of the proposed approach is evaluated on speech signals corrupted by real-world noises. Using objective tests based on the perceptual quality PESQ score and the quality rating of signal distortion (SIG), noise distortion (BAK) and overall quality (OVRL), and a subjective test based on the quality rating of automatic speech recognition (ASR), we demonstrate that our speech enhancement approach using filter banks modeling the human auditory system outperforms the conventional spectral modification algorithms in improving the quality and intelligibility of the enhanced speech signal.
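
To make the ERB scale mentioned above concrete, here is a minimal sketch that places filter center frequencies at equal steps on the ERB-rate scale. It assumes the widely used Glasberg and Moore (1990) approximation; the abstract does not state which ERB formula, frequency range, or number of bands the authors use, so the function names and the example values are illustrative only.

```python
import math


def erb_bandwidth(freq_hz):
    """Equivalent rectangular bandwidth in Hz at a given center frequency."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def hz_to_erb_rate(freq_hz):
    """Map frequency in Hz to the ERB-rate (ERB-number) scale."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(low_hz, high_hz, num_bands):
    """Center frequencies equally spaced on the ERB-rate scale."""
    if num_bands < 2:
        raise ValueError("num_bands must be at least 2")
    low = hz_to_erb_rate(low_hz)
    high = hz_to_erb_rate(high_hz)
    step = (high - low) / (num_bands - 1)
    return [erb_rate_to_hz(low + i * step) for i in range(num_bands)]
```

For example, `erb_center_frequencies(100.0, 8000.0, 16)` returns 16 center frequencies whose spacing in Hz widens toward high frequencies, which is the nonlinear resolution the abstract refers to.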


Signals ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 434-455
Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. The existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. Then, we construct a whitening filter (with its coefficients computed from the estimated noise) to pre-whiten each noisy speech frame prior to computing the speech LPC parameters. We then construct the KF with the estimated parameters, where the robustness metric offsets the bias in KF gain during speech absence of noisy speech to that of the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.
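
The dependence of the Kalman gain on the noise variance estimate can be illustrated with a much simpler model than the one in the abstract above. The sketch below assumes a scalar first-order autoregressive speech model instead of a full LPC model, and it does not implement the robustness or sensitivity metrics of the paper. All names and parameters are hypothetical.

```python
def kalman_filter_ar1(observations, ar_coeff, process_var, noise_var):
    """Scalar Kalman filter for s[t] = a*s[t-1] + w[t], y[t] = s[t] + v[t].

    Assumes process_var > 0 so the gain denominator is never zero.
    Returns the state estimates and the gain used at each step.
    """
    estimate = 0.0
    error_var = process_var
    estimates, gains = [], []
    for y in observations:
        # Prediction step from the assumed AR(1) model.
        predicted = ar_coeff * estimate
        predicted_var = ar_coeff ** 2 * error_var + process_var
        # The gain depends directly on the noise variance estimate,
        # so an inaccurate noise_var biases every update.
        gain = predicted_var / (predicted_var + noise_var)
        estimate = predicted + gain * (y - predicted)
        error_var = (1.0 - gain) * predicted_var
        estimates.append(estimate)
        gains.append(gain)
    return estimates, gains
```

In this toy model, over-estimating `noise_var` lowers the gain and makes the filter trust its prediction more than the observation, while under-estimating it lets more noise through.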


2021 ◽  
pp. 2150022
Author(s):  
Caio Cesar Enside de Abreu ◽  
Marco Aparecido Queiroz Duarte ◽  
Bruno Rodrigues de Oliveira ◽  
Jozue Vieira Filho ◽  
Francisco Villarreal

Speech processing systems are very important in different applications involving speech and voice quality, such as automatic speech recognition, forensic phonetics and speech enhancement, among others. In most of them, acoustic environmental noise is added to the original signal, decreasing the signal-to-noise ratio (SNR) and, as a consequence, the speech quality. Therefore, estimating noise is one of the most important steps in speech processing, whether to reduce it before processing or to design robust algorithms. In this paper, a new approach to estimating noise from speech signals is presented and its effectiveness is tested in the speech enhancement context. For this purpose, partial least squares (PLS) regression is used to model the acoustic environment (AE) and a Wiener filter based on a priori SNR estimation is implemented to evaluate the proposed approach. Six noise types are used to create seven acoustically modeled noises. The basic idea is to use the AE model to identify the noise type and estimate its power for use in a speech processing system. Speech signals processed using the proposed method and classical noise estimators are evaluated through objective measures. Results show that the proposed method produces better speech quality than state-of-the-art noise estimators, enabling it to be used in real-time applications in the fields of robotics, telecommunications and acoustic analysis.


2009 ◽  
Vol 22 (3) ◽  
pp. 391-404
Author(s):  
Zoran Milivojevic ◽  
Dragisa Balaneskovic

This paper presents an algorithm for enhancing the quality of noisy speech signals. The algorithm is based on dissonant frequency filtering (DFF) of the tones F#, B and C# relative to the frequency of the primary tone C (the DFF-FBC algorithm). The effect on speech signal quality was analyzed by means of the subjective Mean Opinion Score (MOS) test. The analysis of the MOS test results, presented in the second part of this paper, indicates an enhancement of the noisy speech signal quality in the presence of superimposed noises. Especially good results were obtained with the Husky Voice signal.
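
As a rough illustration of where the dissonant tones named above fall, the sketch below computes the frequencies of C#, F# and B relative to a primary tone treated as C. It assumes 12-tone equal temperament and considers only the octave starting at the primary tone; the abstract does not specify the tuning system or how the filters are designed, so this is not the DFF-FBC algorithm itself. The names are hypothetical.

```python
# Semitone offsets above C within one octave (equal temperament assumed).
SEMITONES_ABOVE_C = {"C#": 1, "F#": 6, "B": 11}


def dissonant_frequencies(primary_hz):
    """Frequencies of C#, F# and B when the primary tone is treated as C."""
    return {
        note: primary_hz * 2.0 ** (semitones / 12.0)
        for note, semitones in SEMITONES_ABOVE_C.items()
    }
```

Under these assumptions, a filter bank for dissonant frequency filtering would place its notches at the returned frequencies, possibly repeated in higher octaves.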


Author(s):  
Syed Akhter Hossain ◽  
M. Lutfar Rahman ◽  
Faruk Ahmed ◽  
M. Abdus Sobhan

The aim of this chapter is to clearly understand the salient features of Bangla vowels and the sources of acoustic variability in Bangla vowels, and to suggest a classification of vowels based on normalized acoustic parameters. Possible applications in automatic speech recognition and speech enhancement have made the classification of vowels an important problem to study. However, Bangla vowels spoken by different native speakers show great variations in their respective formant values. This brings further complications in the acoustic comparison of vowels due to the different dialect and language backgrounds of the speakers. This variation necessitates the use of normalization procedures to remove the effect of non-linguistic factors. Although several researchers have found a number of acoustical and perceptual correlates of vowels, acoustic parameters that work well in a speaker-independent manner are yet to be found. In addition, the study of the acoustic features of Bangla dental consonants, to identify the spectral differences between different consonants and to parameterize them for the synthesis of the segments, is another problem area for study. The extracted features for both Bangla vowels and dental consonants are tested and found to yield good synthetic representations that demonstrate the quality of the acoustic features.


Author(s):  
Judith Justin ◽  
Vanithamani R.

In this chapter, a speech enhancement technique is implemented using a neuro-fuzzy classifier. Noisy speech sentences from the NOIZEUS and AURORA databases are taken for the study. Feature extraction is implemented through modifications in amplitude magnitude spectrograms. A four-class neuro-fuzzy classifier splits the time-frequency units of the noisy speech samples into a noise-only part, a signal-only part, a more-noise/less-signal part, and a more-signal/less-noise part. Appropriate weights are applied in the enhancement phase. The enhanced speech sentence is evaluated using objective measures. The performance of the Neuro-Fuzzy 4 (NF4) classifier is analyzed and compared with that of other conventional techniques for various noises at different noise levels. It is observed that the numerical values of the measures obtained are better than those of the other techniques, and it is inferred that NF4 outperforms them in speech enhancement.

