Removal of Pink Noise from Corrupted Speech Signal using Kalman Filter

Speech enhancement has been a major challenge in the field of signal processing. The process of filtering the noise component from the speech signal has achieved many milestones since the early 20th century. Among the many approaches, linear predictive coding is one of the best methods for speech and audio signal processing; it predicts the current estimate from the past states of an LTI system. Linear prediction is commonly used in speech recognition and speech enhancement. One such method, the Kalman filter, was introduced and described in 1960 by Rudolf Kalman and uses the concept of linear quadratic estimation. Kalman filtering is used effectively in practical applications such as navigation of ships or aircraft, design of motion planning algorithms, and communications. For speech, the recursive equations of the Kalman filter are built on an autoregressive model of speech, which supplies the state-space model used for state estimation. In this paper, we have used a Kalman filter to eliminate pink noise from a corrupted speech signal. Pink noise is very common and occurs in almost all electronic devices. The speech corrupted with pink noise was obtained from the SpEAR database. We used MATLAB for the simulations. Finally, spectrograms of the signals are plotted for a better visual understanding of the filtered results.
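The state-space recursion described in this abstract can be sketched as follows. This is a minimal illustration in Python (not the MATLAB code used in the paper), and it rests on assumptions that are not in the abstract: the AR coefficients and both variances are taken as known, and the measurement noise is treated as white, which is a simplification for pink noise. The function name `kalman_ar_denoise` and the synthetic AR(2) test signal are hypothetical.

```python
import random


def kalman_ar_denoise(noisy, ar_coeffs, process_var, noise_var):
    """Kalman-filter a signal modelled as an AR(p) process plus white noise."""
    p = len(ar_coeffs)
    # State x = [s(n-p+1), ..., s(n)]; F is the companion-form transition matrix.
    F = [[0.0] * p for _ in range(p)]
    for i in range(p - 1):
        F[i][i + 1] = 1.0
    # Last row encodes s(n) = a_1 s(n-1) + ... + a_p s(n-p) + w(n).
    F[p - 1] = [ar_coeffs[p - 1 - j] for j in range(p)]
    x = [0.0] * p
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    out = []
    for y in noisy:
        # Predict: x = F x and P = F P F^T + Q, where Q is nonzero only at [p-1][p-1].
        x = [sum(F[i][j] * x[j] for j in range(p)) for i in range(p)]
        FP = [[sum(F[i][k] * P[k][j] for k in range(p)) for j in range(p)]
              for i in range(p)]
        P = [[sum(FP[i][k] * F[j][k] for k in range(p)) for j in range(p)]
             for i in range(p)]
        P[p - 1][p - 1] += process_var
        # Update with H = [0, ..., 0, 1]: only the newest sample is observed.
        innovation = y - x[p - 1]
        innovation_var = P[p - 1][p - 1] + noise_var
        K = [P[i][p - 1] / innovation_var for i in range(p)]
        x = [x[i] + K[i] * innovation for i in range(p)]
        P = [[P[i][j] - K[i] * P[p - 1][j] for j in range(p)] for i in range(p)]
        out.append(x[p - 1])
    return out


# Usage on synthetic data (not the SpEAR recordings): an AR(2) stand-in for speech.
rng = random.Random(0)
coeffs = [1.6, -0.8]
clean = [0.0, 0.0]
for _ in range(2000):
    clean.append(coeffs[0] * clean[-1] + coeffs[1] * clean[-2] + rng.gauss(0.0, 0.1))
clean = clean[2:]
noisy = [s + rng.gauss(0.0, 0.1 ** 0.5) for s in clean]
enhanced = kalman_ar_denoise(noisy, coeffs, process_var=0.01, noise_var=0.1)
```

In practice the AR coefficients would be estimated frame by frame from the noisy speech, and coloured noise such as pink noise is often handled by augmenting the state with a noise model; neither step is shown here.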

2021 ◽  
Vol 11 (14) ◽  
pp. 6288
Author(s):  
Hang Su ◽  
Chang-Myung Lee

The generalized sidelobe canceller (GSC) method is a common algorithm to enhance audio signals using a microphone array. Distortion of the enhanced audio signal consists of two parts: the residual acoustic noise and the distortion of the desired audio signal, which means that the desired audio signal is damaged. This paper proposes a modified GSC method to reduce both kinds of distortion when the desired audio signal is a non-stationary speech signal. First, the cross-correlation coefficient between the canceling signal and the error signal of the least mean square (LMS) algorithm was added to the adaptive process of the GSC method to reduce the distortion of the enhanced signal when the energy of the desired signal frame increased suddenly. The sidelobe pattern of beamforming was then presented to estimate the noise signal in the beamforming output signal of the GSC method. The noise component of the beamforming output signal was decreased by subtracting the estimated noise signal to improve the denoising performance of the GSC method. Finally, the GSC-SN-MCC method was proposed by merging the above two methods. The experiment was performed in an anechoic chamber to validate the proposed method in various SNR conditions. Furthermore, the simulated calculation with inaccurate noise directions was conducted based on the experiment data to inspect the robustness of the proposed method to the error of the estimated noise direction. The experiment data and calculation results indicated that the proposed method could reduce the distortion effectively under various SNR conditions and would not cause more distortion even when the estimated noise direction is far from the actual noise direction.
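The adaptive stage of a GSC is typically an LMS noise canceller, and the plain LMS update can be sketched as below. This is not the modified GSC-SN-MCC method of the paper: there is no beamformer, no blocking matrix, and no cross-correlation control. The function name `lms_cancel`, the tap count, the step size, and the synthetic signals are all assumptions made for illustration.

```python
import math
import random


def lms_cancel(primary, reference, num_taps=4, step=0.005):
    """Subtract an adaptively filtered noise reference from the primary signal."""
    weights = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent reference samples, newest first
    errors = []
    for d, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        noise_estimate = sum(w * b for w, b in zip(weights, buf))
        e = d - noise_estimate  # error signal, which is also the enhanced output
        # LMS update: move the weights along the instantaneous gradient estimate.
        weights = [w + step * e * b for w, b in zip(weights, buf)]
        errors.append(e)
    return errors, weights


# Usage on synthetic data: a sine stands in for the desired speech, and the
# noise reaching the primary channel is the reference passed through a 2-tap path.
rng = random.Random(0)
n_samples = 5000
reference = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
desired = [0.5 * math.sin(0.05 * n) for n in range(n_samples)]
primary = [
    desired[n] + 0.5 * reference[n] - 0.3 * (reference[n - 1] if n > 0 else 0.0)
    for n in range(n_samples)
]
enhanced, weights = lms_cancel(primary, reference)
```

Because the error signal still contains the desired speech, a sudden rise in speech energy perturbs the weight update; that is the situation the paper's cross-correlation term is meant to address.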


Signals ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 434-455
Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. The existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. Then, we construct a whitening filter (with its coefficients computed from the estimated noise) to pre-whiten each noisy speech frame prior to computing the speech LPC parameters. We then construct the KF with the estimated parameters, where the robustness metric offsets the bias in KF gain during speech absence of noisy speech to that of the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.
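The LPC computation mentioned above is commonly done with the autocorrelation method and the Levinson-Durbin recursion, sketched below. The abstract does not say which estimator the authors use, so this choice is an assumption; the function names and the synthetic AR(2) signal are hypothetical, and the SPP noise estimate and whitening filter are not shown.

```python
import random


def autocorr(x, max_lag):
    """Biased autocorrelation estimates r(0), ..., r(max_lag)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (coefficients, error power).

    Predictor convention: s(n) is approximated by sum_k a[k] * s(n - k).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / err
        new_a = a[:]
        new_a[m] = reflection
        for k in range(1, m):
            new_a[k] = a[k] - reflection * a[m - k]
        a = new_a
        err *= 1.0 - reflection * reflection
    return a[1:], err


# Usage on a synthetic AR(2) signal with known coefficients and unit excitation.
rng = random.Random(0)
true_coeffs = [1.3, -0.6]
signal = [0.0, 0.0]
for _ in range(20000):
    signal.append(true_coeffs[0] * signal[-1] + true_coeffs[1] * signal[-2]
                  + rng.gauss(0.0, 1.0))
signal = signal[2:]
lpc, excitation_var = levinson_durbin(autocorr(signal, 2), 2)
```

The returned error power estimates the excitation variance, which is the process-noise variance a Kalman filter built on this model would need.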


2021 ◽  
Author(s):  
Sujan Kumar Roy ◽  
Aaron Nicolson ◽  
Kuldip K. Paliwal

Current augmented Kalman filter (AKF)-based speech enhancement algorithms utilise a temporal convolutional network (TCN) to estimate the clean speech and noise linear prediction coefficient (LPC). However, the multi-head attention network (MHANet) has demonstrated the ability to more efficiently model the long-term dependencies of noisy speech than TCNs. Motivated by this, we investigate the MHANet for LPC estimation. We aim to produce clean speech and noise LPC parameters with the least bias to date. With this, we also aim to produce higher quality and more intelligible enhanced speech than any current KF or AKF-based speech enhancement algorithm (SEA). Here, we investigate MHANet within the DeepLPC framework. DeepLPC is a deep learning framework for jointly estimating the clean speech and noise LPC power spectra. DeepLPC is selected as it exhibits significantly less bias than other frameworks, by avoiding the use of whitening filters and post-processing. DeepLPC-MHANet is evaluated on the NOIZEUS corpus using subjective AB listening tests, as well as seven different objective measures (CSIG, CBAK, COVL, PESQ, STOI, SegSNR, and SI-SDR). DeepLPC-MHANet is compared to five existing deep learning-based methods. Compared to other deep learning approaches, DeepLPC-MHANet produced clean speech LPC estimates with the least amount of bias. DeepLPC-MHANet-AKF also produced higher objective scores than any of the competing methods (with an improvement of 0.17 for CSIG, 0.15 for CBAK, 0.19 for COVL, 0.24 for PESQ, 3.70% for STOI, 1.03 dB for SegSNR, and 1.04 dB for SI-SDR over the next best method). The enhanced speech produced by DeepLPC-MHANet-AKF was also the most preferred amongst ten listeners. By producing LPC estimates with the least amount of bias to date, DeepLPC-MHANet enables the AKF to produce enhanced speech at a higher quality and intelligibility than any previous method.
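Of the objective measures listed, SI-SDR has a compact closed form, and a minimal sketch is given below using the common scale-invariant definition. Whether the authors' evaluation code matches it exactly (for example in mean removal or framing) is not stated in the abstract, so treat this as an assumption; the function name `si_sdr` is hypothetical.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio in dB.

    The estimate is projected onto the reference; the projection is the
    target and everything left over is distortion. A perfect estimate has
    zero residual energy and is not handled by this sketch.
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(v * v for v in residual)
    return 10.0 * math.log10(target_energy / residual_energy)


# Usage: the distortion here has the same energy as the target, giving 0 dB.
clean = [1.0, 0.0, 1.0, 0.0]
enhanced = [1.0, 1.0, 1.0, 1.0]
score = si_sdr(clean, enhanced)
# Rescaling the estimate leaves the score unchanged.
score_rescaled = si_sdr(clean, [3.0 * v for v in enhanced])
```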


Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. Existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. Then, we construct a whitening filter (with its coefficients computed from the estimated noise) and apply it to the noisy speech, yielding pre-whitened speech from which the speech LPC parameters are computed. We then construct the KF with the estimated parameters, where the robustness metric offsets the bias in Kalman gain during speech absence to that of the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.


Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 240
Author(s):  
Cristian Busu ◽  
Mihail Busu

Kalman filtering is a linear quadratic estimation (LQE) algorithm that uses a time series of observed data to produce estimations of unknown variables. The Kalman filter (KF) concept is widely used in applied mathematics and signal processing. In this study, we developed a methodology for estimating Gaussian errors by minimizing the symmetric loss function. Relevant applications of the kinetic models are described at the end of the manuscript.


Author(s):  
Mario Barnard ◽  
Farag M. Lagnf ◽  
Amr S. Mahmoud ◽  
Mohamed Zohdy

In this paper, a radial basis function-based Kalman filter is used to perform speech enhancement of an audio signal. Moreover, to accomplish speech recognition, correlation is applied after detecting the signal envelope. The simulation results suggest that using the radial basis function-based Kalman filter (with non-linear functions to estimate the Q parameter) should lead to better results.


Mathematics ◽  
2019 ◽  
Vol 7 (7) ◽  
pp. 580 ◽  
Author(s):  
Tomas Skovranek ◽  
Vladimir Despotovic

Fractional linear prediction (FLP), as a generalization of conventional linear prediction (LP), was recently successfully applied in different fields of research and engineering, such as biomedical signal processing, speech modeling and image processing. The FLP model has a design similar to that of the conventional LP model, i.e., it uses a linear combination of “fractional terms” with different orders of fractional derivative. Assuming only one “fractional term” and using a limited number of previous samples for prediction, an FLP model with “restricted memory” is presented in this paper and the closed-form expressions for calculation of FLP coefficients are derived. This FLP model is fully comparable with the widely used low-order LP, as it uses the same number of previous samples but fewer predictor coefficients, making it more efficient. Two different datasets, MIDI Aligned Piano Sounds (MAPS) and Orchset, were used for the experiments. Triads representing the chords composed of three randomly chosen notes and usual Western musical chords (both from the MAPS dataset) served as the test signals, while the piano recordings from the MAPS dataset and orchestra recordings from the Orchset dataset served as the musical signal. The results show an improvement of FLP over LP in terms of model complexity, whereas the performance is comparable.


1997 ◽  
Vol 34 (2) ◽  
pp. 161-172 ◽  
Author(s):  
Robin W. King

Three MATLAB exercises covering speech signal analysis and principles of linear prediction, formant synthesis and speech recognition are described. These exercises, which are assessed components in an elective course on speech and language processing, enable undergraduate electrical engineering students to explore fundamentally important concepts in speech science and signal processing.

