Variance based time-frequency mask estimation for unsupervised speech enhancement

2019 ◽  
Vol 78 (22) ◽  
pp. 31867-31891 ◽  
Author(s):  
Nasir Saleem ◽  
Muhammad Irfan Khattak ◽  
Gunawan Witjaksono ◽  
Gulzar Ahmad
2021 ◽  
Vol 2 ◽  
pp. 136-150
Author(s):  
Mojtaba Hasannezhad ◽  
Zhiheng Ouyang ◽  
Wei-Ping Zhu ◽  
Benoit Champagne

Author(s):  
Feng Bao ◽  
Waleed H. Abdulla

In computational auditory scene analysis, accurate estimation of the binary mask or ratio mask plays a key role in noise masking. Inaccurate estimation often introduces artifacts and temporal discontinuities in the synthesized speech. To overcome this problem, we propose a new ratio mask estimation method based on Wiener filtering in each Gammatone channel. To reconstruct the Wiener filter, we use the relationship between the speech and noise power spectra in each Gammatone channel to build the objective function for convex optimization of the speech power. To improve estimation accuracy, the estimated ratio mask is further modified based on its adjacent time-frequency units and then smoothed by interpolating with the estimated binary masks. Objective tests, including signal-to-noise ratio improvement, spectral distortion, and intelligibility, together with a subjective listening test, demonstrate the superiority of the proposed method over the reference methods.
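As a rough illustration of the masks this abstract mentions, the sketch below computes a Wiener-type ratio mask from per-unit speech and noise power, derives a binary mask from it, and interpolates between the two. It is not the authors' method: the convex optimization of speech power and the adjacent-unit modification are omitted, and the function names, the 0.5 threshold (equal speech and noise power, i.e. 0 dB local SNR), and the blend factor `alpha` are all assumptions made for illustration.

```python
# Minimal sketch, NOT the paper's algorithm. All names and default values
# here are hypothetical. Inputs are 2-D lists indexed [channel][frame].

def wiener_ratio_mask(speech_power, noise_power, eps=1e-12):
    """Wiener-type gain per time-frequency unit: P_s / (P_s + P_n)."""
    return [
        [s / (s + n + eps) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]

def binary_mask(ratio_mask, threshold=0.5):
    """1 where the ratio mask exceeds the threshold, else 0.
    A ratio of 0.5 corresponds to equal speech and noise power."""
    return [[1.0 if m > threshold else 0.0 for m in row] for row in ratio_mask]

def blend_masks(ratio_mask, bin_mask, alpha=0.8):
    """Linear interpolation between ratio and binary masks (alpha is assumed)."""
    return [
        [alpha * r + (1.0 - alpha) * b for r, b in zip(r_row, b_row)]
        for r_row, b_row in zip(ratio_mask, bin_mask)
    ]

# Toy power values for two channels and two frames.
speech = [[9.0, 1.0], [0.0, 4.0]]
noise = [[1.0, 1.0], [2.0, 4.0]]

ratio = wiener_ratio_mask(speech, noise)
binary = binary_mask(ratio)
smoothed = blend_masks(ratio, binary)
```

The smoothed mask would then be multiplied with the noisy Gammatone-channel signal before resynthesis; that step depends on the filterbank implementation and is left out here.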


2021 ◽  
Author(s):  
Santhan Kumar Reddy Nareddula ◽  
Subrahmanyam Gorthi ◽  
Rama Krishna Sai S. Gorthi

Author(s):  
Judith Justin ◽  
Vanithamani R.

In this chapter, a speech enhancement technique is implemented using a neuro-fuzzy classifier. Noisy speech sentences from the NOIZEUS and AURORA databases are used in the study. Feature extraction is implemented through modifications of amplitude magnitude spectrograms. A four-class neuro-fuzzy classifier splits the time-frequency units of the noisy speech into four parts: noise only, signal only, more noise than signal, and more signal than noise. Appropriate weights are applied in the enhancement phase. The enhanced speech is evaluated using objective measures. The performance of the Neuro-Fuzzy 4 (NF4) classifier is analyzed and compared with that of other conventional techniques for various noise types at different noise levels. The objective measures obtained with NF4 are better than those of the other techniques, and the overall comparison indicates that NF4 outperforms them in speech enhancement.
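The enhancement phase described in this abstract, where each time-frequency unit receives a weight according to its class, could look roughly like the sketch below. The abstract does not state the weights or the label names, so the values in `CLASS_WEIGHTS` and the function `apply_class_weights` are purely hypothetical, and the classifier itself is not modeled.

```python
# Minimal sketch under stated assumptions: the class names and gain values
# are hypothetical; the source does not report the actual weights.
CLASS_WEIGHTS = {
    "noise_only": 0.0,    # suppress the unit entirely
    "more_noise": 0.25,   # attenuate strongly
    "more_signal": 0.75,  # attenuate lightly
    "signal_only": 1.0,   # pass through unchanged
}

def apply_class_weights(magnitudes, labels, weights=CLASS_WEIGHTS):
    """Scale each time-frequency magnitude by the gain of its class label."""
    return [
        [m * weights[label] for m, label in zip(m_row, l_row)]
        for m_row, l_row in zip(magnitudes, labels)
    ]

# Toy spectrogram magnitudes with one label per time-frequency unit.
mags = [[2.0, 4.0], [1.0, 8.0]]
labels = [["noise_only", "more_noise"], ["more_signal", "signal_only"]]
enhanced = apply_class_weights(mags, labels)
```

In practice the labels would come from the trained four-class classifier, and the weighted magnitudes would be recombined with the noisy phase for resynthesis.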


2013 ◽  
Vol 74 (5) ◽  
pp. 770-781 ◽  
Author(s):  
Wenhao Yuan ◽  
Jiajun Lin ◽  
Wei An ◽  
Yu Wang ◽  
Ning Chen
