Environmental Attention-Guided Branchy Neural Network for Speech Enhancement

The performance of speech enhancement algorithms can be further improved by considering the application scenarios of speech products. In this paper, we propose an attention-based branchy neural network framework by incorporating the prior environmental information for noise reduction. In the whole denoising framework, first, an environment classification network is trained to distinguish the noise type of each noisy speech frame. Guided by this classification network, the denoising network gradually learns respective noise reduction abilities in different branches. Unlike most deep neural network (DNN)-based methods, which learn speech reconstruction capabilities with a common neural structure from all training noises, the proposed branchy model obtains greater performance benefits from the specially trained branches of prior known noise interference types. Experimental results show that the proposed branchy DNN model not only preserved better enhanced speech quality and intelligibility in seen noisy environments, but also obtained good generalization in unseen noisy environments.

Download Full-text

A Hybrid Approach for Single Channel Speech Enhancement using Deep Neural Network and Harmonic Regeneration Noise Reduction

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2020.0111033 ◽

2020 ◽

Vol 11 (10) ◽

Author(s):

Norezmi Jamal ◽

N. Fuad ◽

MNAH. Shaabani

Keyword(s):

Neural Network ◽

Noise Reduction ◽

Speech Enhancement ◽

Deep Neural Network ◽

Single Channel ◽

Hybrid Approach ◽

Harmonic Regeneration

Download Full-text

Deep neural network and noise classification-based speech enhancement

Modern Physics Letters B ◽

10.1142/s0217984917400966 ◽

2017 ◽

Vol 31 (19-21) ◽

pp. 1740096 ◽

Cited By ~ 10

Author(s):

Wenhua Shi ◽

Xiongwei Zhang ◽

Xia Zou ◽

Wei Han

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Gaussian Mixture ◽

Noise Type ◽

Noisy Observation ◽

Enhancement Method ◽

Noise Classification ◽

Objective Speech Quality ◽

The Relationship

In this paper, a speech enhancement method using noise classification and Deep Neural Network (DNN) was proposed. Gaussian mixture model (GMM) was employed to determine the noise type in speech-absent frames. DNN was used to model the relationship between noisy observation and clean speech. Once the noise type was determined, the corresponding DNN model was applied to enhance the noisy speech. GMM was trained with mel-frequency cepstrum coefficients (MFCC) and the parameters were estimated with an iterative expectation-maximization (EM) algorithm. Noise type was updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method could achieve better objective speech quality and smaller distortion under stationary and non-stationary conditions.

Download Full-text

Speech enhancement methods based on binaural cue coding

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-019-0164-x ◽

2019 ◽

Vol 2019 (1) ◽

Author(s):

Xianyun Wang ◽

Changchun Bao

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Speech Signal ◽

Deep Neural Network ◽

Experimental Results ◽

Channel Correlation ◽

Noisy Speech ◽

Time Frequency ◽

Improve Accuracy ◽

Better Than

AbstractAccording to the encoding and decoding mechanism of binaural cue coding (BCC), in this paper, the speech and noise are considered as left channel signal and right channel signal of the BCC framework, respectively. Subsequently, the speech signal is estimated from noisy speech when the inter-channel level difference (ICLD) and inter-channel correlation (ICC) between speech and noise are given. In this paper, exact inter-channel cues and the pre-enhanced inter-channel cues are used for speech restoration. The exact inter-channel cues are extracted from clean speech and noise, and the pre-enhanced inter-channel cues are extracted from the pre-enhanced speech and estimated noise. After that, they are combined one by one to form a codebook. Once the pre-enhanced cues are extracted from noisy speech, the exact cues are estimated by a mapping between the pre-enhanced cues and a prior codebook. Next, the estimated exact cues are used to obtain a time-frequency (T-F) mask for enhancing noisy speech based on the decoding of BCC. In addition, in order to further improve accuracy of the T-F mask based on the inter-channel cues, the deep neural network (DNN)-based method is proposed to learn the mapping relationship between input features of noisy speech and the T-F masks. Experimental results show that the codebook-driven method can achieve better performance than conventional methods, and the DNN-based method performs better than the codebook-driven method.

Download Full-text

Speech enhancement based on noise classification and deep neural network

Modern Physics Letters B ◽

10.1142/s0217984919501884 ◽

2019 ◽

Vol 33 (17) ◽

pp. 1950188

Author(s):

Wenbo Wang ◽

Houguang Liu ◽

Jianhua Yang ◽

Guohua Cao ◽

Chunli Hua

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Great Influence ◽

Activity Detection ◽

Noise Type ◽

Noise Classification ◽

The Voice ◽

Voice Activity ◽

Huge Variation

Deep neural network (DNN) has recently been successfully adopted as a regression model in speech enhancement. Nonetheless, training machines to adapt different noise is a challenging task. Because every noise has its own characteristics which can be combined with speech utterance to give huge variation on which the model has to operate on. Thus, a joint framework combining noise classification (NC) and speech enhancement using DNN was proposed. We first determined the noise type of contaminated speech by the voice activity detection (VAD)-DNN and the NC-DNN. Then based on the noise classification results, the corresponding SE-DNN model was applied to enhance the contaminated speech. In addition, in order to make method simpler, the structure of different DNNs was similar and the features were the same. Experimental results show that the proposed method effectively improved the performance of speech enhancement in complex noise environments. Besides, the accuracy of classification had a great influence on speech enhancement.

Download Full-text

Error Modeling via Asymmetric Laplace Distribution for Deep Neural Network Based Single-Channel Speech Enhancement

10.21437/interspeech.2018-1439 ◽

2018 ◽

Author(s):

Li Chai ◽

Jun Du ◽

Chin-Hui Lee

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Single Channel ◽

Laplace Distribution ◽

Error Modeling ◽

Asymmetric Laplace Distribution

Download Full-text

Speech Enhancement for Punjabi Language Using Deep Neural Network

2019 International Conference on Signal Processing and Communication (ICSC) ◽

10.1109/icsc45622.2019.8938309 ◽

2019 ◽

Author(s):

Jaspreet Singh ◽

Kamaldeep Kaur

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network

Download Full-text

A Perceptually Motivated Approach for Speech Enhancement Based on Deep Neural Network

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.e99.a.835 ◽

2016 ◽

Vol E99.A (4) ◽

pp. 835-838 ◽

Cited By ~ 2

Author(s):

Wei HAN ◽

Xiongwei ZHANG ◽

Gang MIN ◽

Meng SUN

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network

Download Full-text

NOISE-ADAPTIVE DEEP NEURAL NETWORK FOR SINGLE-CHANNEL SPEECH ENHANCEMENT

2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP) ◽

10.1109/mlsp.2018.8517027 ◽

2018 ◽

Cited By ~ 1

Author(s):

Hanwook Chung ◽

Taesup Kim ◽

Eric Plourde ◽

Benoit Champagne

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Single Channel

Download Full-text

An Improved Fully Convolutional Network Based on Post-Processing with Global Variance Equalization and Noise-Aware Training for Speech Enhancement

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2021.p0130 ◽

2021 ◽

Vol 25 (1) ◽

pp. 130-137

Author(s):

Wenlong Li ◽

◽

Kaoru Hirota ◽

Yaping Dai ◽

Zhiyang Jia

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Voice Conversion ◽

Post Processing ◽

Generalization Capability ◽

Convolutional Network ◽

Fully Convolutional Network ◽

Subjective Score ◽

Model Training

An improved fully convolutional network based on post-processing with global variance (GV) equalization and noise-aware training (PN-FCN) for speech enhancement model is proposed. It aims at reducing the complexity of the speech improvement system, and it solves overly smooth speech signal spectrogram problem and poor generalization capability. The PN-FCN is fed with the noisy speech samples augmented with an estimate of the noise. In this way, the PN-FCN uses additional online noise information to better predict the clean speech. Besides, PN-FCN uses the global variance information, which improve the subjective score in a voice conversion task. Finally, the proposed framework adopts FCN, and the number of parameters is one-seventh of deep neural network (DNN). Results of experiments on the Valentini-Botinhaos dataset demonstrate that the proposed framework achieves improvements in both denoising effect and model training speed.

Download Full-text

Single-Channel Speech Enhancement Based on Sparse Regressive Deep Neural Network

Software Engineering and Applications ◽

10.12677/sea.2017.61002 ◽

2017 ◽

Vol 06 (01) ◽

pp. 8-19

Author(s):

海霞孙

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Deep Neural Network ◽

Single Channel

Download Full-text