Time-Frequency Masking-Based Speech Enhancement Using Generative Adversarial Network

In the field of target classification, detecting a ground moving target that is easily obscured by clutter remains a challenge. In addition, traditional feature extraction techniques and classification methods usually rely on subjective factors and prior knowledge, which limits their generalization capacity. Most existing deep-learning-based methods suffer from insufficient feature learning due to a lack of data samples, which makes it difficult for the training process to converge to a steady state. To overcome these limitations, this paper proposes a Wasserstein generative adversarial network (WGAN) sample enhancement method for ground moving target classification (GMT-WGAN). First, the micro-Doppler characteristics of ground moving targets are analyzed. Next, a WGAN is constructed to generate effective time–frequency images of ground moving targets and thereby enrich the sample database used to train the classification network. Then, image quality evaluation indexes are introduced to evaluate the generated spectrogram samples and verify that the distributions of generated and real samples are similar. Afterward, feeding the augmented samples to deep convolutional neural networks with good generalization capacity improves the classification performance of GMT-WGAN. Finally, experiments conducted on different datasets validate the effectiveness and robustness of the proposed method.
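As a rough illustration of the Wasserstein objective mentioned in the abstract above, the sketch below writes out the standard WGAN critic and generator losses in NumPy. It is not the GMT-WGAN implementation: the function names, the use of weight clipping (rather than, say, a gradient penalty), and the clip value 0.01 are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Loss the critic minimizes: E[D(fake)] - E[D(real)].

    Minimizing this widens the gap between critic scores on real and
    generated samples, which approximates the Wasserstein distance.
    """
    return float(np.mean(scores_fake) - np.mean(scores_real))

def generator_loss(scores_fake):
    """Loss the generator minimizes: -E[D(fake)]."""
    return float(-np.mean(scores_fake))

def clip_weights(weights, c=0.01):
    """Clip every critic weight array to [-c, c] (assumed Lipschitz constraint)."""
    return [np.clip(w, -c, c) for w in weights]

# Hypothetical critic scores for a batch of real and generated spectrograms.
real = np.array([1.0, 2.0, 3.0])
fake = np.array([0.0, 0.0, 0.0])
print(critic_loss(real, fake))   # -2.0
print(generator_loss(fake))      # -0.0
```

In a training loop, the critic would typically be updated several times per generator update, with `clip_weights` applied after each critic step.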


Generative Adversarial Neural Network for Unsupervised Bearing Fault Detection

INTER-NOISE and NOISE-CON Congress and Conference Proceedings ◽ 10.3397/in-2021-2479 ◽ 2021 ◽ Vol 263 (3) ◽ pp. 3643-3648

Author(s): Gyuwon Kim ◽ Seungchul Lee

Keyword(s): Economic Loss ◽ Reconstruction Error ◽ Fault Diagnostics ◽ Safety Hazards ◽ Generative Adversarial Network ◽ Time Frequency ◽ Bearing Fault ◽ Electrical Systems ◽ Adversarial Network ◽ Short Time

Detecting bearing faults in advance is critical for preventing economic loss and safety hazards in mechanical and electrical systems. With the recent interest in artificial intelligence, deep learning (DL)-based methods have gained much attention in intelligent fault diagnostics and have mainly been developed in a supervised manner. While these works have shown promising results, supervised learning has several inherent drawbacks: data imbalance is a critical problem because faulty data is often scarce, data labeling is tedious, and unseen fault types cannot be detected. Herein, a generative adversarial network (GAN) is proposed to achieve unsupervised bearing fault diagnostics using only normal data. The proposed method first applies the short-time Fourier transform (STFT) to convert the 1-D vibration signals into 2-D time-frequency representations, which serve as the input to the DL framework. Subsequently, a GAN-based latent mapping is constructed using only the normal data, and faulty signals are detected using an anomaly metric composed of a discriminator error and an image reconstruction error. The performance of the method is verified on a classic rotating machinery dataset (the Case Western Reserve bearing dataset), and the experimental results demonstrate that the method can not only detect faults but also cluster them in the latent space with high accuracy.
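The two steps described in the abstract above (STFT conversion, then an anomaly metric built from two error terms) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the frame length, hop size, Hann window, the use of discriminator feature differences, and the weighted-sum form with `weight=0.1` are all assumptions, since the abstract only states that the metric combines a discriminator error and an image reconstruction error.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Convert a 1-D signal into a 2-D (freq_bins, frames) magnitude spectrogram."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice overlapping frames and apply the window to each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT per frame; transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def anomaly_score(image, reconstruction, feat_image, feat_reconstruction, weight=0.1):
    """Weighted sum of reconstruction error and discriminator (feature) error.

    feat_* are hypothetical discriminator activations for the input image
    and its GAN reconstruction; a higher score suggests a faulty signal.
    """
    recon_err = np.mean(np.abs(image - reconstruction))
    disc_err = np.mean(np.abs(feat_image - feat_reconstruction))
    return float((1.0 - weight) * recon_err + weight * disc_err)

# Synthetic 100 Hz vibration signal sampled at 1024 Hz (illustrative only).
fs = 1024
t = np.arange(fs) / fs
spec = stft_magnitude(np.sin(2 * np.pi * 100 * t))
print(spec.shape)  # (129, 7)
```

With these settings each frequency bin spans 4 Hz, so the 100 Hz tone peaks at bin 25. In practice the reconstruction would come from the generator after mapping the input spectrogram into the latent space learned from normal data.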


Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp.2018.8462322 ◽ 2018 ◽ Cited By ~ 6

Author(s): Santiago Pascual ◽ Maruchan Park ◽ Joan Serrà ◽ Antonio Bonafonte ◽ Kang-Hun Ahn

Keyword(s): Speech Enhancement ◽ Generative Adversarial Network ◽ Adversarial Network


CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp40776.2020.9054060 ◽ 2020

Author(s): Gang Liu ◽ Ke Gong ◽ Xiaodan Liang ◽ Zhiguang Chen

Keyword(s): Speech Enhancement ◽ Generative Adversarial Network ◽ Adversarial Network


Improved Wasserstein conditional generative adversarial network speech enhancement

EURASIP Journal on Wireless Communications and Networking ◽ 10.1186/s13638-018-1196-0 ◽ 2018 ◽ Vol 2018 (1) ◽ Cited By ~ 5

Author(s): Shan Qin ◽ Ting Jiang

Keyword(s): Speech Enhancement ◽ Generative Adversarial Network ◽ Adversarial Network


Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method

Applied Sciences ◽ 10.3390/app9163396 ◽ 2019 ◽ Vol 9 (16) ◽ pp. 3396 ◽ Cited By ~ 3

Author(s): Jianfeng Wu ◽ Yongzhu Hua ◽ Shengying Yang ◽ Hongshuai Qin ◽ Huibin Qin

Keyword(s): Neural Network ◽ Statistical Method ◽ Speech Enhancement ◽ Data Sets ◽ Generative Adversarial Network ◽ Adversarial Learning ◽ Noisy Speech ◽ Adversarial Network ◽ Knowledge Distillation ◽ Enhancement Algorithm

This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates knowledge distilled from a traditional statistical method. Unlike other DNN-based methods, which usually train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech set, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are retrained by distilling knowledge from the statistical method, an approach inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.


SEGAN: Speech Enhancement Generative Adversarial Network

10.21437/interspeech.2017-1428 ◽ 2017 ◽ Cited By ~ 147

Author(s): Santiago Pascual ◽ Antonio Bonafonte ◽ Joan Serrà

Keyword(s): Speech Enhancement ◽ Generative Adversarial Network ◽ Adversarial Network


Time-Frequency Masking-Based Speech Enhancement Using Generative Adversarial Network

Time-Frequency Mask-based Speech Enhancement using Convolutional Generative Adversarial Network

Self-Attention Generative Adversarial Network for Speech Enhancement

A Loss With Mixed Penalty for Speech Enhancement Generative Adversarial Network

GMT-WGAN: An Adversarial Sample Expansion Method for Ground Moving Targets Classification

Generative Adversarial Neural Network for Unsupervised Bearing Fault Detection

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement

Improved Wasserstein conditional generative adversarial network speech enhancement

Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method

SEGAN: Speech Enhancement Generative Adversarial Network
