SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features

Mapping Intimacies ◽

10.31219/osf.io/ubcft ◽

2021 ◽

Author(s):

Gwantae Kim ◽

David K. Han ◽

Hanseok Ko

Keyword(s):

Frequency Domain ◽

Speech Enhancement ◽

Data Augmentation ◽

Scene Classification ◽

Spectral Correlation ◽

Event Classification ◽

Time Frequency ◽

Mixed Sample ◽

Sound Event ◽

A mixed sample data augmentation strategy is proposed to enhance the performance of models on acoustic scene classification, sound event classification, and speech enhancement tasks. Although several augmentation methods have been shown to improve image classification performance, their efficacy on time-frequency domain features of audio is not assured. We propose a novel audio data augmentation approach named "SpecMix", designed specifically for time-frequency domain features. The method mixes two different data samples by applying time-frequency masks that are effective in preserving the spectral correlation of each audio sample. Our experiments on acoustic scene classification, sound event classification, and speech enhancement tasks show that the proposed SpecMix improves the performance of various neural network architectures by up to 2.7%.
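The mixing step described in the abstract can be illustrated with a short sketch. The abstract does not specify how the masks are built, so this is not the authors' implementation. It assumes the mask is a union of randomly placed bands that span the full time axis (frequency bands) or the full frequency axis (time bands), and that labels are mixed in proportion to the unmasked area, as is common in cut-and-paste augmentations. The function names `band_mask` and `specmix_like` and the parameters `n_freq_bands`, `n_time_bands`, and `max_width` are hypothetical.

```python
import numpy as np


def band_mask(shape, rng, n_freq_bands=1, n_time_bands=1, max_width=0.3):
    """Build a binary mask of shape (n_freq, n_time); 1 marks bins taken from sample B.

    Assumption: bands cover the full extent of the other axis, so each band
    keeps a contiguous slice of one spectrogram intact.
    """
    n_freq, n_time = shape
    mask = np.zeros(shape, dtype=np.float32)
    for _ in range(n_freq_bands):
        # Random band width (possibly zero) and start position along frequency.
        width = int(rng.integers(0, int(max_width * n_freq) + 1))
        start = int(rng.integers(0, n_freq - width + 1))
        mask[start:start + width, :] = 1.0
    for _ in range(n_time_bands):
        # Random band width (possibly zero) and start position along time.
        width = int(rng.integers(0, int(max_width * n_time) + 1))
        start = int(rng.integers(0, n_time - width + 1))
        mask[:, start:start + width] = 1.0
    return mask


def specmix_like(x_a, y_a, x_b, y_b, rng, **mask_kwargs):
    """Mix two spectrograms of equal shape and their label vectors.

    Assumption: the label weight of sample A equals the fraction of
    time-frequency bins that still come from sample A.
    """
    mask = band_mask(x_a.shape, rng, **mask_kwargs)
    lam = 1.0 - float(mask.mean())
    x = x_a * (1.0 - mask) + x_b * mask
    y = lam * y_a + (1.0 - lam) * y_b
    return x, y, mask
```

In a classification setting, `x_a` and `x_b` would be spectrograms (for example log-mel features) drawn from the same batch and `y_a`, `y_b` their one-hot labels. For speech enhancement, the same mask would presumably be applied to the clean targets instead of mixing labels, but the abstract does not say so.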

SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features

10.21437/interspeech.2021-103 ◽

2021 ◽

Author(s):

Gwantae Kim ◽

David K. Han ◽

Hanseok Ko

Keyword(s):

Frequency Domain ◽

Data Augmentation ◽

Time Frequency ◽

Mixed Sample ◽

ChannelMix: A Mixed Sample Data Augmentation Strategy for Image Classification

2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP) ◽

10.1109/icsp51882.2021.9408747 ◽

2021 ◽

Author(s):

Xu Cao ◽

HuanXin Zou ◽

XinYi Ying ◽

RunLin Li ◽

ShiTian He ◽

...

Keyword(s):

Image Classification ◽

Data Augmentation ◽

Mixed Sample ◽

Sample Data ◽

Augmentation Strategy

Neural speech enhancement in the time-frequency domain

2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718) ◽

10.1109/nnsp.2003.1318061 ◽

2003 ◽

Author(s):

M. Volkmer

Keyword(s):

Frequency Domain ◽

Speech Enhancement ◽

Continuous robust sound event classification using time-frequency features and deep learning

PLoS ONE ◽

10.1371/journal.pone.0182309 ◽

2017 ◽

Vol 12 (9) ◽

pp. e0182309 ◽

Author(s):

Ian McLoughlin ◽

Haomin Zhang ◽

Zhipeng Xie ◽

Yan Song ◽

Wei Xiao ◽

...

Keyword(s):

Deep Learning ◽

Event Classification ◽

Time Frequency ◽

Sound Event ◽

Frequency Features

Frequency Domain Estimation of Continuous Time Cointegrated Models with Mixed Frequency and Mixed Sample Data

Journal of Time Series Analysis ◽

10.1111/jtsa.12461 ◽

2019 ◽

Vol 40 (6) ◽

pp. 887-913 ◽

Author(s):

Marcus J. Chambers

Keyword(s):

Frequency Domain ◽

Continuous Time ◽

Mixed Frequency ◽

Mixed Sample ◽

Frequency domain estimation of cointegrating vectors with mixed frequency and mixed sample data

Journal of Econometrics ◽

10.1016/j.jeconom.2019.10.010 ◽

2020 ◽

Vol 217 (1) ◽

pp. 140-160

Author(s):

Marcus J. Chambers

Keyword(s):

Frequency Domain ◽

Mixed Frequency ◽

Mixed Sample ◽

Improved Speech Enhancement Using a Complex-Domain GAN with Fused Time-Domain and Time-Frequency Domain Constraints

10.21437/interspeech.2021-1134 ◽

2021 ◽

Author(s):

Feng Dang ◽

Pengyuan Zhang ◽

Hangting Chen

Keyword(s):

Frequency Domain ◽

Speech Enhancement ◽

Time Domain ◽

Complex Domain ◽

Time Frequency ◽

Domain Constraints

Phase reconstruction method based on time-frequency domain harmonic structure for speech enhancement

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2017.7953220 ◽

2017 ◽

Author(s):

Yukoh Wakabayashi ◽

Takahiro Fukumori ◽

Masato Nakayama ◽

Takanobu Nishiura ◽

Yoichi Yamashita

Keyword(s):

Frequency Domain ◽

Speech Enhancement ◽

Reconstruction Method ◽

Harmonic Structure ◽

Phase Reconstruction ◽

Sequence-Level Mixed Sample Data Augmentation

10.18653/v1/2020.emnlp-main.447 ◽

2020 ◽

Author(s):

Demi Guo ◽

Yoon Kim ◽

Alexander Rush

Keyword(s):

Data Augmentation ◽

Mixed Sample ◽

Speech enhancement in joint time-frequency domain based on real-valued discrete Gabor transform

2010 5th International Conference on Computer Science & Education ◽

10.1109/iccse.2010.5593404 ◽

2010 ◽

Author(s):

Jian Zhou ◽

Liang Tao

Keyword(s):

Frequency Domain ◽

Speech Enhancement ◽

Gabor Transform ◽
