Single-Channel Speech Enhancement in the Time Domain

Automatic sleep staging with only one channel is a challenging problem in sleep-related research. In this paper, a simple and efficient method named PPG-based multi-class automatic sleep staging (PMSS) is proposed using only a photoplethysmography (PPG) signal. Single-channel PPG data were obtained from four categories of subjects in the CAP sleep database. After the preprocessing of PPG data, feature extraction was performed from the time domain, frequency domain, and nonlinear domain, and a total of 21 features were extracted. Finally, the Light Gradient Boosting Machine (LightGBM) classifier was used for multi-class sleep staging. The accuracy of the multi-class automatic sleep staging was over 70%, and the Cohen’s kappa statistic k was over 0.6. This also showed that the PMSS method can also be applied to stage the sleep state for patients with sleep disorders.

Download Full-text

TSTNN: Two-Stage Transformer Based Neural Network for Speech Enhancement in the Time Domain

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9413740 ◽

2021 ◽

Author(s):

Kai Wang ◽

Bengbeng He ◽

Wei-Ping Zhu

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Time Domain ◽

Two Stage ◽

The Time Domain

Download Full-text

Multichannel Speech Enhancement in the Time Domain

SpringerBriefs in Electrical and Computer Engineering - Canonical Correlation Analysis in Speech Enhancement ◽

10.1007/978-3-319-67020-1_5 ◽

2017 ◽

pp. 59-77

Author(s):

Jacob Benesty ◽

Israel Cohen

Keyword(s):

Speech Enhancement ◽

Time Domain ◽

The Time Domain

Download Full-text

Densely Connected Neural Network with Dilated Convolutions for Real-Time Speech Enhancement in The Time Domain

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9054536 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ashutosh Pandey ◽

DeLiang Wang

Keyword(s):

Neural Network ◽

Real Time ◽

Speech Enhancement ◽

Time Domain ◽

The Time Domain

Download Full-text

Single-Channel Noise Reduction in the Time Domain

SpringerBriefs in Electrical and Computer Engineering - A Conceptual Framework for Noise Reduction ◽

10.1007/978-3-319-12955-6_3 ◽

2015 ◽

pp. 15-30

Author(s):

Jacob Benesty ◽

Jingdong Chen

Keyword(s):

Noise Reduction ◽

Time Domain ◽

Single Channel ◽

Channel Noise ◽

The Time Domain

Download Full-text

Enhancement of Single-Channel Periodic Signals in the Time-Domain

IEEE Transactions on Audio Speech and Language Processing ◽

10.1109/tasl.2012.2191957 ◽

2012 ◽

Vol 20 (7) ◽

pp. 1948-1963 ◽

Cited By ~ 29

Author(s):

Jesper Rindom Jensen ◽

Jacob Benesty ◽

Mads Græsbøll Christensen ◽

Søren Holdt Jensen

Keyword(s):

Time Domain ◽

Single Channel ◽

Periodic Signals ◽

The Time Domain

Download Full-text

On single-channel noise reduction in the time domain

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2011.5946394 ◽

2011 ◽

Cited By ~ 5

Author(s):

Jingdong Chen ◽

Jacob Benesty ◽

Yiteng Huang ◽

Tomas Gaensler

Keyword(s):

Noise Reduction ◽

Time Domain ◽

Single Channel ◽

Channel Noise ◽

The Time Domain

Download Full-text

Preterm-Term Birth Classification Using EMD-Based Time-Domain Features of Single-Channel Electrohysterogram Data

10.21203/rs.3.rs-570938/v1 ◽

2021 ◽

Author(s):

Suparerk Janjarasjitt

Keyword(s):

Preterm Birth ◽

Time Domain ◽

Single Channel ◽

Support Vector ◽

Computational Results ◽

Intrinsic Mode Functions ◽

Term Birth ◽

Emg Signal ◽

Mode Decomposition ◽

The Time Domain

Abstract The preterm birth anticipation is a crucial task that can reduce the rate of preterm birth and also the complications of preterm birth. Electrohysterogram (EHG) or uterine electromyogram (EMG) data have been evidenced that they can provide an information useful for preterm birth anticipation. Four distinct time-domain features, i.e., mean absolute value, average amplitude change, difference absolute standard deviation value, and log detector, commonly applied to EMG signal processing are applied and investigated in this study. A single-channel of EHG data is decomposed into its constituent components, i.e., intrinsic mode functions, using empirical mode decomposition (EMD) before their time-domain features are extracted. The time-domain features of intrinsic mode functions of EHG data associated with preterm and term births are applied for preterm-term birth classification using support vector machine (SVM) with a radial basis function. The preterm-term classifications are validated using 10-fold cross validation. From the computational results, it is shown that the excellent preterm-term birth classification can be achieved using a single-channel of EHG data. The computational results further suggest that the best overall performance on preterm-term birth classification is obtained when thirteen (out of sixteen) EMD-based time-domain features are applied. The best accuracy, sensitivity, specificity, and F1-score achieved are, respectively, 0.9382, 0.9130, 0.9634, and 0.9366.

Download Full-text

Time-Domain Joint Training Strategies of Speech Enhancement and Intent Classification Neural Models

Sensors ◽

10.3390/s22010374 ◽

2022 ◽

Vol 22 (1) ◽

pp. 374

Author(s):

Mohamed Nabih Ali ◽

Daniele Falavigna ◽

Alessio Brutti

Keyword(s):

Speech Enhancement ◽

Time Domain ◽

Background Noise ◽

State Of The Art ◽

Noisy Environments ◽

Neural Models ◽

Convolutional Network ◽

Environmental Perturbations ◽

Front End ◽

The Time Domain

Robustness against background noise and reverberation is essential for many real-world speech-based applications. One way to achieve this robustness is to employ a speech enhancement front-end that, independently of the back-end, removes the environmental perturbations from the target speech signal. However, although the enhancement front-end typically increases the speech quality from an intelligibility perspective, it tends to introduce distortions which deteriorate the performance of subsequent processing modules. In this paper, we investigate strategies for jointly training neural models for both speech enhancement and the back-end, which optimize a combined loss function. In this way, the enhancement front-end is guided by the back-end to provide more effective enhancement. Differently from typical state-of-the-art approaches employing on spectral features or neural embeddings, we operate in the time domain, processing raw waveforms in both components. As application scenario we consider intent classification in noisy environments. In particular, the front-end speech enhancement module is based on Wave-U-Net while the intent classifier is implemented as a temporal convolutional network. Exhaustive experiments are reported on versions of the Fluent Speech Commands corpus contaminated with noises from the Microsoft Scalable Noisy Speech Dataset, shedding light and providing insight about the most promising training approaches.

Download Full-text