Unsegmented Heart Sound Classification Using Hybrid CNN-LSTM Neural Networks

Author(s): Drishti Ramesh Megalmani, Shailesh B G, Achuth Rao M V, Satish S Jeevannavar, Prasanta Kumar Ghosh
2020, Vol. 130, pp. 22-32
Author(s): Muqing Deng, Tingting Meng, Jiuwen Cao, Shimin Wang, Jing Zhang, ...

2021
Author(s): George Zhou, Yunchan Chen, Candace Chien

Abstract
Background: The application of machine learning to cardiac auscultation has the potential to improve the accuracy and efficiency of both routine and point-of-care screenings. The use of Convolutional Neural Networks (CNN) on heart sound spectrograms in particular has defined state-of-the-art performance. However, the relative paucity of patient data remains a significant barrier to creating models that can adapt to the wide range of between-subject variability. To that end, we examined a CNN model's performance on automated heart sound classification, before and after various forms of data augmentation, and aimed to identify the optimal augmentation methods for cardiac spectrogram analysis.
Results: We built a standard CNN model to classify cardiac sound recordings as either normal or abnormal. The baseline control model achieved an ROC AUC of 0.945±0.016. Among the data augmentation techniques explored, horizontal flipping of the spectrogram image improved model performance the most, with an ROC AUC of 0.957±0.009. Principal component analysis color augmentation (PCA) and perturbations of saturation-value (SV) on the hue-saturation-value (HSV) color scale achieved ROC AUCs of 0.949±0.014 and 0.946±0.019, respectively. Time and frequency masking resulted in an ROC AUC of 0.948±0.012. Pitch shifting, time stretching and compressing, noise injection, vertical flipping, and applying random color filters all negatively impacted model performance.
Conclusion: Data augmentation can improve classification accuracy by expanding and diversifying the dataset, which protects against overfitting to random variance. However, data augmentation is necessarily domain specific. For example, methods like noise injection have found success in other areas of automated sound classification, but in the context of cardiac sound analysis, noise injection can mimic the presence of murmurs and worsen model performance. Thus, care should be taken to ensure clinically appropriate forms of data augmentation to avoid negatively impacting model performance.
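As a minimal sketch of the two best-performing augmentations described in this abstract (horizontal flipping and time/frequency masking) applied to a spectrogram treated as a 2-D array: the array shape, mask widths, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def horizontal_flip(spec):
    """Flip the spectrogram along the time axis (reverse the columns)."""
    return spec[:, ::-1]

def time_frequency_mask(spec, max_time_width=20, max_freq_width=8, rng=None):
    """Zero out one random block of time frames and one block of frequency bins
    (SpecAugment-style masking); mask widths here are assumed, not the paper's."""
    rng = rng or np.random.default_rng()
    out = spec.copy()
    n_freq, n_time = out.shape

    # Random contiguous time mask.
    t_width = int(rng.integers(0, max_time_width + 1))
    t_start = int(rng.integers(0, max(1, n_time - t_width)))
    out[:, t_start:t_start + t_width] = 0.0

    # Random contiguous frequency mask.
    f_width = int(rng.integers(0, max_freq_width + 1))
    f_start = int(rng.integers(0, max(1, n_freq - f_width)))
    out[f_start:f_start + f_width, :] = 0.0
    return out

# Example: augment a dummy 128-bin x 256-frame spectrogram.
spec = np.random.rand(128, 256).astype(np.float32)
augmented = [horizontal_flip(spec), time_frequency_mask(spec)]
```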


2021, Vol. 179, pp. 81-87
Author(s): Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Bens Pardamean

Author(s): Jinfang Zeng, Youming Li, Yu Zhang, Da Chen

Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. To date, a variety of signal processing and machine learning techniques have been applied to the ESC task, including matrix factorization, dictionary learning, wavelet filterbanks, and deep neural networks. It is observed that features extracted from deeper networks tend to achieve higher performance than those extracted from shallow networks. However, in the ESC task, only deep convolutional neural networks (CNNs) with a handful of layers have been used, and residual networks have been ignored, which leads to degraded performance. A possible explanation for the limited exploration of CNNs, and for the difficulty of improving on simpler models, is the relative scarcity of labeled data for ESC. In this paper, a residual network called EnvResNet is proposed for the ESC task. In addition, we propose to use audio data augmentation to overcome the problem of data scarcity. Experiments are performed on the ESC-50 database. Combined with data augmentation, the proposed model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches in terms of classification accuracy.
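To illustrate the residual idea the abstract contrasts with plain stacked CNNs, here is a minimal sketch of a 2-D residual block of the kind a network such as EnvResNet might stack over log-mel spectrogram inputs; the channel counts, strides, and input shape are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic 2-D residual block: two conv layers plus an identity (or 1x1) shortcut."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Shortcut path: identity when shapes match, otherwise a 1x1 projection.
        self.shortcut = nn.Identity()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Residual connection: add the shortcut before the final activation.
        return self.relu(out + self.shortcut(x))

# Example: a batch of 8 single-channel 128x128 log-mel spectrograms (assumed shape).
x = torch.randn(8, 1, 128, 128)
block = ResidualBlock(in_channels=1, out_channels=16, stride=2)
print(block(x).shape)  # torch.Size([8, 16, 64, 64])
```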

