An Interpretable Deep Learning Model for Automatic Sound Classification

Pablo Zinemanas; Martín Rocamora; Marius Miron; Frederic Font; Xavier Serra

doi:10.3390/electronics10070850

An Interpretable Deep Learning Model for Automatic Sound Classification

Electronics ◽

10.3390/electronics10070850 ◽

2021 ◽

Vol 10 (7) ◽

pp. 850

Author(s):

Pablo Zinemanas ◽

Martín Rocamora ◽

Marius Miron ◽

Frederic Font ◽

Xavier Serra

Keyword(s):

Deep Learning ◽

Web Application ◽

Domain Knowledge ◽

Learning Model ◽

Unintended Effects ◽

Learning Models ◽

Time Frequency ◽

Sound Classification ◽

Proposed Model ◽

Deep Learning Model

Deep learning models have improved cutting-edge technologies in many research areas, but their black-box structure makes it difficult to understand their inner workings and the rationale behind their predictions. This may lead to unintended effects, such as being susceptible to adversarial attacks or the reinforcement of biases. There is still a lack of research in the audio domain, despite the increasing interest in developing deep learning models that provide explanations of their decisions. To reduce this gap, we propose a novel interpretable deep learning model for automatic sound classification, which explains its predictions based on the similarity of the input to a set of learned prototypes in a latent space. We leverage domain knowledge by designing a frequency-dependent similarity measure and by considering different time-frequency resolutions in the feature space. The proposed model achieves results that are comparable to that of the state-of-the-art methods in three different sound classification tasks involving speech, music, and environmental audio. In addition, we present two automatic methods to prune the proposed model that exploit its interpretability. Our system is open source and it is accompanied by a web application for the manual editing of the model, which allows for a human-in-the-loop debugging approach.

Download Full-text

Optimising Deep Learning at the Edge for Accurate Hourly Air Quality Prediction

Sensors ◽

10.3390/s21041064 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1064

Author(s):

I Nyoman Kusuma Wardana ◽

Julian W. Gardner ◽

Suhaib A. Fahmy

Keyword(s):

Deep Learning ◽

Air Quality ◽

Short Term Memory ◽

Model Performance ◽

Learning Model ◽

Sensor Data ◽

Raspberry Pi ◽

Learning Models ◽

Proposed Model ◽

Deep Learning Model

Accurate air quality monitoring requires processing of multi-dimensional, multi-location sensor data, which has previously been considered in centralised machine learning models. These are often unsuitable for resource-constrained edge devices. In this article, we address this challenge by: (1) designing a novel hybrid deep learning model for hourly PM2.5 pollutant prediction; (2) optimising the obtained model for edge devices; and (3) examining model performance running on the edge devices in terms of both accuracy and latency. The hybrid deep learning model in this work comprises a 1D Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) to predict hourly PM2.5 concentration. The results show that our proposed model outperforms other deep learning models, evaluated by calculating RMSE and MAE errors. The proposed model was optimised for edge devices, the Raspberry Pi 3 Model B+ (RPi3B+) and Raspberry Pi 4 Model B (RPi4B). This optimised model reduced file size to a quarter of the original, with further size reduction achieved by implementing different post-training quantisation. In total, 8272 hourly samples were continuously fed to the edge device, with the RPi4B executing the model twice as fast as the RPi3B+ in all quantisation modes. Full-integer quantisation produced the lowest execution time, with latencies of 2.19 s and 4.73 s for RPi4B and RPi3B+, respectively.

Download Full-text

An Optimized Hybrid Deep Learning Model to Detect COVID-19 Misleading Information

Computational Intelligence and Neuroscience ◽

10.1155/2021/9615034 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Bader Alouffi ◽

Abdullah Alharbi ◽

Radhya Sahal ◽

Hager Saleh

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Short Term Memory ◽

Learning Model ◽

Dense Layer ◽

Fake News ◽

Learning Models ◽

Proposed Model ◽

Deep Learning Model ◽

Machine Learning Models

Fake news is challenging to detect due to mixing accurate and inaccurate information from reliable and unreliable sources. Social media is a data source that is not trustworthy all the time, especially in the COVID-19 outbreak. During the COVID-19 epidemic, fake news is widely spread. The best way to deal with this is early detection. Accordingly, in this work, we have proposed a hybrid deep learning model that uses convolutional neural network (CNN) and long short-term memory (LSTM) to detect COVID-19 fake news. The proposed model consists of some layers: an embedding layer, a convolutional layer, a pooling layer, an LSTM layer, a flatten layer, a dense layer, and an output layer. For experimental results, three COVID-19 fake news datasets are used to evaluate six machine learning models, two deep learning models, and our proposed model. The machine learning models are DT, KNN, LR, RF, SVM, and NB, while the deep learning models are CNN and LSTM. Also, four matrices are used to validate the results: accuracy, precision, recall, and F1-measure. The conducted experiments show that the proposed model outperforms the six machine learning models and the two deep learning models. Consequently, the proposed system is capable of detecting the fake news of COVID-19 significantly.

Download Full-text

Micro Activities Recognition in Uncontrolled Environments

Applied Sciences ◽

10.3390/app112110327 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10327

Author(s):

Ali Abbas ◽

Michael Haslgrübler ◽

Abdul Mannan Dogar ◽

Alois Ferscha

Keyword(s):

Deep Learning ◽

Real World ◽

Image Understanding ◽

Learning Model ◽

Assembly Process ◽

Learning Models ◽

Automated Teller Machines ◽

Proposed Model ◽

And Control ◽

Deep Learning Model

Deep learning has proven to be very useful for the image understanding in efficient manners. Assembly of complex machines is very common in industries. The assembly of automated teller machines (ATM) is one of the examples. There exist deep learning models which monitor and control the assembly process. To the best of our knowledge, there exists no deep learning models for real environments where we have no control over the working style of workers and the sequence of assembly process. In this paper, we presented a modified deep learning model to control the assembly process in a real-world environment. For this study, we have a dataset which was generated in a real-world uncontrolled environment. During the dataset generation, we did not have any control over the sequence of assembly steps. We applied four different states of the art deep learning models to control the assembly of ATM. Due to the nature of uncontrolled environment dataset, we modified the deep learning models to fit for the task. We not only control the sequence, our proposed model will give feedback in case of any missing step in the required workflow. The contributions of this research are accurate anomaly detection in the assembly process in a real environment, modifications in existing deep learning models according to the nature of the data and normalization of the uncontrolled data for the training of deep learning model. The results show that we can generalize and control the sequence of assembly steps, because even in an uncontrolled environment, there are some specific activities, which are repeated over time. If we can recognize and map the micro activities to macro activities, then we can successfully monitor and optimize the assembly process.

Download Full-text

Multi-Modal Evolutionary Deep Learning Model for Ovarian Cancer Diagnosis

Symmetry ◽

10.3390/sym13040643 ◽

2021 ◽

Vol 13 (4) ◽

pp. 643

Author(s):

Rania M. Ghoniem ◽

Abeer D. Algarni ◽

Basel Refky ◽

Ahmed A. Ewees

Keyword(s):

Ovarian Cancer ◽

Deep Learning ◽

Short Term Memory ◽

Computational Cost ◽

Learning Model ◽

Learning Models ◽

Lung Cancers ◽

Proposed Model ◽

Linear Aggregation ◽

Deep Learning Model

Ovarian cancer (OC) is a common reason for mortality among women. Deep learning has recently proven better performance in predicting OC stages and subtypes. However, most of the state-of-the-art deep learning models employ single modality data, which may afford low-level performance due to insufficient representation of important OC characteristics. Furthermore, these deep learning models still lack to the optimization of the model construction, which requires high computational cost to train and deploy them. In this work, a hybrid evolutionary deep learning model, using multi-modal data, is proposed. The established multi-modal fusion framework amalgamates gene modality alongside with histopathological image modality. Based on the different states and forms of each modality, we set up deep feature extraction network, respectively. This includes a predictive antlion-optimized long-short-term-memory model to process gene longitudinal data. Another predictive antlion-optimized convolutional neural network model is included to process histopathology images. The topology of each customized feature network is automatically set by the antlion optimization algorithm to make it realize better performance. After that the output from the two improved networks is fused based upon weighted linear aggregation. The deep fused features are finally used to predict OC stage. A number of assessment indicators was used to compare the proposed model to other nine multi-modal fusion models constructed using distinct evolutionary algorithms. This was conducted using a benchmark for OC and two benchmarks for breast and lung cancers. The results reveal that the proposed model is more precise and accurate in diagnosing OC and the other cancers.

Download Full-text

Performance Comparison of the Deep Learning and the Human Endoscopist for Bleeding Peptic Ulcer Disease

Journal of Medical and Biological Engineering ◽

10.1007/s40846-021-00608-0 ◽

2021 ◽

Author(s):

Hsu-Heng Yen ◽

Ping-Yu Wu ◽

Pei-Yuan Su ◽

Chia-Wei Yang ◽

Yang-Yuan Chen ◽

...

Keyword(s):

Peptic Ulcer ◽

Deep Learning ◽

Sensitivity And Specificity ◽

Endoscopic Therapy ◽

Learning Model ◽

Ulcer Bleeding ◽

Learning Models ◽

Peptic Ulcer Bleeding ◽

Ulcer Disease ◽

Deep Learning Model

Abstract Purpose Management of peptic ulcer bleeding is clinically challenging. Accurate characterization of the bleeding during endoscopy is key for endoscopic therapy. This study aimed to assess whether a deep learning model can aid in the classification of bleeding peptic ulcer disease. Methods Endoscopic still images of patients (n = 1694) with peptic ulcer bleeding for the last 5 years were retrieved and reviewed. Overall, 2289 images were collected for deep learning model training, and 449 images were validated for the performance test. Two expert endoscopists classified the images into different classes based on their appearance. Four deep learning models, including Mobile Net V2, VGG16, Inception V4, and ResNet50, were proposed and pre-trained by ImageNet with the established convolutional neural network algorithm. A comparison of the endoscopists and trained deep learning model was performed to evaluate the model’s performance on a dataset of 449 testing images. Results The results first presented the performance comparisons of four deep learning models. The Mobile Net V2 presented the optimal performance of the proposal models. The Mobile Net V2 was chosen for further comparing the performance with the diagnostic results obtained by one senior and one novice endoscopists. The sensitivity and specificity were acceptable for the prediction of “normal” lesions in both 3-class and 4-class classifications. For the 3-class category, the sensitivity and specificity were 94.83% and 92.36%, respectively. For the 4-class category, the sensitivity and specificity were 95.40% and 92.70%, respectively. The interobserver agreement of the testing dataset of the model was moderate to substantial with the senior endoscopist. The accuracy of the determination of endoscopic therapy required and high-risk endoscopic therapy of the deep learning model was higher than that of the novice endoscopist. Conclusions In this study, the deep learning model performed better than inexperienced endoscopists. Further improvement of the model may aid in clinical decision-making during clinical practice, especially for trainee endoscopist.

Download Full-text

Transfer Learning of The ResNet-18 and DenseNet-121 Model Used to Diagnose Intracranial Hemorrhage in CT Scanning

Current Pharmaceutical Design ◽

10.2174/1381612827666211213143357 ◽

2021 ◽

Vol 27 ◽

Author(s):

Qi Zhou ◽

Wenjie Zhu ◽

Fuchen Li ◽

Mingqing Yuan ◽

Linfeng Zheng ◽

...

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Intracranial Hemorrhage ◽

Intraventricular Hemorrhage ◽

Learning Model ◽

Normal Group ◽

Subdural Hemorrhage ◽

Model Assessment ◽

Learning Models ◽

Deep Learning Model

Objective: To verify the ability of the deep learning model in identifying five subtypes and normal images in noncontrast enhancement CT of intracranial hemorrhage. Method: A total of 351 patients (39 patients in the normal group, 312 patients in the intracranial hemorrhage group) performed with intracranial hemorrhage noncontrast enhanced CT were selected, with 2768 images in total (514 images for the normal group, 398 images for the epidural hemorrhage group, 501 images for the subdural hemorrhage group, 497 images for the intraventricular hemorrhage group, 415 images for the cerebral parenchymal hemorrhage group, and 443 images for the subarachnoid hemorrhage group). Based on the diagnostic reports of two radiologists with more than 10 years of experience, the ResNet-18 and DenseNet-121 deep learning models were selected. Transfer learning was used. 80% of the data was used for training models, 10% was used for validating model performance against overfitting, and the last 10% was used for the final evaluation of the model. Assessment indicators included accuracy, sensitivity, specificity, and AUC values. Results: The overall accuracy of ResNet-18 and DenseNet-121 models were 89.64% and 82.5%, respectively. The sensitivity and specificity of identifying five subtypes and normal images were above 0.80. The sensitivity of DenseNet-121 model to recognize intraventricular hemorrhage and cerebral parenchymal hemorrhage was lower than 0.80, 0.73, and 0.76 respectively. The AUC values of the two deep learning models were above 0.9. Conclusion: The deep learning model can accurately identify the five subtypes of intracranial hemorrhage and normal images, and it can be used as a new tool for clinical diagnosis in the future.

Download Full-text

A Physics-Infused Deep Learning Model for the Prediction of Refractive Indices and Its Use for the Large-Scale Screening of Organic Compound Space

10.26434/chemrxiv.8796950 ◽

2019 ◽

Author(s):

Mojtaba Haghighatlari ◽

Gaurav Vishwakarma ◽

Mohammad Atif Faiz Afzal ◽

Johannes Hachmann

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Large Scale ◽

Organic Molecules ◽

Learning Model ◽

Training Data ◽

Refractive Indices ◽

Learning Models ◽

Deep Learning Model ◽

Machine Learning Models

<div><div><div><p>We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.</p></div></div></div>

Download Full-text

Abstract P319: Can Deep Learning Find the Ischemic Core on CT? Transfer Learning From Pre-Trained MRI-Based Networks

Stroke ◽

10.1161/str.52.suppl_1.p319 ◽

2021 ◽

Vol 52 (Suppl_1) ◽

Author(s):

Yannan Yu ◽

Soren Christensen ◽

Yuan Xie ◽

Enhao Gong ◽

Maarten G Lansberg ◽

...

Keyword(s):

Deep Learning ◽

Ground Truth ◽

Learning Model ◽

Fine Tuning ◽

Learning Models ◽

Starting Point ◽

Stroke Lesion ◽

Ischemic Core ◽

Deep Learning Model

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with gold standard diffusion-weighted imaging (DWI). We evaluated if a deep learning model to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Method: Using the multi-center CRISP cohort of acute ischemic stroke patient with CTP before thrombectomy, we included patients with major reperfusion (TICI score≥2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters including Tmax, mean transient time, cerebral blood flow (CBF), and cerebral blood volume were reconstructed by RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. A previously trained MRI model in a separate group of patients was used as a starting point, which used MRP parameters as input and RAPID ischemic core on DWI as ground truth. We fine-tuned this model, using CTP parameters as input, and follow-up MRI as ground truth. Another model was also trained from scratch with only CTP data. 5-fold cross validation was used. Performance of the models was compared with ischemic core (rCBF≤30%) from RAPID software to identify the presence of a large infarct (volume>70 or >100ml). Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67±15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% in infarct >70ml, and 69% in >100ml; the MRI model fine-tuned on CT improved the agreement to 77% and 73%; The CT model trained from scratch had agreements of 73% and 71%; All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and figure. Conclusions: It is feasible to apply MRP-based deep learning model to CT. Fine-tuning with CTP data further improves the predictions. All deep learning models predict the stroke lesion after major recanalization better than thresholding approaches based on rCBF.

Download Full-text

Content Noise Detection Model Using Deep Learning in Web Forums

Sustainability ◽

10.3390/su12125074 ◽

2020 ◽

Vol 12 (12) ◽

pp. 5074

Author(s):

Jiyoung Woo ◽

Jaeseok Yun

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Model ◽

Detection Model ◽

Proposed Model ◽

Web Forum ◽

Web Forums ◽

Conventional Machine ◽

Text Features ◽

Deep Learning Model

Spam posts in web forum discussions cause user inconvenience and lower the value of the web forum as an open source of user opinion. In this regard, as the importance of a web post is evaluated in terms of the number of involved authors, noise distorts the analysis results by adding unnecessary data to the opinion analysis. Here, in this work, an automatic detection model for spam posts in web forums using both conventional machine learning and deep learning is proposed. To automatically differentiate between normal posts and spam, evaluators were asked to recognize spam posts in advance. To construct the machine learning-based model, text features from posted content using text mining techniques from the perspective of linguistics were extracted, and supervised learning was performed to distinguish content noise from normal posts. For the deep learning model, raw text including and excluding special characters was utilized. A comparison analysis on deep neural networks using the two different recurrent neural network (RNN) models of the simple RNN and long short-term memory (LSTM) network was also performed. Furthermore, the proposed model was applied to two web forums. The experimental results indicate that the deep learning model affords significant improvements over the accuracy of conventional machine learning associated with text features. The accuracy of the proposed model using LSTM reaches 98.56%, and the precision and recall of the noise class reach 99% and 99.53%, respectively.

Download Full-text

ECOVNet: a highly effective ensemble based deep learning model for detecting COVID-19

PeerJ Computer Science ◽

10.7717/peerj-cs.551 ◽

2021 ◽

Vol 7 ◽

pp. e551

Author(s):

Nihad Karim Chowdhury ◽

Muhammad Ashad Kabir ◽

Md. Muhtadir Rahman ◽

Noortaz Rezoana

Keyword(s):

Deep Learning ◽

Detection System ◽

Learning Model ◽

Classification Performance ◽

Data Sets ◽

X Rays ◽

Proposed Model ◽

Highly Effective ◽

Chest X Ray ◽

Deep Learning Model

The goal of this research is to develop and implement a highly effective deep learning model for detecting COVID-19. To achieve this goal, in this paper, we propose an ensemble of Convolutional Neural Network (CNN) based on EfficientNet, named ECOVNet, to detect COVID-19 from chest X-rays. To make the proposed model more robust, we have used one of the largest open-access chest X-ray data sets named COVIDx containing three classes—COVID-19, normal, and pneumonia. For feature extraction, we have applied an effective CNN structure, namely EfficientNet, with ImageNet pre-training weights. The generated features are transferred into custom fine-tuned top layers followed by a set of model snapshots. The predictions of the model snapshots (which are created during a single training) are consolidated through two ensemble strategies, i.e., hard ensemble and soft ensemble, to enhance classification performance. In addition, a visualization technique is incorporated to highlight areas that distinguish classes, thereby enhancing the understanding of primal components related to COVID-19. The results of our empirical evaluations show that the proposed ECOVNet model outperforms the state-of-the-art approaches and significantly improves detection performance with 100% recall for COVID-19 and overall accuracy of 96.07%. We believe that ECOVNet can enhance the detection of COVID-19 disease, and thus, underpin a fully automated and efficacious COVID-19 detection system.

Download Full-text