A Comparison of Deep Learning Architectures for Automatic Gender Recognition from Audio Signals

Mapping Intimacies ◽

10.5753/eniac.2021.18297 ◽

2021 ◽

Author(s):

Alef Iury S. Ferreira ◽

Frederico S. Oliveira ◽

Nádia F. Felipe da Silva ◽

Anderson S. Soares

Keyword(s):

Deep Learning ◽

Gender Recognition ◽

Audio Signals ◽

Fully Connected ◽

Learning Architectures

O reconhecimento de gênero a partir da fala é um problema relacionado à análise de fala humana, e possui diversas aplicações que vão desde a personalização na recomendação de produtos à ciência forense. A identificação da eficiência e custos de diferentes abordagens que lidam com esse problema é imprescindível. Este trabalho tem como foco investigar e comparar a eficiência e custos de diferentes arquiteturas de deep learning para o reconhecimento de gênero a partir da fala. Os resultados mostram que o modelo convolucional unidimensional consegue os melhores resultados. No entanto, constatou-se que o modelo fully connected apresentou resultados próximos com menor custo, tanto no uso de memória, quanto no tempo de treinamento.

Download Full-text

Cohort selection for clinical trials using deep learning models

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocz139 ◽

2019 ◽

Vol 26 (11) ◽

pp. 1181-1188 ◽

Cited By ~ 7

Author(s):

Isabel Segura-Bedmar ◽

Pablo Raez

Keyword(s):

Neural Network ◽

Clinical Trials ◽

Deep Learning ◽

Statistical Significance ◽

Hybrid Architecture ◽

Cohort Selection ◽

Selection For ◽

Fully Connected ◽

The Impact ◽

Learning Architectures

Abstract Objective The goal of the 2018 n2c2 shared task on cohort selection for clinical trials (track 1) is to identify which patients meet the selection criteria for clinical trials. Cohort selection is a particularly demanding task to which natural language processing and deep learning can make a valuable contribution. Our goal is to evaluate several deep learning architectures to deal with this task. Materials and Methods Cohort selection can be formulated as a multilabeling problem whose goal is to determine which criteria are met for each patient record. We explore several deep learning architectures such as a simple convolutional neural network (CNN), a deep CNN, a recurrent neural network (RNN), and CNN-RNN hybrid architecture. Although our architectures are similar to those proposed in existing deep learning systems for text classification, our research also studies the impact of using a fully connected feedforward layer on the performance of these architectures. Results The RNN and hybrid models provide the best results, though without statistical significance. The use of the fully connected feedforward layer improves the results for all the architectures, except for the hybrid architecture. Conclusions Despite the limited size of the dataset, deep learning methods show promising results in learning useful features for the task of cohort selection. Therefore, they can be used as a previous filter for cohort selection for any clinical trial with a minimum of human intervention, thus reducing the cost and time of clinical trials significantly.

Download Full-text

Image based flower species classification using CNN

Journal of Innovations in Engineering Education ◽

10.3126/jiee.v2i1.36670 ◽

2019 ◽

Vol 2 (1) ◽

pp. 182-186

Author(s):

Santosh Giri

Keyword(s):

Deep Learning ◽

Image Classification ◽

Training System ◽

Species Classification ◽

Testing Dataset ◽

Machine Learning Applications ◽

Testing Accuracy ◽

Feature Values ◽

Fully Connected ◽

Learning Architectures

Deep learning is one of the essential parts of machine learning. Applications such as image classification, text recognition, object detection etc. used deep learning architectures. In this paper neural network model was designed for image classification. A NN classifier with one fully connected layer and one softmax layer was designed and feature extraction part of inception v3 model was reused to calculate the feature value of each images. And by using these feature values the NN classifier was trained. By adopting transfer learning mechanism NN classifier was trained with 17 classes of oxford 17 flower image dataset. The system provided final training accuracy of 99 %. After training, system was evaluated with testing dataset images. The mean testing accuracy was 86.4%.

Download Full-text

Evaluating reproducibility of AI algorithms in digital pathology with DAPPER

10.1101/340646 ◽

2018 ◽

Cited By ~ 2

Author(s):

Andrea Bizzego ◽

Nicole Bussola ◽

Marco Chierici ◽

Marco Cristoforetti ◽

Margherita Francescatto ◽

...

Keyword(s):

Deep Learning ◽

Digital Pathology ◽

Predictive Biomarkers ◽

Tissue Expression ◽

Benchmark Dataset ◽

Deep Learning Features ◽

Fully Connected ◽

Learning Architectures ◽

Whole Slide Images ◽

Data Analysis Plan

AbstractArtificial Intelligence is exponentially increasing its impact on healthcare. As deep learning is mastering computer vision tasks, its application to digital pathology is natural, with the promise of aiding in routine reporting and standardizing results across trials. Deep learning features inferred from digital pathology scans can improve validity and robustness of current clinico-pathological features, up to identifying novel histological patterns, e.g. from tumor infiltrating lymphocytes. In this study, we examine the issue of evaluating accuracy of predictive models from deep learning features in digital pathology, as an hallmark of reproducibility. We introduce the DAPPER framework for validation based on a rigorous Data Analysis Plan derived from the FDA’s MAQC project, designed to analyse causes of variability in predictive biomarkers. We apply the framework on models that identify tissue of origin on 787 Whole Slide Images from the Genotype-Tissue Expression (GTEx) project. We test 3 different deep learning architectures (VGG, ResNet, Inception) as feature extractors and three classifiers (a fully connected multilayer, Support Vector Machine and Random Forests) and work with 4 datasets (5, 10, 20 or 30 classes), for a total 53000 tiles at 512 × 512 resolution. We analyze accuracy and feature stability of the machine learning classifiers, also demonstrating the need for random features and random labels diagnostic tests to identify selection bias and risks for reproducibility. Further, we use the deep features from the VGG model from GTEx on the KIMIA24 dataset for identification of slide of origin (24 classes) to train a classifier on 1060 annotated tiles and validated on 265 unseen ones. The DAPPER software, including its deep learning backbone pipeline and the HINT (Histological Imaging - Newsy Tiles) benchmark dataset derived from GTEx, is released as a basis for standardization and validation initiatives in AI for Digital Pathology.Author summaryIn this study, we examine the issue of evaluating accuracy of predictive models from deep learning features in digital pathology, as an hallmark of reproducibility. It is indeed a top priority that reproducibility-by-design gets adopted as standard practice in building and validating AI methods in the healthcare domain. Here we introduce DAPPER, a first framework to evaluate deep features and classifiers in digital pathology, based on a rigorous data analysis plan originally developed in the FDA’s MAQC initiative for predictive biomarkers from massive omics data. We apply DAPPER on models trained to identify tissue of origin from the HINT benchmark dataset of 53000 tiles from 787 Whole Slide Images in the Genotype-Tissue Expression (GTEx) project. We analyze accuracy and feature stability of different deep learning architectures (VGG, ResNet and Inception) as feature extractors and classifiers (a fully connected multilayer, SVMs and Random Forests) on up to 20 classes. Further, we use the deep features from the VGG model (trained on HINT) on the 1300 annotated tiles of the KIMIA24 dataset for identification of slide of origin (24 classes). The DAPPER software is available together with the HINT benchmark dataset.

Download Full-text

A Generalization Performance Study Using Deep Learning Networks in Embedded Systems

Sensors ◽

10.3390/s21041031 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1031

Author(s):

Joseba Gorospe ◽

Rubén Mulero ◽

Olatz Arbelaitz ◽

Javier Muguerza ◽

Miguel Ángel Antón

Keyword(s):

Deep Learning ◽

Embedded Systems ◽

Embedded System ◽

General Purpose ◽

Learning Networks ◽

Performance Study ◽

Learning Techniques ◽

Wide Range ◽

Learning Architectures

Deep learning techniques are being increasingly used in the scientific community as a consequence of the high computational capacity of current systems and the increase in the amount of data available as a result of the digitalisation of society in general and the industrial world in particular. In addition, the immersion of the field of edge computing, which focuses on integrating artificial intelligence as close as possible to the client, makes it possible to implement systems that act in real time without the need to transfer all of the data to centralised servers. The combination of these two concepts can lead to systems with the capacity to make correct decisions and act based on them immediately and in situ. Despite this, the low capacity of embedded systems greatly hinders this integration, so the possibility of being able to integrate them into a wide range of micro-controllers can be a great advantage. This paper contributes with the generation of an environment based on Mbed OS and TensorFlow Lite to be embedded in any general purpose embedded system, allowing the introduction of deep learning architectures. The experiments herein prove that the proposed system is competitive if compared to other commercial systems.

Download Full-text

Automatic Segmentation of Choroid Layer Using Deep Learning on Spectral Domain Optical Coherence Tomography

Applied Sciences ◽

10.3390/app11125488 ◽

2021 ◽

Vol 11 (12) ◽

pp. 5488

Author(s):

Wei Ping Hsia ◽

Siu Lun Tse ◽

Chia Jen Chang ◽

Yu Len Huang

Keyword(s):

Optical Coherence Tomography ◽

Deep Learning ◽

Choroidal Thickness ◽

Automatic Segmentation ◽

Average Error ◽

Good Prediction ◽

Optical Coherence ◽

Learning Method ◽

Subfoveal Choroidal Thickness ◽

Fully Connected

The purpose of this article is to evaluate the accuracy of the optical coherence tomography (OCT) measurement of choroidal thickness in healthy eyes using a deep-learning method with the Mask R-CNN model. Thirty EDI-OCT of thirty patients were enrolled. A mask region-based convolutional neural network (Mask R-CNN) model composed of deep residual network (ResNet) and feature pyramid networks (FPNs) with standard convolution and fully connected heads for mask and box prediction, respectively, was used to automatically depict the choroid layer. The average choroidal thickness and subfoveal choroidal thickness were measured. The results of this study showed that ResNet 50 layers deep (R50) model and ResNet 101 layers deep (R101). R101 U R50 (OR model) demonstrated the best accuracy with an average error of 4.85 pixels and 4.86 pixels, respectively. The R101 ∩ R50 (AND model) took the least time with an average execution time of 4.6 s. Mask-RCNN models showed a good prediction rate of choroidal layer with accuracy rates of 90% and 89.9% for average choroidal thickness and average subfoveal choroidal thickness, respectively. In conclusion, the deep-learning method using the Mask-RCNN model provides a faster and accurate measurement of choroidal thickness. Comparing with manual delineation, it provides better effectiveness, which is feasible for clinical application and larger scale of research on choroid.

Download Full-text

Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures.

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2021.106130 ◽

2021 ◽

pp. 106130

Author(s):

Candelaria Mosquera ◽

Facundo Nahuel Diaz ◽

Fernando Binder ◽

José Martín Rabellino ◽

Sonia Elizabeth Benitez ◽

...

Keyword(s):

Deep Learning ◽

Clinical Implementation ◽

X Ray ◽

Different Types ◽

Chest X Ray ◽

Learning Architectures

Download Full-text

Multimodal Deep Learning and Visible-Light and Hyperspectral Imaging for Fruit Maturity Estimation

Sensors ◽

10.3390/s21041288 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1288

Author(s):

Cinmayii A. Garillos-Manliguez ◽

John Y. Chiang

Keyword(s):

Deep Learning ◽

Visible Light ◽

Hyperspectral Imaging ◽

Morphological Changes ◽

Consumer Preference ◽

Hyperspectral Data ◽

Sensitivity Analyses ◽

Deep Convolutional Neural Networks ◽

Fruit Maturity ◽

Learning Architectures

Fruit maturity is a critical factor in the supply chain, consumer preference, and agriculture industry. Most classification methods on fruit maturity identify only two classes: ripe and unripe, but this paper estimates six maturity stages of papaya fruit. Deep learning architectures have gained respect and brought breakthroughs in unimodal processing. This paper suggests a novel non-destructive and multimodal classification using deep convolutional neural networks that estimate fruit maturity by feature concatenation of data acquired from two imaging modes: visible-light and hyperspectral imaging systems. Morphological changes in the sample fruits can be easily measured with RGB images, while spectral signatures that provide high sensitivity and high correlation with the internal properties of fruits can be extracted from hyperspectral images with wavelength range in between 400 nm and 900 nm—factors that must be considered when building a model. This study further modified the architectures: AlexNet, VGG16, VGG19, ResNet50, ResNeXt50, MobileNet, and MobileNetV2 to utilize multimodal data cubes composed of RGB and hyperspectral data for sensitivity analyses. These multimodal variants can achieve up to 0.90 F1 scores and 1.45% top-2 error rate for the classification of six stages. Overall, taking advantage of multimodal input coupled with powerful deep convolutional neural network models can classify fruit maturity even at refined levels of six stages. This indicates that multimodal deep learning architectures and multimodal imaging have great potential for real-time in-field fruit maturity estimation that can help estimate optimal harvest time and other in-field industrial applications.

Download Full-text

Digit Recognition Applied to Reconstructed Audio Signals Using Deep Learning

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9413183 ◽

2021 ◽

Author(s):

Anastasia-Sotiria Toufa ◽

Constantine Kotropoulos

Keyword(s):

Deep Learning ◽

Audio Signals ◽

Digit Recognition

Download Full-text

SMVNet: Deep Learning Architectures for Accurate and Robust Multi-View Stereopsis

2020 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) ◽

10.1109/aipr50011.2020.9425188 ◽

2020 ◽

Author(s):

Shizeng Yao ◽

Yangyang Wang ◽

Hadi AliAkbarpour ◽

Guna Seetharaman ◽

Raghuveer Rao ◽

...

Keyword(s):

Deep Learning ◽

Learning Architectures

Download Full-text

Classification of Shoulder X-ray Images with Deep Learning Ensemble Models

Applied Sciences ◽

10.3390/app11062723 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2723

Author(s):

Fatih Uysal ◽

Fırat Hardalaç ◽

Ozan Peker ◽

Tolga Tolunay ◽

Nil Tokgöz

Keyword(s):

Deep Learning ◽

Performance Test ◽

The Body ◽

Test Accuracy ◽

Cohen’S Kappa ◽

X Ray ◽

Cohen's Kappa ◽

Auc Value ◽

Magnetic Resonance Imaging Mri ◽

Fully Connected

Fractures occur in the shoulder area, which has a wider range of motion than other joints in the body, for various reasons. To diagnose these fractures, data gathered from X-radiation (X-ray), magnetic resonance imaging (MRI), or computed tomography (CT) are used. This study aims to help physicians by classifying shoulder images taken from X-ray devices as fracture/non-fracture with artificial intelligence. For this purpose, the performances of 26 deep learning-based pre-trained models in the detection of shoulder fractures were evaluated on the musculoskeletal radiographs (MURA) dataset, and two ensemble learning models (EL1 and EL2) were developed. The pre-trained models used are ResNet, ResNeXt, DenseNet, VGG, Inception, MobileNet, and their spinal fully connected (Spinal FC) versions. In the EL1 and EL2 models developed using pre-trained models with the best performance, test accuracy was 0.8455, 0.8472, Cohen’s kappa was 0.6907, 0.6942 and the area that was related with fracture class under the receiver operating characteristic (ROC) curve (AUC) was 0.8862, 0.8695. As a result of 28 different classifications in total, the highest test accuracy and Cohen’s kappa values were obtained in the EL2 model, and the highest AUC value was obtained in the EL1 model.

Download Full-text