A Comparison of Deep Learning Architectures for Automatic Gender Recognition from Audio Signals

2021 ◽  
Author(s):  
Alef Iury S. Ferreira ◽  
Frederico S. Oliveira ◽  
Nádia F. Felipe da Silva ◽  
Anderson S. Soares

Gender recognition from speech is a problem related to the analysis of human speech, with applications ranging from personalized product recommendation to forensic science. Identifying the efficiency and costs of the different approaches to this problem is essential. This work investigates and compares the efficiency and costs of different deep learning architectures for gender recognition from speech. The results show that the one-dimensional convolutional model achieves the best performance. However, the fully connected model was found to reach comparable results at a lower cost, both in memory usage and in training time.
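As a rough illustration of the one-dimensional convolutional approach the abstract refers to, the sketch below shows a minimal PyTorch classifier over per-frame audio features. The feature type (MFCCs), layer sizes, and sequence length are assumptions for illustration, not the authors' configuration.

```python
import torch
import torch.nn as nn

class Conv1DGenderClassifier(nn.Module):
    """Minimal 1D-convolutional classifier over per-frame audio features
    (e.g. MFCCs); layer sizes are illustrative, not the paper's."""
    def __init__(self, n_features=40, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis
        )
        self.fc = nn.Linear(128, n_classes)

    def forward(self, x):              # x: (batch, n_features, time)
        return self.fc(self.conv(x).squeeze(-1))

model = Conv1DGenderClassifier()
logits = model(torch.randn(8, 40, 200))   # 8 clips, 40 MFCCs, 200 frames
```

A fully connected baseline of the kind the abstract compares against would simply flatten the feature matrix and pass it through two or three Linear layers, trading some accuracy for lower memory and training time.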

2019 ◽  
Vol 26 (11) ◽  
pp. 1181-1188 ◽  
Author(s):  
Isabel Segura-Bedmar ◽  
Pablo Raez

Abstract. Objective: The goal of the 2018 n2c2 shared task on cohort selection for clinical trials (track 1) is to identify which patients meet the selection criteria for clinical trials. Cohort selection is a particularly demanding task to which natural language processing and deep learning can make a valuable contribution. Our goal is to evaluate several deep learning architectures on this task. Materials and Methods: Cohort selection can be formulated as a multilabel classification problem whose goal is to determine which criteria are met for each patient record. We explore several deep learning architectures: a simple convolutional neural network (CNN), a deep CNN, a recurrent neural network (RNN), and a hybrid CNN-RNN architecture. Although our architectures are similar to those proposed in existing deep learning systems for text classification, our research also studies the impact of adding a fully connected feedforward layer on the performance of these architectures. Results: The RNN and hybrid models provide the best results, though without statistical significance. The fully connected feedforward layer improves the results for all architectures except the hybrid one. Conclusions: Despite the limited size of the dataset, deep learning methods show promising results in learning useful features for cohort selection. They could therefore serve as a preliminary filter for cohort selection in any clinical trial with minimal human intervention, significantly reducing the cost and time of clinical trials.
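To make the multilabel formulation concrete, here is a minimal PyTorch sketch of a CNN text classifier with one sigmoid output per criterion and an optional fully connected layer of the kind whose impact the paper studies. Vocabulary size, embedding dimension, and the criterion count are placeholders, not the authors' settings.

```python
import torch
import torch.nn as nn

class MultilabelTextCNN(nn.Module):
    """Sketch of a CNN for multilabel criterion prediction over patient
    records; dimensions and the criterion count are placeholders."""
    def __init__(self, vocab_size=20000, emb_dim=100, n_criteria=13, use_fc=True):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, 128, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        # the optional fully connected feedforward layer studied in the paper
        self.fc = nn.Sequential(nn.Linear(128, 64), nn.ReLU()) if use_fc else nn.Identity()
        self.out = nn.Linear(64 if use_fc else 128, n_criteria)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb_dim, seq_len)
        x = self.pool(torch.relu(self.conv(x))).squeeze(-1)
        return torch.sigmoid(self.out(self.fc(x)))  # one probability per criterion

model = MultilabelTextCNN()
probs = model(torch.randint(0, 20000, (2, 500)))   # 2 records, 500 tokens each
```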


2019 ◽  
Vol 2 (1) ◽  
pp. 182-186
Author(s):  
Santosh Giri

Deep learning is one of the essential parts of machine learning. Applications such as image classification, text recognition, and object detection use deep learning architectures. In this paper, a neural network model was designed for image classification. A NN classifier with one fully connected layer and one softmax layer was designed, and the feature extraction part of the Inception v3 model was reused to compute the feature values of each image. These feature values were then used to train the NN classifier. By adopting a transfer learning mechanism, the NN classifier was trained on the 17 classes of the Oxford 17 flower image dataset. The system reached a final training accuracy of 99%. After training, the system was evaluated on the testing dataset images, with a mean testing accuracy of 86.4%.
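A minimal sketch of this transfer-learning setup in tf.keras is shown below: Inception v3 is frozen as a feature extractor and a small fully connected + softmax head is trained on the 17 flower classes. The input size, hidden width, optimizer, and dataset objects are assumptions, not the author's exact configuration.

```python
import tensorflow as tf

# Inception v3 as a frozen feature extractor; only the small head is trained.
base = tf.keras.applications.InceptionV3(include_top=False, weights="imagenet",
                                          pooling="avg", input_shape=(299, 299, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1024, activation="relu"),   # fully connected layer
    tf.keras.layers.Dense(17, activation="softmax"),  # one unit per flower class
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # train_ds/val_ds are placeholders
```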


2018 ◽  
Author(s):  
Andrea Bizzego ◽  
Nicole Bussola ◽  
Marco Chierici ◽  
Marco Cristoforetti ◽  
Margherita Francescatto ◽  
...  

Abstract: Artificial Intelligence is exponentially increasing its impact on healthcare. As deep learning masters computer vision tasks, its application to digital pathology is natural, with the promise of aiding routine reporting and standardizing results across trials. Deep learning features inferred from digital pathology scans can improve the validity and robustness of current clinico-pathological features, up to identifying novel histological patterns, e.g. from tumor infiltrating lymphocytes. In this study, we examine the issue of evaluating the accuracy of predictive models built on deep learning features in digital pathology, as a hallmark of reproducibility. We introduce the DAPPER framework for validation, based on a rigorous Data Analysis Plan derived from the FDA's MAQC project, designed to analyse causes of variability in predictive biomarkers. We apply the framework to models that identify tissue of origin on 787 Whole Slide Images from the Genotype-Tissue Expression (GTEx) project. We test three deep learning architectures (VGG, ResNet, Inception) as feature extractors and three classifiers (a fully connected multilayer network, Support Vector Machines, and Random Forests), and work with four datasets (5, 10, 20 or 30 classes), for a total of 53000 tiles at 512 × 512 resolution. We analyze accuracy and feature stability of the machine learning classifiers, also demonstrating the need for random-features and random-labels diagnostic tests to identify selection bias and risks for reproducibility. Further, we use the deep features from the VGG model trained on GTEx on the KIMIA24 dataset for identification of slide of origin (24 classes), training a classifier on 1060 annotated tiles and validating it on 265 unseen ones. The DAPPER software, including its deep learning backbone pipeline and the HINT (Histological Imaging - Newsy Tiles) benchmark dataset derived from GTEx, is released as a basis for standardization and validation initiatives in AI for Digital Pathology. Author summary: In this study, we examine the issue of evaluating the accuracy of predictive models built on deep learning features in digital pathology, as a hallmark of reproducibility. It is indeed a top priority that reproducibility-by-design is adopted as standard practice in building and validating AI methods in the healthcare domain. Here we introduce DAPPER, a first framework to evaluate deep features and classifiers in digital pathology, based on a rigorous data analysis plan originally developed in the FDA's MAQC initiative for predictive biomarkers from massive omics data. We apply DAPPER to models trained to identify tissue of origin from the HINT benchmark dataset of 53000 tiles from 787 Whole Slide Images in the Genotype-Tissue Expression (GTEx) project. We analyze accuracy and feature stability of different deep learning architectures (VGG, ResNet and Inception) as feature extractors and classifiers (a fully connected multilayer network, SVMs and Random Forests) on up to 20 classes. Further, we use the deep features from the VGG model (trained on HINT) on the 1300 annotated tiles of the KIMIA24 dataset for identification of slide of origin (24 classes). The DAPPER software is available together with the HINT benchmark dataset.
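The "deep features plus shallow classifier" pattern evaluated by DAPPER can be sketched as below: a pretrained VGG backbone produces one feature vector per tile and an SVM or Random Forest is trained on top. The tile tensors, labels, and resize choice are placeholders; this is not the DAPPER pipeline itself.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Pretrained VGG-16 used purely as a feature extractor.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.classifier = torch.nn.Identity()   # keep pooled convolutional features only
vgg.eval()

tiles = torch.randn(32, 3, 224, 224)   # stand-in for histology tiles resized to 224x224
with torch.no_grad():
    features = vgg(tiles).numpy()      # (32, 25088) feature matrix

labels = [i % 5 for i in range(32)]    # stand-in tissue-of-origin labels (5 classes)
svm = SVC().fit(features, labels)
rf = RandomForestClassifier(n_estimators=100).fit(features, labels)
```

Swapping the backbone for ResNet or Inception, or the classifier for a fully connected multilayer network, reproduces the combinations the study compares.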


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1031
Author(s):  
Joseba Gorospe ◽  
Rubén Mulero ◽  
Olatz Arbelaitz ◽  
Javier Muguerza ◽  
Miguel Ángel Antón

Deep learning techniques are being increasingly used in the scientific community as a consequence of the high computational capacity of current systems and the increase in the amount of data available as a result of the digitalisation of society in general and the industrial world in particular. In addition, the emergence of edge computing, which focuses on integrating artificial intelligence as close as possible to the client, makes it possible to implement systems that act in real time without the need to transfer all of the data to centralised servers. The combination of these two concepts can lead to systems with the capacity to make correct decisions and act on them immediately and in situ. Despite this, the low capacity of embedded systems greatly hinders this integration, so the ability to run such models on a wide range of micro-controllers would be a great advantage. This paper contributes an environment based on Mbed OS and TensorFlow Lite that can be embedded in any general-purpose embedded system, allowing the introduction of deep learning architectures. The experiments herein show that the proposed system is competitive with other commercial systems.
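A typical deployment path for such an environment, sketched below under assumed model shapes and filenames, is to train a small Keras model, convert it to a TensorFlow Lite flatbuffer with post-training quantization, and embed it as a C array that the Mbed OS firmware compiles in and runs with the TensorFlow Lite Micro interpreter.

```python
import tensorflow as tf

# Small placeholder model; the architecture and input shape are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Convert to a TensorFlow Lite flatbuffer with post-training quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
# On the host: `xxd -i model.tflite > model_data.cc`, then link model_data.cc
# into the Mbed OS project and invoke it through the TFLite Micro interpreter.
```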


2021 ◽  
Vol 11 (12) ◽  
pp. 5488
Author(s):  
Wei Ping Hsia ◽  
Siu Lun Tse ◽  
Chia Jen Chang ◽  
Yu Len Huang

The purpose of this article is to evaluate the accuracy of optical coherence tomography (OCT) measurement of choroidal thickness in healthy eyes using a deep-learning method with the Mask R-CNN model. Thirty EDI-OCT images from thirty patients were enrolled. A mask region-based convolutional neural network (Mask R-CNN) model composed of a deep residual network (ResNet) and feature pyramid networks (FPNs), with standard convolution and fully connected heads for mask and box prediction, respectively, was used to automatically delineate the choroid layer. The average choroidal thickness and subfoveal choroidal thickness were measured. The results of this study compared the ResNet 50 layers deep (R50) and ResNet 101 layers deep (R101) models and their combinations: the R101 ∪ R50 (OR model) demonstrated the best accuracy, with average errors of 4.85 pixels and 4.86 pixels, respectively, while the R101 ∩ R50 (AND model) took the least time, with an average execution time of 4.6 s. The Mask R-CNN models showed a good prediction rate for the choroid layer, with accuracy rates of 90% and 89.9% for average choroidal thickness and average subfoveal choroidal thickness, respectively. In conclusion, the deep-learning method using the Mask R-CNN model provides a faster and accurate measurement of choroidal thickness. Compared with manual delineation, it provides better effectiveness, which is feasible for clinical application and larger-scale research on the choroid.
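A minimal sketch of the general setup described above, using torchvision's Mask R-CNN with a ResNet-50 + FPN backbone re-headed for a single "choroid" class, is shown below. This follows the standard torchvision fine-tuning pattern and is not the paper's exact configuration; the input size and class names are assumptions.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + choroid
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads so they predict the choroid class.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)
in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)

model.eval()
with torch.no_grad():
    pred = model([torch.rand(3, 512, 512)])[0]   # one OCT-like image as input
print(pred["masks"].shape)                        # per-instance choroid masks
```

Choroidal thickness can then be derived from the predicted mask, e.g. by counting mask pixels per column and converting to micrometres with the scan's axial resolution.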


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1288
Author(s):  
Cinmayii A. Garillos-Manliguez ◽  
John Y. Chiang

Fruit maturity is a critical factor in the supply chain, consumer preference, and the agriculture industry. Most classification methods for fruit maturity identify only two classes, ripe and unripe, but this paper estimates six maturity stages of papaya fruit. Deep learning architectures have gained respect and brought breakthroughs in unimodal processing. This paper proposes a novel non-destructive and multimodal classification approach using deep convolutional neural networks that estimates fruit maturity by feature concatenation of data acquired from two imaging modes: visible-light and hyperspectral imaging systems. Morphological changes in the sample fruits can be easily measured with RGB images, while spectral signatures that provide high sensitivity and high correlation with the internal properties of fruits can be extracted from hyperspectral images in the 400 nm to 900 nm wavelength range; both are factors that must be considered when building a model. This study further modified the AlexNet, VGG16, VGG19, ResNet50, ResNeXt50, MobileNet, and MobileNetV2 architectures to utilize multimodal data cubes composed of RGB and hyperspectral data for sensitivity analyses. These multimodal variants can achieve up to 0.90 F1 score and a 1.45% top-2 error rate for the classification of six stages. Overall, taking advantage of multimodal input coupled with powerful deep convolutional neural network models can classify fruit maturity even at the refined level of six stages. This indicates that multimodal deep learning architectures and multimodal imaging have great potential for real-time in-field fruit maturity estimation that can help determine optimal harvest time and support other in-field industrial applications.
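The feature-concatenation idea can be sketched as a two-branch network: one CNN branch for the RGB image, one for the hyperspectral cube, with their pooled features concatenated before the classifier. The band count, layer sizes, and input resolution below are assumptions, not the paper's modified architectures.

```python
import torch
import torch.nn as nn

class MultimodalMaturityNet(nn.Module):
    """Two-branch sketch of multimodal feature concatenation for six
    maturity stages; dimensions are illustrative placeholders."""
    def __init__(self, n_bands=100, n_stages=6):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.rgb_branch = branch(3)        # visible-light images
        self.hsi_branch = branch(n_bands)  # hyperspectral cube (e.g. 400-900 nm bands)
        self.classifier = nn.Linear(64 + 64, n_stages)  # feature concatenation

    def forward(self, rgb, hsi):
        feats = torch.cat([self.rgb_branch(rgb), self.hsi_branch(hsi)], dim=1)
        return self.classifier(feats)

model = MultimodalMaturityNet()
logits = model(torch.randn(4, 3, 128, 128), torch.randn(4, 100, 128, 128))
```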


Author(s):  
Shizeng Yao ◽  
Yangyang Wang ◽  
Hadi AliAkbarpour ◽  
Guna Seetharaman ◽  
Raghuveer Rao ◽  
...  

2021 ◽  
Vol 11 (6) ◽  
pp. 2723
Author(s):  
Fatih Uysal ◽  
Fırat Hardalaç ◽  
Ozan Peker ◽  
Tolga Tolunay ◽  
Nil Tokgöz

For various reasons, fractures occur in the shoulder area, which has a wider range of motion than other joints in the body. To diagnose these fractures, data gathered from X-radiation (X-ray), magnetic resonance imaging (MRI), or computed tomography (CT) are used. This study aims to help physicians by classifying shoulder images taken from X-ray devices as fracture/non-fracture with artificial intelligence. For this purpose, the performance of 26 deep learning-based pre-trained models in the detection of shoulder fractures was evaluated on the musculoskeletal radiographs (MURA) dataset, and two ensemble learning models (EL1 and EL2) were developed. The pre-trained models used are ResNet, ResNeXt, DenseNet, VGG, Inception, MobileNet, and their spinal fully connected (Spinal FC) versions. In the EL1 and EL2 models developed from the best-performing pre-trained models, the test accuracy was 0.8455 and 0.8472, Cohen's kappa was 0.6907 and 0.6942, and the area under the receiver operating characteristic (ROC) curve (AUC) for the fracture class was 0.8862 and 0.8695, respectively. As a result of 28 different classifications in total, the highest test accuracy and Cohen's kappa values were obtained with the EL2 model, and the highest AUC value was obtained with the EL1 model.
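Below is a minimal sketch of a probability-averaging ensemble over two pretrained backbones re-headed for binary fracture/non-fracture output. The choice of backbones and the averaging rule are illustrative assumptions; the paper's EL1 and EL2 combine its own best-performing models.

```python
import torch
import torchvision.models as models

def binary_head(backbone, head_attr="fc"):
    # Replace the final classification layer with a 2-class (fracture/non-fracture) head.
    in_f = getattr(backbone, head_attr).in_features
    setattr(backbone, head_attr, torch.nn.Linear(in_f, 2))
    return backbone

resnet = binary_head(models.resnet50(weights="DEFAULT"))
densenet = models.densenet121(weights="DEFAULT")
densenet.classifier = torch.nn.Linear(densenet.classifier.in_features, 2)
resnet.eval()
densenet.eval()

def ensemble_predict(x):
    # Average the softmax probabilities of the two member models.
    with torch.no_grad():
        p1 = torch.softmax(resnet(x), dim=1)
        p2 = torch.softmax(densenet(x), dim=1)
    return (p1 + p2) / 2

probs = ensemble_predict(torch.randn(1, 3, 224, 224))   # [P(non-fracture), P(fracture)]
```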

