A Deep Learning Approach for Complex Microstructure Inference

Author(s):  
Ali Durmaz ◽  
Martin Müller ◽  
Bo Lei ◽  
Akhil Thomas ◽  
Dominik Britz ◽  
...  

Abstract Automated, reliable, and objective microstructure inference from micrographs is an essential milestone towards a comprehensive understanding of process-microstructure-property relations and tailored materials development. However, such inference, with the increasing complexity of microstructures, requires advanced segmentation methodologies. While deep learning (DL), in principle, offers new opportunities for this task, an intuition about the required data quality and quantity and an extensive methodological DL guideline for microstructure quantification and classification are still missing. This, along with a lack of open-access data sets and the seemingly intransparent decision-making process of DL models, hampers its breakthrough in this field. We address all aforementioned obstacles by a multidisciplinary DL approach, devoting equal attention to specimen preparation, contrasting, and imaging. To this end, we train distinct U-Net architectures with 30–50 micrographs of different imaging modalities and corresponding EBSD-informed annotations. On the challenging task of lath-bainite segmentation in complex-phase steel, we achieve accuracies of 90% rivaling expert segmentations. Further, we discuss the impact of image context, pre-training with domain-extrinsic data, and data augmentation. Network visualization techniques demonstrate plausible model decisions based on grain boundary morphology and triple points. As a result, we resolve preconceptions about required data amounts and interpretability to pave the way for DL's day-to-day application for microstructure quantification.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ali Riza Durmaz ◽  
Martin Müller ◽  
Bo Lei ◽  
Akhil Thomas ◽  
Dominik Britz ◽  
...  

Abstract Automated, reliable, and objective microstructure inference from micrographs is essential for a comprehensive understanding of process-microstructure-property relations and tailored materials development. However, such inference, with the increasing complexity of microstructures, requires advanced segmentation methodologies. While deep learning offers new opportunities, an intuition about the required data quality/quantity and a methodological guideline for microstructure quantification are still missing. This, along with deep learning’s seemingly intransparent decision-making process, hampers its breakthrough in this field. We apply a multidisciplinary deep learning approach, devoting equal attention to specimen preparation and imaging, and train distinct U-Net architectures with 30–50 micrographs of different imaging modalities and electron backscatter diffraction-informed annotations. On the challenging task of lath-bainite segmentation in complex-phase steel, we achieve accuracies of 90% rivaling expert segmentations. Further, we discuss the impact of image context, pre-training with domain-extrinsic data, and data augmentation. Network visualization techniques demonstrate plausible model decisions based on grain boundary morphology.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

Abstract This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another only for testing the generality of our models. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier with F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model with F1 score of 75.2% and accuracy of 90.7% compared to F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.
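The abstract reports both accuracy and F1 score, and the two can diverge when one class is rare. The following is a minimal sketch of how both metrics follow from confusion-matrix counts. The function name `accuracy_and_f1` and the counts are invented for illustration and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary classifier from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall are defined as 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical, imbalanced example: 100 tweets, only 10 of them health-related.
acc, f1 = accuracy_and_f1(tp=5, fp=5, fn=5, tn=85)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

With these made-up counts the many true negatives lift accuracy, while F1 reflects only performance on the rare positive class.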


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Abstract Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of artery lumen in computed tomography angiography (CTA) data. However, to perform sufficiently well, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation for artery lumen in CTA images.
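For readers unfamiliar with the Dice coefficient mentioned above, here is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), for two binary masks. The flat-list mask representation, the function name `dice_coefficient`, and the example masks are assumptions for illustration and are not the authors' evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice overlap of two binary masks given as equal-length 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly; define Dice as 1.0 in that case.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks: predicted lumen vs. reference annotation.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
print(round(dice_coefficient(predicted, reference), 3))  # 2*2 / (3+3) -> 0.667
```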


2020 ◽  
Vol 496 (3) ◽  
pp. 3553-3571
Author(s):  
Benjamin E Stahl ◽  
Jorge Martínez-Palomera ◽  
WeiKang Zheng ◽  
Thomas de Jaeger ◽  
Alexei V Filippenko ◽  
...  

ABSTRACT We present deepSIP (deep learning of Supernova Ia Parameters), a software package for measuring the phase and – for the first time using deep learning – the light-curve shape of a Type Ia supernova (SN Ia) from an optical spectrum. At its core, deepSIP consists of three convolutional neural networks trained on a substantial fraction of all publicly available low-redshift SN Ia optical spectra, on to which we have carefully coupled photometrically derived quantities. We describe the accumulation of our spectroscopic and photometric data sets, the cuts taken to ensure quality, and our standardized technique for fitting light curves. These considerations yield a compilation of 2754 spectra with photometrically characterized phases and light-curve shapes. Though such a sample is significant in the SN community, it is small by deep-learning standards where networks routinely have millions or even billions of free parameters. We therefore introduce a data-augmentation strategy that meaningfully increases the size of the subset we allocate for training while prioritizing model robustness and telescope agnosticism. We demonstrate the effectiveness of our models by deploying them on a sample unseen during training and hyperparameter selection, finding that Model I identifies spectra that have a phase between −10 and 18 d and light-curve shape, parametrized by Δm15, between 0.85 and 1.55 mag with an accuracy of 94.6 per cent. For those spectra that do fall within the aforementioned region in phase–Δm15 space, Model II predicts phases with a root-mean-square error (RMSE) of 1.00 d and Model III predicts Δm15 values with an RMSE of 0.068 mag.
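The root-mean-square error quoted for Models II and III is the square root of the mean squared difference between predicted and reference values. A minimal sketch follows. The helper name `rmse` and the sample phase values are illustrative assumptions and are unrelated to the deepSIP package's own API or data.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length numeric sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical phases in days: model predictions vs. photometric references.
phases_pred = [1.0, -2.0, 5.0, 10.0]
phases_true = [0.0, -1.0, 5.0, 12.0]
print(round(rmse(phases_pred, phases_true), 3))  # sqrt(6/4) -> 1.225
```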


Author(s):  
Nafiseh Zeinali ◽  
Karim Faez ◽  
Sahar Seifzadeh

Purpose: A central problem in deep-learning face recognition research is the reliance on small, self-made data sets, which forces researchers to work with duplicated, pre-existing data sets. In this research, we try to resolve this problem and achieve high accuracy. Materials and Methods: The goal of the current study is to identify individual facial expressions in an image or sequence of images, covering ten facial expressions. Given the increasing use of deep learning in recent years, we use convolutional networks and, most importantly, the concept of transfer learning, training our networks from pre-trained networks. Results: One way to improve accuracy when working with small data sets and deep learning is to use pre-trained networks. Because the data set was small, we applied data augmentation techniques and eventually tripled the data size. These techniques include rotation by 10 degrees to the left and right and, finally, elastic transformation. We also applied a deep ResNet network with data augmentation to existing public facial expression data sets. Conclusion: We saw a seven percent increase in accuracy compared to the highest accuracy in previous work on the data set considered.


2020 ◽  
Vol 28 (1) ◽  
pp. 81-96
Author(s):  
José Miguel Buenaposada ◽  
Luis Baumela

In recent years we have witnessed significant progress in the performance of object detection in images. This advance stems from the use of rich discriminative features produced by deep models and the adoption of new training techniques. Although these techniques have been extensively used in the mainstream deep learning-based models, it is still an open issue to analyze their impact in alternative, and computationally more efficient, ensemble-based approaches. In this paper we evaluate the impact of the adoption of data augmentation, bounding box refinement and multi-scale processing in the context of multi-class Boosting-based object detection. In our experiments we show that use of these training advancements significantly improves the object detection performance.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Qingyu Zhao ◽  
Ehsan Adeli ◽  
Kilian M. Pohl

Abstract The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis). Improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from Magnetic Resonance Images (MRIs), identifying morphological sex differences in adolescence from those of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and determining the bone age from X-ray images of children. The results show that our method can accurately predict while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
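The traditional "residualizing" strategy mentioned in the abstract can be sketched as follows: fit a least-squares line of a measurement against a confounder and keep only the residuals. This illustrates the classical baseline only, not the end-to-end method the paper proposes. The function name `residualize` and the age/volume numbers are hypothetical.

```python
def residualize(measurements, confounder):
    """Remove the linear effect of one confounder via ordinary least squares."""
    n = len(measurements)
    if n != len(confounder) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(confounder) / n
    mean_y = sum(measurements) / n
    var_x = sum((x - mean_x) ** 2 for x in confounder)
    if var_x == 0:
        raise ValueError("confounder must vary across samples")
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(confounder, measurements))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    # What remains after subtracting the fitted line is uncorrelated with x.
    return [y - (intercept + slope * x)
            for x, y in zip(confounder, measurements)]


# Hypothetical data: a measurement that grows linearly with age, plus noise.
ages = [10.0, 12.0, 14.0, 16.0]
noise = [0.5, -0.5, -0.5, 0.5]
volumes = [2.0 * a + 1.0 + e for a, e in zip(ages, noise)]
residuals = residualize(volumes, ages)
print([round(r, 6) for r in residuals])
```

In this constructed example the noise has zero mean and is uncorrelated with age, so the residuals recover it.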


Author(s):  
Marcel Bengs ◽  
Finn Behrendt ◽  
Julia Krüger ◽  
Roland Opfer ◽  
Alexander Schlaefer

Abstract Purpose Brain Magnetic Resonance Images (MRIs) are essential for the diagnosis of neurological diseases. Recently, deep learning methods for unsupervised anomaly detection (UAD) have been proposed for the analysis of brain MRI. These methods rely on healthy brain MRIs and eliminate the requirement of pixel-wise annotated data compared to supervised deep learning. While a wide range of methods for UAD have been proposed, these methods are mostly 2D and only learn from MRI slices, disregarding that brain lesions are inherently 3D, so the spatial context of MRI volumes remains unexploited. Methods We investigate whether increased spatial context, obtained by using MRI volumes combined with spatial erasing, leads to improved unsupervised anomaly segmentation performance compared to learning from slices. We evaluate and compare 2D variational autoencoders (VAEs) to their 3D counterparts, propose 3D input erasing, and systematically study the impact of the data set size on the performance. Results Using two publicly available segmentation data sets for evaluation, 3D VAEs outperform their 2D counterparts, highlighting the advantage of volumetric context. Also, our 3D erasing methods allow for further performance improvements. Our best-performing 3D VAE with input erasing leads to an average DICE score of 31.40% compared to 25.76% for the 2D VAE. Conclusions We propose 3D deep learning methods for UAD in brain MRI combined with 3D erasing and demonstrate that 3D methods clearly outperform their 2D counterparts for anomaly segmentation. Also, our spatial erasing method allows for further performance improvements and reduces the requirement for large data sets.


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1497
Author(s):  
Harold Achicanoy ◽  
Deisy Chaves ◽  
Maria Trujillo

Deep learning applications in computer vision involve the use of large-volume and representative data to obtain state-of-the-art results due to the massive number of parameters to optimise in deep models. However, data are limited with asymmetric distributions in industrial applications due to rare cases, legal restrictions, and high image-acquisition costs. Data augmentation based on deep learning generative adversarial networks, such as StyleGAN, has arisen as a way to create training data with symmetric distributions that may improve the generalisation capability of built models. StyleGAN generates highly realistic images in a variety of domains as a data augmentation strategy but requires a large amount of data to build image generators. Thus, transfer learning in conjunction with generative models is used to build models with small datasets. However, there are no reports on the impact of pre-trained generative models using transfer learning. In this paper, we evaluate a StyleGAN generative model with transfer learning on different application domains (training with paintings, portraits, Pokémon, bedrooms, and cats) to generate target images with different levels of content variability: bean seeds (low variability), faces of subjects between 5 and 19 years old (medium variability), and charcoal (high variability). We used the first version of StyleGAN due to the large number of publicly available pre-trained models. The Fréchet Inception Distance was used for evaluating the quality of synthetic images. We found that StyleGAN with transfer learning produced good quality images, making it an alternative for generating realistic synthetic images in the evaluated domains.


2021 ◽  
pp. 1-11
Author(s):  
Sunil Rao ◽  
Vivek Narayanaswamy ◽  
Michael Esposito ◽  
Jayaraman J. Thiagarajan ◽  
Andreas Spanias

Reliable and rapid non-invasive testing has become essential for COVID-19 diagnosis and tracking statistics. Recent studies motivate the use of modern machine learning (ML) and deep learning (DL) tools that utilize features of coughing sounds for COVID-19 diagnosis. In this paper, we describe system designs that we developed for COVID-19 cough detection with the long-term objective of embedding them in a testing device. More specifically, we use log-mel spectrogram features extracted from the coughing audio signal and design a series of customized deep learning algorithms to develop fast and automated diagnosis tools for COVID-19 detection. We first explore the use of a deep neural network with fully connected layers. Additionally, we investigate prospects of efficient implementation by examining the impact on the detection performance by pruning the fully connected neural network based on the Lottery Ticket Hypothesis (LTH) optimization process. In general, pruned neural networks have been shown to provide similar performance gains to that of unpruned networks with reduced computational complexity in a variety of signal processing applications. Finally, we investigate the use of convolutional neural network architectures and in particular the VGG-13 architecture which we tune specifically for this application. Our results show that a unique ensembling of the VGG-13 architecture trained using a combination of binary cross entropy and focal losses with data augmentation significantly outperforms the fully connected networks and other recently proposed baselines on the DiCOVA 2021 COVID-19 cough audio dataset. Our customized VGG-13 model achieves an average validation AUROC of 82.23% and a test AUROC of 78.3% at a sensitivity of 80.49%.
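The abstract mentions training with a combination of binary cross entropy and focal losses but does not say how the two are combined. The sketch below assumes a simple weighted sum and the common focal-loss form -(1 - p_t)^γ log(p_t). The `weight` and `gamma` values and all function names are illustrative assumptions, not the authors' settings.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """BCE for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, gamma=2.0, eps=1e-12):
    """Focal loss: down-weights examples the model already classifies well."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def combined_loss(p, y, weight=0.5, gamma=2.0):
    """Assumed combination: weight * BCE + (1 - weight) * focal."""
    return (weight * binary_cross_entropy(p, y)
            + (1.0 - weight) * focal_loss(p, y, gamma))


# A confident correct prediction contributes far less focal loss than BCE.
print(binary_cross_entropy(0.9, 1), focal_loss(0.9, 1), combined_loss(0.9, 1))
```

With `gamma=0` the focal term reduces to plain cross entropy, which is a quick sanity check on the implementation.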

