On the objectivity, reliability, and validity of deep learning enabled bioimage analyses

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Dennis Segebarth ◽  
Matthias Griebel ◽  
Nikolai Stein ◽  
Cora R von Collenberg ◽  
Corinna Martin ◽  
...  

Bioimage analysis of fluorescent labels is widely used in the life sciences. Recent advances in deep learning (DL) allow automating time-consuming manual image analysis processes based on annotated training data. However, manual annotation of fluorescent features with a low signal-to-noise ratio is somewhat subjective. Training DL models on subjective annotations may be unstable or yield biased models. In turn, these models may be unable to reliably detect biological effects. An analysis pipeline integrating data annotation, ground truth estimation, and model training can mitigate this risk. To evaluate this integrated process, we compared different DL-based analysis approaches. With data from two model organisms (mice, zebrafish) and five laboratories, we show that ground truth estimation from multiple human annotators helps to establish objectivity in fluorescent feature annotations. Furthermore, ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible DL-based bioimage analyses.
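The two key ingredients described here, ground-truth estimation from multiple annotators and model ensembling, can be sketched in miniature. This is a hypothetical illustration, not the authors' pipeline (which uses more sophisticated annotation fusion than a plain majority vote):

```python
# Minimal sketch: estimate a binary ground-truth mask by pixel-wise
# majority vote over several human annotations, and form an ensemble
# prediction by averaging per-model probability maps. Masks and maps
# are flat lists of pixels here for simplicity.

def majority_vote(annotations):
    """Pixel-wise majority vote over binary masks from several annotators."""
    n = len(annotations)
    return [1 if 2 * sum(pixel) > n else 0 for pixel in zip(*annotations)]

def ensemble_predict(prob_maps, threshold=0.5):
    """Average the models' probability maps, then threshold the mean."""
    n = len(prob_maps)
    return [1 if sum(pixel) / n >= threshold else 0 for pixel in zip(*prob_maps)]
```

A pixel marked by two of three raters survives the vote, while a feature that only a single model detects with low confidence is suppressed by the ensemble average.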

2018 ◽  
Author(s):  
Dennis Segebarth ◽  
Matthias Griebel ◽  
Nikolai Stein ◽  
Cora R. von Collenberg ◽  
Corinna Martin ◽  
...  

Abstract Fluorescent labeling of biomolecules is widely used for bioimage analyses throughout the life sciences. Recent advances in deep learning (DL) have opened new possibilities to scale the image analysis processes through automation. However, the annotation of fluorescent features with a low signal-to-noise ratio is frequently based on subjective criteria. Training on subjective annotations may ultimately lead to biased DL models yielding irreproducible results. An end-to-end analysis process that integrates data annotation, ground truth estimation, and model training can mitigate this risk. To highlight the importance of this integrated process, we compare different DL-based analysis approaches. Based on data from different laboratories, we show that ground truth estimation from multiple human annotators is indispensable to establish objectivity in fluorescent feature annotations. We demonstrate that ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible and transparent bioimage analyses using DL methods.


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Xinyang Li ◽  
Guoxun Zhang ◽  
Hui Qiao ◽  
Feng Bao ◽  
Yue Deng ◽  
...  

Abstract The development of deep learning and open access to a substantial collection of imaging data together provide a potential solution for computational image transformation, which is gradually changing the landscape of optical imaging and biomedical research. However, current implementations of deep learning usually operate in a supervised manner, and their reliance on laborious and error-prone data annotation procedures remains a barrier to more general applicability. Here, we propose an unsupervised image transformation to facilitate the utilization of deep learning for optical microscopy, even in some cases in which supervised models cannot be applied. Through the introduction of a saliency constraint, the unsupervised model, named Unsupervised content-preserving Transformation for Optical Microscopy (UTOM), can learn the mapping between two image domains without requiring paired training data while avoiding distortions of the image content. UTOM shows promising performance in a wide range of biomedical image transformation tasks, including in silico histological staining, fluorescence image restoration, and virtual fluorescence labeling. Quantitative evaluations reveal that UTOM achieves stable and high-fidelity image transformations across different imaging conditions and modalities. We anticipate that our framework will encourage a paradigm shift in training neural networks and enable more applications of artificial intelligence in biomedical imaging.
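UTOM's saliency constraint is specific to this paper, but the underlying unpaired-translation idea can be illustrated with the generic cycle-consistency criterion: mapping an image to the other domain and back should reproduce the input. A toy sketch with linear stand-ins for the two networks (hypothetical functions, not the authors' model):

```python
# Toy illustration of cycle consistency for unpaired image translation:
# the composition backward(forward(x)) should recover x. The two lambdas
# below are stand-ins for the forward and backward neural networks.

def cycle_loss(forward, backward, images):
    """Mean absolute reconstruction error |backward(forward(x)) - x|."""
    total, count = 0.0, 0
    for img in images:
        recon = backward(forward(img))
        total += sum(abs(r - x) for r, x in zip(recon, img))
        count += len(img)
    return total / count

# Stand-in "networks": a gain/offset transform and its exact inverse.
f = lambda img: [2.0 * p + 1.0 for p in img]
g = lambda img: [(p - 1.0) / 2.0 for p in img]
```

With an exact inverse the loss is zero; any mismatch between the two mappings drives the loss up, which is the training signal unpaired models exploit in place of pixel-wise supervision.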


2019 ◽  
Vol 38 (11) ◽  
pp. 872a1-872a9 ◽  
Author(s):  
Mauricio Araya-Polo ◽  
Stuart Farris ◽  
Manuel Florez

Exploration seismic data are heavily manipulated before human interpreters are able to extract meaningful information regarding subsurface structures. This manipulation adds modeling and human biases and is limited by methodological shortcomings. Alternatively, using seismic data directly is becoming possible thanks to deep learning (DL) techniques. A DL-based workflow is introduced that uses analog velocity models and realistic raw seismic waveforms as input and produces subsurface velocity models as output. When insufficient data are used for training, DL algorithms tend to overfit or fail. Gathering large amounts of labeled and standardized seismic data sets is not straightforward. This shortage of quality data is addressed by building a generative adversarial network (GAN) to augment the original training data set, which is then used by DL-driven seismic tomography as input. The DL tomographic operator predicts velocity models with high statistical and structural accuracy after being trained with GAN-generated velocity models. Beyond the field of exploration geophysics, the use of machine learning in earth science is challenged by the lack of labeled data or properly interpreted ground truth, since we seldom know what truly exists beneath the earth's surface. The unsupervised approach (using GANs to generate labeled data) illustrates a way to mitigate this problem and opens geology, geophysics, and planetary sciences to more DL applications.
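The paper trains a GAN for the augmentation step; purely as a hypothetical stand-in for a trained generator, the sketch below draws random layered velocity models (velocity increasing with depth), which is how synthetic velocity-model augmentation is often seeded. All names and parameter ranges here are illustrative, not from the paper:

```python
import random

# Toy generator of synthetic 1-D layered velocity models: random layer
# boundaries, with layer velocity increasing monotonically with depth.
# A GAN would learn such a generator from data; this hand-written one
# merely illustrates the data-augmentation role it plays.

def synthetic_velocity_model(depth_samples=50, n_layers=5, v0=1500.0,
                             rng=random):
    boundaries = sorted(rng.sample(range(1, depth_samples), n_layers - 1))
    boundaries = boundaries + [depth_samples]
    model, v, start = [], v0, 0
    for end in boundaries:
        model.extend([v] * (end - start))  # constant velocity per layer
        v += rng.uniform(100.0, 800.0)     # jump up at each boundary
        start = end
    return model
```

Each call yields a fresh labeled training example (the velocity model itself is the label; the corresponding waveform would be simulated from it).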


2020 ◽  
Vol 36 (12) ◽  
pp. 3863-3870
Author(s):  
Mischa Schwendy ◽  
Ronald E Unger ◽  
Sapun H Parekh

Abstract Motivation: Deep learning use for quantitative image analysis is exponentially increasing. However, training accurate, widely deployable deep learning algorithms requires a plethora of annotated (ground truth) data. Image collections must contain not only thousands of images to provide sufficient example objects (i.e. cells), but also an adequate degree of image heterogeneity. Results: We present a new dataset, EVICAN—Expert visual cell annotation, comprising partially annotated grayscale images of 30 different cell lines from multiple microscopes, contrast mechanisms and magnifications that is readily usable as training data for computer vision applications. With 4600 images and ∼26 000 segmented cells, our collection offers an unparalleled heterogeneous training dataset for cell biology deep learning application development. Availability and implementation: The dataset is freely available (https://edmond.mpdl.mpg.de/imeji/collection/l45s16atmi6Aa4sI?q=). Using a Mask R-CNN implementation, we demonstrate automated segmentation of cells and nuclei from brightfield images with a mean average precision of 61.6% at a Jaccard Index above 0.5.
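The Jaccard Index used as the matching criterion here (a predicted cell counts as detected when its overlap with the annotation exceeds 0.5) is simple to state in code; a minimal sketch for flat binary masks:

```python
def jaccard_index(mask_a, mask_b):
    """Intersection over union of two binary masks (flat lists of 0/1)."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 1.0  # two empty masks match trivially
```

Mean average precision is then computed over detections, with this index deciding which predicted cells match annotated ones.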


Microscopy ◽  
2021 ◽  
Author(s):  
Kohki Konishi ◽  
Takao Nonaka ◽  
Shunsuke Takei ◽  
Keisuke Ohta ◽  
Hideo Nishioka ◽  
...  

Abstract Three-dimensional (3D) observation of a biological sample using serial-section electron microscopy is widely used. However, organelle segmentation requires a significant amount of manual time. Therefore, several studies have been conducted to improve its efficiency. One such promising method is 3D deep learning (DL), which is highly accurate. However, the creation of training data for 3D DL still requires manual time and effort. In this study, we developed a highly efficient integrated image segmentation tool that includes stepwise DL with manual correction. The tool has four functions: efficient tracers for annotation, model training/inference for organelle segmentation using a lightweight convolutional neural network, efficient proofreading, and model refinement. We applied this tool to increase the training data step by step (stepwise annotation method) to segment the mitochondria in the cells of the cerebral cortex. We found that the stepwise annotation method reduced the manual operation time by one-third compared with that of the fully manual method, where all the training data were created manually. Moreover, we demonstrated that the F1 score, the metric of segmentation accuracy, was 0.9 by training the 3D DL model with these training data. The stepwise annotation method using this tool and the 3D DL model improved the segmentation efficiency for various organelles.
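The F1 score reported here is the harmonic mean of precision and recall over segmented voxels (or objects); a minimal sketch from confusion counts:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

An F1 of 0.9, as achieved by the stepwise-trained model, corresponds for example to 90% precision at 90% recall.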


2020 ◽  
Author(s):  
Wim Wiegerinck

Deep learning is a modeling approach that has shown impressive results in image processing and is arguably a promising tool for dealing with spatially extended complex systems such as the earth's atmosphere with its visually interpretable patterns. A disadvantage of the neural network approach is that it typically requires an enormous amount of training data.

Another recently proposed modeling approach is supermodeling. In supermodeling it is assumed that a dynamical system, the truth, is modelled by a set of good but imperfect models. The idea is to improve model performance by dynamically combining imperfect models during the simulation. The resulting combination of models is called the supermodel. The combination strength has to be learned from data. However, since supermodels do not start from scratch, but make use of existing domain knowledge, they may learn from less data.

One of the ways to combine models is to define the tendencies of the supermodel as linear (weighted) combinations of the imperfect model tendencies. Several methods including linear regression have been proposed to optimize the weights. However, the combination method might also be nonlinear. In this work we propose and explore a novel combination of deep learning and supermodeling, in which convolutional neural networks are used as a tool to combine the predictions of the imperfect models. The different supermodeling strategies are applied in simulations in a controlled environment with a three-level, quasi-geostrophic spectral model that serves as ground truth and perturbed models that serve as the imperfect models.
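The linear weighted-tendency combination can be made concrete: with two imperfect models, the least-squares weights follow from 2×2 normal equations. A toy sketch under that assumption (function names are illustrative; the paper's nonlinear variant replaces this with a convolutional network):

```python
# Minimal sketch of linear supermodeling: the supermodel tendency is a
# weighted sum of two imperfect-model tendencies, with weights fitted
# to observed true tendencies by least squares (2x2 normal equations).

def fit_weights(tend_a, tend_b, tend_true):
    aa = sum(a * a for a in tend_a)
    bb = sum(b * b for b in tend_b)
    ab = sum(a * b for a, b in zip(tend_a, tend_b))
    ay = sum(a * y for a, y in zip(tend_a, tend_true))
    by = sum(b * y for b, y in zip(tend_b, tend_true))
    det = aa * bb - ab * ab        # assumes the two tendencies are independent
    wa = (ay * bb - by * ab) / det
    wb = (by * aa - ay * ab) / det
    return wa, wb

def supermodel_tendency(wa, wb, tend_a, tend_b):
    return [wa * a + wb * b for a, b in zip(tend_a, tend_b)]
```

When the truth really is a fixed linear blend of the two models, the fitted weights recover that blend exactly; in practice they give the best linear approximation at each state.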


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Avantika Lal ◽  
Zachary D. Chiang ◽  
Nikolai Yakovenko ◽  
Fabiana M. Duarte ◽  
Johnny Israeli ◽  
...  

Abstract ATAC-seq is a widely applied assay used to measure genome-wide chromatin accessibility; however, its ability to detect active regulatory regions can depend on the depth of sequencing coverage and the signal-to-noise ratio. Here we introduce AtacWorks, a deep learning toolkit to denoise sequencing coverage and identify regulatory peaks at base-pair resolution from low cell count, low-coverage, or low-quality ATAC-seq data. Models trained by AtacWorks can detect peaks from cell types not seen in the training data, and are generalizable across diverse sample preparations and experimental platforms. We demonstrate that AtacWorks enhances the sensitivity of single-cell experiments by producing results on par with those of conventional methods using ~10 times as many cells, and further show that this framework can be adapted to enable cross-modality inference of protein-DNA interactions. Finally, we establish that AtacWorks can enable new biological discoveries by identifying active regulatory regions associated with lineage priming in rare subpopulations of hematopoietic stem cells.
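AtacWorks performs the denoising with a deep neural network; purely as a toy stand-in for the denoise-then-call-peaks workflow it describes, the sketch below smooths a noisy coverage track with a rolling mean and thresholds the result. The function names and window/threshold choices are illustrative, not from the toolkit:

```python
# Toy stand-in for coverage denoising and peak calling: a rolling mean
# suppresses per-base noise, then positions above a threshold are
# reported as candidate peak bases.

def rolling_mean(track, window=3):
    half = window // 2
    out = []
    for i in range(len(track)):
        lo, hi = max(0, i - half), min(len(track), i + half + 1)
        out.append(sum(track[lo:hi]) / (hi - lo))
    return out

def call_peaks(track, threshold):
    return [i for i, v in enumerate(track) if v >= threshold]
```

A learned denoiser plays the role of `rolling_mean` here, but at base-pair resolution and with far better recovery of true signal from low-coverage data.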


Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 135 ◽  
Author(s):  
Siti Nurmaini ◽  
Annisa Darmawahyuni ◽  
Akhmad Noviar Sakti Mukti ◽  
Muhammad Naufal Rachmatullah ◽  
Firdaus Firdaus ◽  
...  

The electrocardiogram (ECG) is a widely used, noninvasive test for analyzing arrhythmia. However, the ECG signal is prone to contamination by different kinds of noise. Such noise may cause deformation of the ECG heartbeat waveform, leading to cardiologists’ mislabeling or misinterpreting heartbeats due to varying types of artifacts and interference. To address this problem, some previous studies propose a computerized technique based on machine learning (ML) to distinguish between normal and abnormal heartbeats. Unfortunately, ML works on a handcrafted, feature-based approach and lacks feature representation. To overcome such drawbacks, deep learning (DL) is proposed in the pre-training and fine-tuning phases to produce an automated feature representation for multi-class classification of arrhythmia conditions. In the pre-training phase, stacked denoising autoencoders (DAEs) and autoencoders (AEs) are used for feature learning; in the fine-tuning phase, deep neural networks (DNNs) are implemented as a classifier. To the best of our knowledge, this research is the first to implement stacked autoencoders by using DAEs and AEs for feature learning in DL. The experiments use Physionet’s well-known MIT-BIH Arrhythmia Database, as well as the MIT-BIH Noise Stress Test Database (NSTDB). Only four records are used from the NSTDB dataset: 118 24 dB, 118 −6 dB, 119 24 dB, and 119 −6 dB, with two levels of signal-to-noise ratio (SNR) at 24 dB and −6 dB. In the validation process, six models are compared to select the best DL model. For all fine-tuned hyperparameters, the best model of ECG heartbeat classification achieves an accuracy, sensitivity, specificity, precision, and F1-score of 99.34%, 93.83%, 99.57%, 89.81%, and 91.44%, respectively. As the results demonstrate, the proposed DL model can extract high-level features not only from the training data but also from unseen data. Such a model has good application prospects in clinical practice.
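The denoising-autoencoder idea, reconstructing the clean signal from a noise-corrupted copy, can be shown at its smallest possible scale with a single trainable weight. This is a hypothetical illustration of the training principle only, not the paper's stacked architecture:

```python
import random

# Toy one-weight "denoising autoencoder": the model y = w * noisy is
# trained by stochastic gradient descent to reconstruct the clean
# sample x from its noise-corrupted copy.

def train_denoiser(clean, noise_std=0.1, lr=0.01, epochs=200, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for x in clean:
            noisy = x + rng.gauss(0.0, noise_std)
            err = w * noisy - x       # reconstruction error
            w -= lr * err * noisy     # descend on the squared error
    return w
```

With low noise the learned weight approaches 1, i.e. the model learns to pass the underlying signal through while having been forced, during training, to see only corrupted inputs; stacking many such layers yields the feature-learning behavior the paper exploits.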


2021 ◽  
Vol 7 (3) ◽  
pp. 44
Author(s):  
Johannes Leuschner ◽  
Maximilian Schmidt ◽  
Poulami Somanya Ganguly ◽  
Vladyslav Andriiashen ◽  
Sophia Bethany Coban ◽  
...  

The reconstruction of computed tomography (CT) images is an active area of research. Following the rise of deep learning methods, many data-driven models have been proposed in recent years. In this work, we present the results of a data challenge that we organized, bringing together algorithm experts from different institutes to jointly work on quantitative evaluation of several data-driven methods on two large, public datasets during a ten-day sprint. We focus on two applications of CT, namely, low-dose CT and sparse-angle CT. This enables us to fairly compare different methods using standardized settings. As a general result, we observe that the deep learning-based methods are able to improve the reconstruction quality metrics in both CT applications while the top performing methods show only minor differences in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). We further discuss a number of other important criteria that should be taken into account when selecting a method, such as the availability of training data, the knowledge of the physical measurement model and the reconstruction speed.
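PSNR, one of the two quality metrics compared in the challenge, is fully determined by the mean squared error and the peak intensity; a minimal sketch for flat image arrays:

```python
import math

def psnr(reference, reconstruction, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images (flat lists)."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) \
        / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Because PSNR is a monotone function of MSE alone, methods with nearly identical MSE are indistinguishable by it, which is why the challenge also reports SSIM and discusses non-metric criteria.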


2022 ◽  
Vol 14 (2) ◽  
pp. 263
Author(s):  
Haixia Zhao ◽  
Tingting Bai ◽  
Zhiqiang Wang

Seismic field data are usually contaminated by random or complex noise, which seriously degrades data quality and hampers seismic imaging and interpretation. Improving the signal-to-noise ratio (SNR) of seismic data has always been a key step in seismic data processing. Deep learning approaches have been successfully applied to suppress seismic random noise. Training examples are essential in deep learning methods, especially for geophysical problems, where complete training data are not easy to acquire due to the high cost of acquisition. In this work, we propose a deep learning method pre-trained on natural images to suppress seismic random noise, drawing on insights from transfer learning. Our network contains pre-trained and post-trained networks: the former is trained on natural images to obtain preliminary denoising results, while the latter is trained on a small number of seismic images by semi-supervised learning to fine-tune the denoising and enhance the continuity of geological structures. Results on four types of synthetic seismic data and six field datasets demonstrate that our network performs well in seismic random noise suppression in terms of both quantitative metrics and visual quality.
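The SNR that denoising aims to improve is conventionally measured in decibels against a clean reference, as the ratio of signal power to residual-noise power; a minimal sketch:

```python
import math

# SNR in dB of a (possibly denoised) trace against a clean reference:
# 10 * log10( power of clean signal / power of residual noise ).

def snr_db(clean, noisy):
    p_sig = sum(c * c for c in clean)
    p_res = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_sig / p_res)
```

On synthetic data the clean trace is known exactly, so SNR before and after denoising quantifies the improvement; on field data, where no clean reference exists, visual continuity of geological structures serves as the complementary check.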

