Three-stage segmentation of lung region from CT images using deep neural networks

2021
Vol 21 (1)
Author(s):
Michael Osadebey
Hilde K. Andersen
Dag Waaler
Kristian Fossaa
Anne C. T. Martinsen
...

Abstract

Background: Lung region segmentation is an important stage of automated image-based approaches to the diagnosis of respiratory diseases. Manual segmentation by experts is considered the gold standard, but it is time consuming and its accuracy depends on the radiologist's experience. Automated methods are relatively fast and reproducible, with the potential to facilitate physician interpretation of images. However, these benefits are possible only after overcoming several challenges. Traditional methods, formulated as a three-stage segmentation, demonstrate promising results on normal CT data but perform poorly in the presence of pathological features and variations in image quality attributes. Deep learning methods can outperform traditional methods, but their implementation depends on the quantity, quality, cost, and generation time of training data. Thus, an efficient and clinically relevant automated segmentation method is desired for the diagnosis of respiratory diseases.

Methods: We implement each of the three stages of the traditional methods using deep learning models trained on five different configurations of training data, with ground truths obtained from the 3D Image Reconstruction for Comparison of Algorithm Database (3DIRCAD) and the Interstitial Lung Diseases (ILD) database. The data were augmented with the Lung Image Database Consortium (LIDC-IDRI) image collection and a realistic phantom. At the preprocessing stage, a convolutional neural network (CNN) classifies the input into lung and non-lung regions. The processing stage was implemented with a CNN-based U-Net, while the postprocessing stage uses another U-Net for contour refinement and a CNN for filtering out false positives.

Results: The performance of the proposed method was evaluated on 1230 and 1100 CT slices from the 3DIRCAD and ILD databases, respectively. We investigate the performance of the proposed method on five configurations of training data and three configurations of the segmentation system: the full three-stage segmentation, and the three-stage segmentation without the CNN classifier and without contrast enhancement, respectively. The Dice scores recorded by the proposed method range from 0.76 to 0.95.

Conclusion: The clinical relevance and segmentation accuracy of deep learning models can improve through deep learning-based three-stage segmentation, image quality evaluation and enhancement, and augmentation of the training data with large volumes of cheap, good-quality training data. We also propose a novel deep learning-based method of contour refinement.
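
As a rough illustration of this pipeline, the following PyTorch sketch wires the three stages together. The layer sizes and the helper models (seg_unet, refine_unet, fp_filter) are illustrative assumptions, not the architectures used in the paper.

```python
# Minimal sketch of the three-stage segmentation pipeline described above.
# All architectures and thresholds here are assumptions for illustration.
import torch
import torch.nn as nn

class SliceClassifier(nn.Module):
    """Stage 1 (preprocessing): classify a CT slice as lung / non-lung."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # 0 = non-lung, 1 = lung

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def three_stage_segment(ct_slice, classifier, seg_unet, refine_unet, fp_filter):
    """Run the three stages on one CT slice (1xHxW tensor).

    seg_unet: U-Net taking 1 channel; refine_unet: U-Net taking 2 channels
    (slice + coarse mask); fp_filter: CNN giving keep/discard logits.
    """
    x = ct_slice.unsqueeze(0)
    # Stage 1: skip slices that contain no lung tissue at all.
    if classifier(x).argmax(1).item() == 0:
        return torch.zeros_like(ct_slice)
    # Stage 2: coarse lung segmentation with a U-Net.
    coarse = torch.sigmoid(seg_unet(x))
    # Stage 3: contour refinement (U-Net) + false-positive filtering (CNN).
    refined = torch.sigmoid(refine_unet(torch.cat([x, coarse], dim=1)))
    keep = fp_filter(refined).argmax(1).item() == 1
    return (refined.squeeze(0) > 0.5).float() if keep else torch.zeros_like(ct_slice)
```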

2021
Vol 13 (19)
pp. 3859
Author(s):  
Joby M. Prince Czarnecki
Sathishkumar Samiappan
Meilun Zhou
Cary Daniel McCraine
Louis L. Wasson

The radiometric quality of remotely sensed imagery is crucial for precision agriculture applications because estimates of plant health rely on the underlying image quality. Sky conditions, and specifically shadowing from clouds, are critical determinants of the quality of images that can be obtained from low-altitude sensing platforms. In this work, we first compare common deep learning approaches for classifying sky conditions with regard to cloud shadows in agricultural fields using a visible-spectrum camera. We then develop an artificial-intelligence-based edge computing system to fully automate the classification process. Training data consisting of 100 oblique-angle images of the sky were provided to a convolutional neural network and two deep residual neural networks (ResNet18 and ResNet34) to facilitate learning two classes, namely (1) good image quality expected, and (2) degraded image quality expected. The expectation of quality stemmed from the sky condition (i.e., density, coverage, and thickness of clouds) present at the time of image capture. These networks were tested using a set of 13,000 images. Our results demonstrate that the ResNet18 and ResNet34 classifiers produced better classification accuracy than a convolutional neural network classifier. The best overall accuracy was obtained by ResNet34, which was 92% accurate, with a Kappa statistic of 0.77. These results demonstrate a low-cost solution for quality control in future autonomous farming systems that will operate without human intervention and supervision.
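
As a hedged sketch of the kind of classifier compared here, the snippet below fine-tunes torchvision's ResNet18 for the two sky-condition classes. The optimizer, learning rate, and preprocessing are assumptions; the paper's exact training setup is not given in the abstract.

```python
# Two-class sky-condition classifier via transfer learning on ResNet18.
# Training hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # (1) good quality, (2) degraded

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """images: Nx3xHxW visible-spectrum sky photos; labels: 0=good, 1=degraded."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```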


Electronics
2019
Vol 8 (9)
pp. 944
Author(s):
Heesin Lee
Joonwhoan Lee

X-ray scattering significantly limits image quality. Conventional strategies for scatter reduction based on physical equipment or measurements inevitably increase the dose needed to improve the image quality. In addition, scatter reduction based on a computational algorithm can take a large amount of time. We propose a deep learning-based scatter correction method that adopts a convolutional neural network (CNN) for the restoration of degraded images. Because it is hard to obtain real data from an X-ray imaging system for training the network, Monte Carlo (MC) simulation was performed to generate the training data. To simulate X-ray images of a human chest, a cone beam CT (CBCT) system was designed and modeled as an example. Pairs of simulated images, corresponding to scattered and scatter-free images, respectively, were then obtained from the model at different doses. The scatter components, calculated by taking the differences of the pairs, were used as targets to train the weight parameters of the CNN. Compared with the MC-based iterative method, the proposed method shows better results in projected images, with as much as a 58.5% reduction in root-mean-square error (RMSE), and 18.1% and 3.4% average increases in peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), respectively.
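
A minimal sketch of this training scheme is shown below: the CNN regresses the scatter component from the scattered projection, and the corrected image is the input minus the predicted scatter. The network shown is a stand-in; the paper's CNN architecture is not specified in the abstract.

```python
# Scatter-component regression: target = scattered - scatter_free,
# correction = scattered - predicted scatter. Architecture is an assumption.
import torch
import torch.nn as nn

scatter_net = nn.Sequential(              # stand-in for the paper's CNN
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
mse = nn.MSELoss()

def train_step(scattered, scatter_free, optimizer):
    """MC-simulated pair (Nx1xHxW); the regression target is the scatter."""
    target_scatter = scattered - scatter_free
    optimizer.zero_grad()
    loss = mse(scatter_net(scattered), target_scatter)
    loss.backward()
    optimizer.step()
    return loss.item()

def correct(scattered):
    """Scatter-corrected projection = input minus predicted scatter."""
    with torch.no_grad():
        return scattered - scatter_net(scattered)
```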


2019
Vol 11 (9)
pp. 1051
Author(s):
Guangming Wu
Yimin Guo
Xiaoya Song
Zhiling Guo
Haoran Zhang
...  

Applying deep learning methods, especially fully convolutional networks (FCNs), has become a popular option for land-cover classification and segmentation in remote sensing. Compared with traditional solutions, these approaches have shown promising generalization capabilities and precision levels across datasets of different scales, resolutions, and imaging conditions. To achieve superior performance, much research has focused on constructing more complex or deeper networks. However, using an ensemble of different fully convolutional models to achieve better generalization and to prevent overfitting has long been ignored. In this research, we design four stacked fully convolutional networks (SFCNs) and a feature alignment framework for multi-label land-cover segmentation. The proposed feature alignment framework introduces an alignment loss on features extracted from the basic models to balance their similarity and variety. Experiments on a very high resolution (VHR) image dataset with six categories of land cover indicate that the proposed SFCNs outperform existing deep learning methods. In the second variant of the SFCN, the optimal feature alignment yields gains of 4.2% (0.772 vs. 0.741), 6.8% (0.629 vs. 0.589), and 5.5% (0.727 vs. 0.689) in F1 score, Jaccard index, and kappa coefficient, respectively.
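
One plausible reading of such an alignment loss is sketched below: it drives the distance between paired feature maps toward a small target margin, so the basic models stay similar without collapsing into identical representations. The exact loss used in the paper may differ, and the margin value is an assumption.

```python
# Illustrative feature-alignment loss between two stacked-FCN branches.
import torch
import torch.nn.functional as F

def alignment_loss(feat_a, feat_b, margin=0.1):
    """feat_a, feat_b: NxCxHxW feature maps from two basic models.

    |dist - margin| pulls the feature distance toward a nonzero target:
    small enough to enforce similarity, nonzero to preserve variety.
    """
    dist = F.mse_loss(feat_a, feat_b)
    return F.relu(dist - margin) + F.relu(margin - dist)  # == |dist - margin|
```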


2020
Vol 10 (23)
pp. 8400
Author(s):
Abdelkader Dairi
Fouzi Harrou
Ying Sun
Sofiane Khadraoui

The accurate modeling and forecasting of the power output of photovoltaic (PV) systems are critical to efficiently managing their integration into smart grids, delivery, and storage. This paper provides efficient short-term forecasting of solar power production using a Variational AutoEncoder (VAE) model. Adopting the VAE-driven deep learning model is expected to improve forecasting accuracy because of its strong performance in time-series modeling and its flexible nonlinear approximation capability. Both single- and multi-step-ahead forecasts are investigated in this work. Data from two grid-connected plants (a 243 kW parking lot canopy array in the US and a 9 MW PV system in Algeria) are employed to assess the investigated deep learning models' performance. Specifically, the forecasting outputs of the proposed VAE-based method were compared with those of seven deep learning methods, namely a recurrent neural network, long short-term memory (LSTM), bidirectional LSTM, convolutional LSTM, gated recurrent units, a stacked autoencoder, and a restricted Boltzmann machine, and two commonly used machine learning methods, namely logistic regression and support vector regression. The results of this investigation demonstrate the satisfactory performance of deep learning techniques for forecasting solar power and show that the VAE consistently performed better than the other methods. The results also confirm the superior performance of the deep learning models compared with the two baseline machine learning models.
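
For illustration, a minimal VAE-style forecaster is sketched below under the assumption (not stated in the abstract) that the encoder maps a window of past power values to a latent code and the decoder emits the next value. The window length and layer sizes are illustrative.

```python
# Sketch of a VAE used for one-step-ahead PV power forecasting.
# All shapes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class ForecastVAE(nn.Module):
    def __init__(self, window=24, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(window, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent)
        self.logvar = nn.Linear(64, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):                      # x: N x window of past values
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar         # next-step forecast (N x 1)

def vae_loss(pred, target, mu, logvar):
    """Reconstruction (forecast) error plus KL regularizer."""
    recon = nn.functional.mse_loss(pred, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```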


2020
Author(s):
Jack Scantlebury
Nathan Brown
Frank Von Delft
Charlotte M. Deane

Abstract

Current deep learning methods for structure-based virtual screening take the structures of both the protein and the ligand as input but make little or no use of the protein structure when predicting ligand binding. Here we show how a relatively simple method of dataset augmentation forces such deep learning methods to take into account information from the protein. Models trained in this way are more generalisable (they make better predictions on protein-ligand complexes drawn from a different distribution than the training data). They also assign more meaningful importance to the protein and ligand atoms involved in binding. Overall, our results show that dataset augmentation can help deep learning-based virtual screening to learn physical interactions rather than dataset biases.
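
One way to read this augmentation, sketched below, is to pair each active ligand with randomly drawn non-cognate pockets and label those pairs inactive, so a model cannot score well while ignoring the pocket. This is an illustrative interpretation of the abstract, not the authors' exact recipe.

```python
# Illustrative pocket-swap dataset augmentation for virtual screening.
import random

def augment(complexes, n_negatives_per_active=1, seed=0):
    """complexes: list of (pocket_id, ligand_id, label) with label 1 = binder."""
    rng = random.Random(seed)
    pockets = [p for p, _, _ in complexes]
    augmented = list(complexes)
    for pocket, ligand, label in complexes:
        if label != 1:
            continue
        for _ in range(n_negatives_per_active):
            decoy = rng.choice(pockets)
            if decoy != pocket:  # ligand posed in a non-cognate pocket -> negative
                augmented.append((decoy, ligand, 0))
    return augmented
```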


Author(s):  
J. Venton
P. M. Harris
A. Sundar
N. A. S. Smith
P. J. Aston

The electrocardiogram (ECG) is a widespread diagnostic tool in healthcare and supports the diagnosis of cardiovascular disorders. Deep learning methods are a successful and popular means of detecting indications of disorders from an ECG signal. However, there are open questions around the robustness of these methods to various factors, including physiological ECG noise. In this study, we generate clean and noisy versions of an ECG dataset before applying symmetric projection attractor reconstruction (SPAR) and scalogram image transformations. A convolutional neural network is used to classify these image transforms. For the clean ECG dataset, F1 scores for the SPAR attractor and scalogram transforms were 0.70 and 0.79, respectively. Scores decreased by less than 0.05 for the noisy ECG datasets. Notably, when the network trained on clean data was used to classify the noisy datasets, performance decreases of up to 0.18 in F1 score were seen. However, when the network trained on the noisy data was used to classify the clean dataset, the decrease was less than 0.05. We conclude that physiological ECG noise impacts classification using deep learning methods, and careful consideration should be given to the inclusion of noisy ECG signals in the training data when developing supervised networks for ECG classification. This article is part of the theme issue ‘Advanced computation in cardiovascular physiology: new challenges and opportunities’.
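
As an illustration of the scalogram transform (the SPAR attractor transform is a separate method, not shown), the snippet below computes a continuous-wavelet-transform scalogram of an ECG trace with PyWavelets. The wavelet choice and scale range are assumptions; the paper's settings may differ.

```python
# CWT scalogram of an ECG signal using PyWavelets.
import numpy as np
import pywt

def ecg_scalogram(signal, fs=360, scales=np.arange(1, 128)):
    """Return |CWT| coefficients as a 2-D image (scales x time)."""
    coeffs, _freqs = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    return np.abs(coeffs)

# Example on a synthetic 1-second "ECG-like" trace sampled at 360 Hz
t = np.linspace(0, 1, 360)
image = ecg_scalogram(np.sin(2 * np.pi * 5 * t))
print(image.shape)  # (127, 360)
```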


Author(s):  
A. Wichmann
A. Agoub
M. Kada

Machine learning methods have gained in importance through recent developments in artificial intelligence and computer hardware. Approaches based on deep learning, in particular, have shown that they can provide state-of-the-art results for various tasks. However, the direct application of deep learning methods to improve the results of 3D building reconstruction is often not possible due, for example, to the lack of suitable training data. To address this issue, we present RoofN3D, a new 3D point cloud training dataset that can be used to train machine learning models for different tasks in the context of 3D building reconstruction. It can be used, among other things, to train semantic segmentation networks or to learn the structure of buildings and geometric model construction. Further details about RoofN3D and the developed data preparation framework, which enables the automatic derivation of training data, are described in this paper. Furthermore, we provide an overview of other available 3D point cloud training data and of approaches from the current literature that present solutions for applying deep learning to unstructured, non-gridded 3D point cloud data.


Author(s):  
Priti P. Rege
Shaheera Akhter

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task, and it is necessary to improve the accuracy of an OCR system. Traditionally, separating text from a document has relied on feature extraction processes that require handcrafting of the features. Deep learning-based methods, by contrast, are excellent feature extractors that learn features from the training data automatically. Deep learning gives state-of-the-art results on various computer vision tasks, including image classification, segmentation, image captioning, object detection, and recognition. This chapter compares various traditional as well as deep learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned with transfer learning to obtain more precise results. The final results show that deep learning methods give more accurate results than conventional image processing methods for Devanagari text extraction.
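
A compact sketch of the kind of U-Net used for binary text/background segmentation is given below. The depth, channel widths, and input size are illustrative assumptions; the chapter's U-Net and ResU-Net models may differ.

```python
# Tiny U-Net for text (1) vs. background (0) segmentation of document images.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)        # 16 (skip) + 16 (upsampled) channels
        self.out = nn.Conv2d(16, 1, 1)   # 1-channel text mask (logits)

    def forward(self, x):
        e1 = self.enc1(x)                # skip-connection source
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d1)              # train with BCEWithLogitsLoss

# mask_logits = TinyUNet()(torch.randn(1, 1, 256, 256))  # grayscale page crop
```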


2018
Vol 2 (3)
pp. 57
Author(s):
Shehan Caldera
Alexander Rassau
Douglas Chai

For robots to attain more general-purpose utility, grasping is a necessary skill to master. Such general-purpose robots may use their perception abilities to visually identify grasps for a given object. A grasp describes how a robotic end-effector can be arranged to securely grab an object and successfully lift it without slippage. Traditionally, grasp detection requires expert human knowledge to analytically form a task-specific algorithm, but this is an arduous and time-consuming approach. During the last five years, deep learning methods have enabled significant advancements in robotic vision, natural language processing, and automated driving applications. The successful results of these methods have driven robotics researchers to explore their use in task-generalised robotic applications. This paper reviews the current state of the art in the application of deep learning methods to generalised robotic grasping and discusses how each element of the deep learning approach has improved overall robotic grasp detection performance. Several of the most promising approaches are evaluated, and the one-shot detection method is identified as the most suitable for real-time grasp detection. The availability of suitable volumes of appropriate training data is identified as a major obstacle to the effective utilisation of deep learning approaches, and the use of transfer learning techniques is proposed as a potential mechanism to address this. Finally, current trends in the field and potential future research directions are discussed.
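
Many of the reviewed detectors regress a 5-parameter grasp rectangle in the image plane; a small sketch of that representation is given below. Field names and the jaw-endpoint helper are illustrative, not tied to any specific paper in the review.

```python
# Common 5-parameter grasp-rectangle representation used by many detectors.
from dataclasses import dataclass
import math

@dataclass
class GraspRectangle:
    x: float       # grasp centre (pixels)
    y: float
    theta: float   # in-plane gripper orientation (radians)
    width: float   # gripper opening
    height: float  # jaw size

    def jaw_endpoints(self):
        """Centres of the two gripper jaws in image coordinates."""
        dx = 0.5 * self.width * math.cos(self.theta)
        dy = 0.5 * self.width * math.sin(self.theta)
        return (self.x - dx, self.y - dy), (self.x + dx, self.y + dy)
```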


2021
Vol 11 (1)
Author(s):
Mingyu Cha
Hansi Zheng
Amlan Talukder
Clayton Barham
Xiaoman Li
...  

Abstract

MicroRNAs (miRNAs) play important roles in post-transcriptional gene regulation and phenotype development. Understanding the regulation of miRNA genes is critical to understanding gene regulation in general. One of the challenges in studying miRNA gene regulation is the lack of condition-specific annotation of miRNA transcription start sites (TSSs). Unlike protein-coding genes, miRNA TSSs can be tens of thousands of nucleotides away from the precursor miRNAs, and they are hard to detect with conventional RNA-Seq experiments. A number of studies have attempted to computationally predict miRNA TSSs. However, high-resolution condition-specific miRNA TSS prediction remains a challenging problem. Recently, deep learning models have been successfully applied to various bioinformatics problems but had not been effectively developed for condition-specific miRNA TSS prediction. Here we created a two-stream deep learning model called D-miRT for the computational prediction of condition-specific miRNA TSSs (http://hulab.ucf.edu/research/projects/DmiRT/). D-miRT is a natural fit for the integration of low-resolution epigenetic features (DNase-Seq and histone modification data) and high-resolution sequence features. Compared with alternative computational models on different sets of training data, D-miRT outperformed all baseline models and demonstrated high accuracy for condition-specific miRNA TSS prediction tasks. Compared with the most recent approaches to cell-specific miRNA TSS identification, using cell lines that were unseen during model training, D-miRT also showed superior performance.
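
A hedged sketch of a two-stream architecture in the spirit of D-miRT is given below: one stream convolves one-hot DNA sequence, the other convolves binned epigenetic signal tracks, and the merged features score a candidate TSS. All shapes and layer sizes are assumptions; the published model will differ.

```python
# Illustrative two-stream network for condition-specific TSS scoring.
import torch
import torch.nn as nn

class TwoStreamTSS(nn.Module):
    def __init__(self, n_tracks=4):
        super().__init__()
        self.seq_stream = nn.Sequential(   # high-resolution sequence features
            nn.Conv1d(4, 32, 9, padding=4), nn.ReLU(), nn.AdaptiveMaxPool1d(16),
        )
        self.epi_stream = nn.Sequential(   # low-resolution epigenetic features
            nn.Conv1d(n_tracks, 32, 5, padding=2), nn.ReLU(), nn.AdaptiveMaxPool1d(16),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 16, 1))

    def forward(self, seq_onehot, epi_tracks):
        """seq_onehot: N x 4 x seq_len; epi_tracks: N x n_tracks x n_bins."""
        h = torch.cat([self.seq_stream(seq_onehot),
                       self.epi_stream(epi_tracks)], dim=1)
        return self.head(h)  # TSS score (logit)

# score = TwoStreamTSS()(torch.randn(2, 4, 1000), torch.randn(2, 4, 100))
```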

