A Novel Object-Based Deep Learning Framework for Semantic Segmentation of Very High-Resolution Remote Sensing Data: Comparison with Convolutional and Fully Convolutional Networks

Deep learning architectures have received much attention in recent years, demonstrating state-of-the-art performance in several segmentation, classification and other computer vision tasks. Most of these deep networks are based on either convolutional or fully convolutional architectures. In this paper, we propose a novel object-based deep-learning framework for semantic segmentation in very high-resolution satellite data. In particular, we exploit object-based priors integrated into a fully convolutional neural network by incorporating an anisotropic diffusion data preprocessing step and an additional loss term during the training process. Under this constrained framework, the goal is to force pixels that belong to the same object to be classified into the same semantic category. We thoroughly compared the novel object-based framework with the currently dominant convolutional and fully convolutional deep networks. In particular, numerous experiments were conducted on the publicly available ISPRS WGII/4 benchmark datasets, namely Vaihingen and Potsdam, for validation and inter-comparison based on a variety of metrics. Quantitatively, experimental results indicate that, overall, the proposed object-based framework slightly outperformed the current state-of-the-art fully convolutional networks by more than 1% in terms of overall accuracy, while intersection over union results improved for all semantic categories. Qualitatively, man-made classes with stricter geometry, such as buildings, benefited most from our method, especially along object boundaries, highlighting the great potential of the developed approach.
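
The abstract does not spell out the additional loss term. As a purely illustrative sketch (not the authors' formulation), one way to encourage pixels of the same object to share a class is to penalize how far each pixel's predicted class probabilities deviate from the mean prediction of its object. The function name `object_consistency_penalty`, the flat-list data layout, and the squared-deviation form are all assumptions made for this example.

```python
from collections import defaultdict


def object_consistency_penalty(probs, object_ids):
    """Mean squared deviation of each pixel's class probabilities from
    the average probabilities of the object (segment) it belongs to.

    probs      -- list of per-pixel probability vectors, one per pixel
    object_ids -- list of object labels, same length as probs
    Hypothetical helper for illustration only; not the paper's loss.
    """
    if len(probs) != len(object_ids):
        raise ValueError("probs and object_ids must have the same length")
    if not probs:
        return 0.0

    # Group pixel indices by the object they belong to.
    members = defaultdict(list)
    for idx, obj in enumerate(object_ids):
        members[obj].append(idx)

    total = 0.0
    for idxs in members.values():
        n_classes = len(probs[idxs[0]])
        # Mean probability vector over all pixels of this object.
        mean = [sum(probs[i][c] for i in idxs) / len(idxs)
                for c in range(n_classes)]
        # Accumulate the squared deviation of every member pixel.
        for i in idxs:
            total += sum((probs[i][c] - mean[c]) ** 2
                         for c in range(n_classes))
    return total / len(probs)
```

The penalty is zero when every object is predicted uniformly and grows as predictions inside an object disagree. In a training setup it would presumably be added to the usual per-pixel classification loss with a weighting factor.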

Automatic Deep Learning Semantic Segmentation of Ultrasound Thyroid Cineclips using Recurrent Fully Convolutional Networks

IEEE Access ◽ 10.1109/access.2020.3045906 ◽ 2020 ◽ pp. 1-1

Author(s): Jeremy M. Webb ◽ Duane D. Meixner ◽ Shaheeda A. Adusei ◽ Eric C. Polley ◽ Mostafa Fatemi ◽ ...

Keyword(s): Deep Learning ◽ Semantic Segmentation ◽ Convolutional Networks ◽ Fully Convolutional Networks

A Multi-Task Deep Learning Framework Coupling Semantic Segmentation and Image Reconstruction for Very High Resolution Imagery

IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium ◽ 10.1109/igarss.2019.8898133 ◽ 2019

Author(s): Maria Papadomanolaki ◽ Konstantinos Karantzalos ◽ Maria Vakalopoulou

Keyword(s): Deep Learning ◽ High Resolution ◽ Image Reconstruction ◽ Semantic Segmentation ◽ Learning Framework ◽ High Resolution Imagery ◽ Very High Resolution Imagery ◽ Very High

Fully Convolutional Networks for Semantic Segmentation of Very High Resolution Remotely Sensed Images Combined With DSM

IEEE Geoscience and Remote Sensing Letters ◽ 10.1109/lgrs.2018.2795531 ◽ 2018 ◽ Vol 15 (3) ◽ pp. 474-478 ◽ Cited By ~ 51

Author(s): Weiwei Sun ◽ Ruisheng Wang

Keyword(s): High Resolution ◽ Semantic Segmentation ◽ Remotely Sensed ◽ Convolutional Networks ◽ Fully Convolutional Networks ◽ Remotely Sensed Images ◽ Very High

A Stacked Fully Convolutional Networks with Feature Alignment Framework for Multi-Label Land-cover Segmentation

Remote Sensing ◽ 10.3390/rs11091051 ◽ 2019 ◽ Vol 11 (9) ◽ pp. 1051 ◽ Cited By ~ 7

Author(s): Guangming Wu ◽ Yimin Guo ◽ Xiaoya Song ◽ Zhiling Guo ◽ Haoran Zhang ◽ ...

Keyword(s): Deep Learning ◽ Land Cover ◽ Superior Performance ◽ Learning Methods ◽ Convolutional Networks ◽ Fully Convolutional Networks ◽ Optimal Feature ◽ Imaging Conditions ◽ Very High ◽ Feature Alignment

Applying deep-learning methods, especially fully convolutional networks (FCNs), has become a popular option for land-cover classification or segmentation in remote sensing. Compared with traditional solutions, these approaches have shown promising generalization capabilities and precision levels on various datasets of different scales, resolutions, and imaging conditions. To achieve superior performance, much research has focused on constructing more complex or deeper networks. However, using an ensemble of different fully convolutional models to achieve better generalization and to prevent overfitting has long been ignored. In this research, we design four stacked fully convolutional networks (SFCNs) and a feature alignment framework for multi-label land-cover segmentation. The proposed feature alignment framework introduces an alignment loss on features extracted from the basic models to balance their similarity and variety. Experiments on a very high-resolution (VHR) image dataset with six land-cover categories indicate that the proposed SFCNs perform better than existing deep learning methods. In the second variant of SFCN, the optimal feature alignment yields increments of 4.2% (0.772 vs. 0.741), 6.8% (0.629 vs. 0.589), and 5.5% (0.727 vs. 0.689) in F1-score, Jaccard index, and kappa coefficient, respectively.
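
The quoted percentages appear to be relative rather than absolute differences between each pair of scores. A quick check under that assumption, using only the numbers from the abstract (the helper name `relative_gain` and the dictionary layout are made up for this example):

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Score pairs quoted in the abstract: (higher score, comparison score).
scores = {
    "f1": (0.772, 0.741),
    "jaccard": (0.629, 0.589),
    "kappa": (0.727, 0.689),
}

# Express each gain as a percentage rounded to one decimal place.
gains = {name: round(100 * relative_gain(new, base), 1)
         for name, (new, base) in scores.items()}
print(gains)
```

Rounded to one decimal place, this reproduces 4.2, 6.8, and 5.5, so the absolute gains are roughly 0.03 to 0.04 on each metric.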

Image-Based Plant Disease Identification by Deep Learning Meta-Architectures

Plants ◽ 10.3390/plants9111451 ◽ 2020 ◽ Vol 9 (11) ◽ pp. 1451

Author(s): Muhammad Hammad Saleem ◽ Sapna Khanchi ◽ Johan Potgieter ◽ Khalid Mahmood Arif

Keyword(s): Deep Learning ◽ Plant Disease ◽ State Of The Art ◽ Mean Average Precision ◽ Single Shot ◽ Average Precision ◽ Crop Monitoring ◽ Convolutional Networks ◽ Fully Convolutional Networks ◽ Agricultural Applications

The identification of plant disease is an essential part of crop monitoring systems. Computer vision and deep learning (DL) techniques have proven to be state-of-the-art in addressing various agricultural problems. This research performed the complex tasks of localization and classification of disease in plant leaves. To this end, three DL meta-architectures, namely the Single Shot MultiBox Detector (SSD), Faster Region-based Convolutional Neural Network (RCNN), and Region-based Fully Convolutional Networks (RFCN), were applied using the TensorFlow object detection framework. All the DL models were trained and tested on a controlled-environment dataset to recognize disease in plant species. Moreover, an improvement in the mean average precision of the best-performing deep learning architecture was attempted through different state-of-the-art deep learning optimizers. The SSD model trained with an Adam optimizer exhibited the highest mean average precision (mAP) of 73.07%. The successful identification of 26 different types of diseased and 12 types of healthy leaves in a single framework demonstrated the novelty of the work. In the future, the proposed detection methodology can also be adopted for other agricultural applications. Moreover, the generated weights can be reused for future real-time detection of plant disease in controlled or uncontrolled environments.

Fully Convolutional Networks and Geographic Object-Based Image Analysis for the Classification of VHR Imagery

Remote Sensing ◽ 10.3390/rs11050597 ◽ 2019 ◽ Vol 11 (5) ◽ pp. 597 ◽ Cited By ~ 21

Author(s): Nicholus Mboga ◽ Stefanos Georganos ◽ Tais Grippa ◽ Moritz Lennert ◽ Sabine Vanhuysse ◽ ...

Keyword(s): Democratic Republic Of Congo ◽ State Of The Art ◽ Computational Cost ◽ Convolutional Networks ◽ Object Based Image Analysis ◽ Fully Convolutional Networks ◽ Object Based ◽ Geographic Object ◽ End To End ◽ Future Work

Land-cover classified maps obtained from deep learning methods such as convolutional neural networks (CNNs) and fully convolutional networks (FCNs) usually have high classification accuracy, but the detailed structures of objects are lost or smoothed. In this work, we develop a methodology based on fully convolutional networks (FCNs) that is trained in an end-to-end fashion using only aerial RGB images as input. Skip connections are introduced into the FCN architecture to recover fine spatial details from the lower convolutional layers. The experiments are conducted on the city of Goma in the Democratic Republic of Congo. We compare the results to a state-of-the-art approach based on a semi-automatic geographic object-based image analysis (GEOBIA) processing chain. State-of-the-art classification accuracies are obtained by both methods, with the FCN and the best baseline method reaching overall accuracies of 91.3% and 89.5%, respectively. The maps have good visual quality, and the use of an FCN skip architecture minimizes the rounded edges that are characteristic of FCN maps. Additional experiments are done to refine FCN classified maps using segments obtained from GEOBIA, generated at different scales and minimum segment sizes. A high overall accuracy (OA) of up to 91.5% is achieved, accompanied by improved edge delineation in the FCN maps. Future work will involve explicitly incorporating boundary information from the GEOBIA segmentation into the FCN pipeline in an end-to-end fashion. Finally, we observe that the FCN has a lower computational cost than the standard patch-based CNN approach, especially at inference.
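
The abstract does not say how the GEOBIA segments are used to refine the FCN maps. One common option, shown here only as an assumed sketch, is a majority vote: every pixel in a segment receives the class predicted most often inside that segment. The function name `refine_by_segments` and the nested-list rasters are hypothetical.

```python
from collections import Counter, defaultdict


def refine_by_segments(class_map, segment_map):
    """Relabel each segment with the most frequent class inside it.

    class_map   -- 2D list of per-pixel class labels (e.g. FCN output)
    segment_map -- 2D list of segment ids with the same shape
    Majority voting is an assumption; the source does not name the rule.
    """
    # Count how often each class occurs within each segment.
    votes = defaultdict(Counter)
    for class_row, seg_row in zip(class_map, segment_map):
        for cls, seg in zip(class_row, seg_row):
            votes[seg][cls] += 1

    # Pick the winning class per segment (ties go to the class seen first).
    winner = {seg: counts.most_common(1)[0][0]
              for seg, counts in votes.items()}

    # Paint every pixel with the winning class of its segment.
    return [[winner[seg] for seg in seg_row] for seg_row in segment_map]


fcn_map = [[1, 1, 2],
           [1, 2, 2]]
segments = [[0, 0, 1],
            [0, 0, 1]]
refined = refine_by_segments(fcn_map, segments)
```

In this toy case the single stray pixel of class 2 inside segment 0 is overwritten by the segment's dominant class 1, which is how segment boundaries can sharpen the edges of a pixel-wise map.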

Deep Residual Autoencoder with Multiscaling for Semantic Segmentation of Land-Use Images

Remote Sensing ◽ 10.3390/rs11182142 ◽ 2019 ◽ Vol 11 (18) ◽ pp. 2142 ◽ Cited By ~ 5

Author(s): Lianfa Li

Keyword(s): Land Use ◽ Deep Learning ◽ Semantic Segmentation ◽ Remotely Sensed ◽ Convolutional Network ◽ Convolutional Networks ◽ Residual Learning ◽ Fully Convolutional Networks ◽ Remotely Sensed Images ◽ Real World Datasets

Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in the efficiency and accuracy of semantic segmentation of general images. Typical models range from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for segmentation of general or medical images and videos, and are not directly applicable to remotely sensed images. Studies of deep learning for semantic segmentation of remotely sensed images are limited. This paper presents a novel, flexible autoencoder-based deep learning architecture that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images and enhance robustness to scale variations. The residual learning and multiscaling strategies improve the trained model's generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI) or the mean intersection over union (MIoU) by 4-11% in the training phase and by 3-9% in the validation and testing phases. With its flexible deep learning architecture, the proposed approach can be easily applied and transferred to semantic segmentation of land-use variables and other surface variables of remotely sensed images.
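
One plausible reason additive residual connections need fewer trainable parameters than concatenated shortcuts is that addition keeps the channel count unchanged, whereas concatenation doubles the input width of the next convolution. The arithmetic below is a sketch under stated assumptions (a single 3x3 convolution directly after the merge, equal channel counts on both branches); the helper names are hypothetical and the layer sizes are not taken from the paper.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3, bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def params_after_shortcut(channels, mode):
    """Parameters of the convolution that follows a shortcut merge.

    mode == "add":    encoder and decoder features are summed, so the
                      convolution still sees `channels` inputs.
    mode == "concat": features are stacked (U-Net style), so the
                      convolution sees 2 * `channels` inputs.
    """
    if mode == "add":
        merged = channels
    elif mode == "concat":
        merged = 2 * channels
    else:
        raise ValueError("mode must be 'add' or 'concat'")
    return conv2d_params(merged, channels)


add_cost = params_after_shortcut(64, "add")        # 3*3*64*64 + 64
concat_cost = params_after_shortcut(64, "concat")  # 3*3*128*64 + 64
```

With 64 channels, the convolution after a concatenation carries roughly twice as many weights as the one after an addition, and the gap repeats at every decoder level.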

Semantic segmentation of very high resolution remote sensing images with residual logic deep fully convolutional networks

MIPPR 2019: Remote Sensing Image Processing, Geographic Information Systems, and Other Applications ◽ 10.1117/12.2541818 ◽ 2020

Author(s): Sheng He ◽ Jin Liu

Keyword(s): Remote Sensing ◽ High Resolution ◽ Semantic Segmentation ◽ Remote Sensing Images ◽ Convolutional Networks ◽ Fully Convolutional Networks ◽ Very High

Depth Density Achieves a Better Result for Semantic Segmentation with the Kinect System

Sensors ◽ 10.3390/s20030812 ◽ 2020 ◽ Vol 20 (3) ◽ pp. 812 ◽ Cited By ~ 1

Author(s): Hanbing Deng ◽ Tongyu Xu ◽ Yuncheng Zhou ◽ Teng Miao

Keyword(s): Neural Networks ◽ Image Segmentation ◽ Deep Learning ◽ Semantic Segmentation ◽ Depth Image ◽ Depth Information ◽ Convolutional Networks ◽ Automatic Feature Extraction ◽ Fully Convolutional Networks ◽ Simmental Cattle

Image segmentation is one of the most important methods for animal phenome research. Since the advent of deep learning, many researchers have turned to multilayer convolutional neural networks to solve image segmentation problems. Such a network simplifies the task of image segmentation through automatic feature extraction. However, many networks struggle to output accurate details in pixel-level segmentation. In this paper, we propose a new concept: depth density. Based on a depth image produced by a Kinect system, we design a new function to calculate the depth density value of each pixel and feed this value back into the semantic segmentation result to improve accuracy. In the experiment, we choose Simmental cattle as the segmentation target and fully convolutional networks (FCNs) as the verification networks. We show that depth density can improve four semantic segmentation metrics (pixel accuracy, mean accuracy, mean intersection over union, and frequency-weighted intersection over union) by 2.9%, 0.3%, 11.4%, and 5.02%, respectively. The results show that depth information produced by Kinect can improve the accuracy of FCN semantic segmentation. This provides a new way of analyzing the phenotype information of animals.
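
The four metrics named above are commonly computed from a confusion matrix. The sketch below uses the standard definitions from the FCN literature and assumes, without confirmation from the source, that the paper uses the same ones. The function name `segmentation_metrics` and the nested-list matrix layout are choices made for this example.

```python
def segmentation_metrics(confusion):
    """Standard segmentation metrics from a square confusion matrix.

    confusion[i][j] = number of pixels of true class i predicted as j.
    Classes with no ground-truth pixels are skipped in the averages.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    true_counts = [sum(confusion[i]) for i in range(n)]          # row sums
    pred_counts = [sum(confusion[i][j] for i in range(n))        # column sums
                   for j in range(n)]
    correct = [confusion[i][i] for i in range(n)]                # diagonal
    present = [i for i in range(n) if true_counts[i] > 0]

    # Per-class IoU: intersection / (truth + prediction - intersection).
    iou = {i: correct[i] / (true_counts[i] + pred_counts[i] - correct[i])
           for i in present}

    return {
        "pixel_accuracy": sum(correct) / total,
        "mean_accuracy": sum(correct[i] / true_counts[i]
                             for i in present) / len(present),
        "mean_iou": sum(iou.values()) / len(present),
        "frequency_weighted_iou": sum(true_counts[i] * iou[i]
                                      for i in present) / total,
    }


# Toy two-class example (e.g. background vs. cattle), 10 pixels in total.
metrics = segmentation_metrics([[3, 1],
                                [1, 5]])
```

Frequency-weighted IoU weights each class by its share of ground-truth pixels, so a large background class influences it more than it influences the plain mean IoU.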
