Fig Plant Segmentation from Aerial Images Using a Deep Convolutional Encoder-Decoder Network

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under open-field difficult circumstances of complex lighting conditions and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and a highly variable visual appearance of the leaves. We make available our CNN code to the research community, as well as the aerial image data set and a hand-made ground truth segmentation with pixel precision to facilitate the comparison among different algorithms.

Download Full-text

A Multi-Task Network with Distance–Mask–Boundary Consistency Constraints for Building Extraction from Aerial Images

Remote Sensing ◽

10.3390/rs13142656 ◽

2021 ◽

Vol 13 (14) ◽

pp. 2656

Author(s):

Furong Shi ◽

Tong Zhang

Keyword(s):

Distance Estimation ◽

Image Data ◽

Learning Technologies ◽

Aerial Images ◽

Superior Performance ◽

Aerial Image ◽

Great Success ◽

Building Extraction ◽

Shape Information ◽

Multi Scale

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from areal images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundary. In order to compensate for the loss of shape information, two shape-related auxiliary tasks (i.e., boundary prediction and distance estimation) were jointly learned with building segmentation task in our proposed network. Meanwhile, two consistency constraint losses were designed based on the multi-task network to exploit the duality between the mask prediction and two shape-related information predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on the multi-scale features, one regression loss and two classification losses were used for predicting the distance-transform map, segmentation, and boundary. Two inter-task consistency-loss functions were constructed to ensure the consistency between distance maps and masks, and the consistency between masks and boundary maps. Experimental results on three public aerial image data sets showed that our method achieved superior performance over the recent state-of-the-art models.

Download Full-text

CRITICAL ASSESSMENT OF OBJECT SEGMENTATION IN AERIAL IMAGE USING GEO-HAUSDORFF DISTANCE

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xli-b4-187-2016 ◽

2016 ◽

Vol XLI-B4 ◽

pp. 187-194 ◽

Cited By ~ 1

Author(s):

H. Sun ◽

Y. Ding ◽

Y. Huang ◽

G. Wang

Keyword(s):

Hausdorff Distance ◽

Object Segmentation ◽

Signal To Noise Ratio ◽

Limited Resource ◽

Ground Truth ◽

Aerial Images ◽

Aerial Image ◽

Spatial Distance ◽

Valuable Insight ◽

Spatial Correspondence

Aerial Image records the large-range earth objects with the ever-improving spatial and radiometric resolution. It becomes a powerful tool for earth observation, land-coverage survey, geographical census, etc., and helps delineating the boundary of different kinds of objects on the earth both manually and automatically. In light of the geo-spatial correspondence between the pixel locations of aerial image and the spatial coordinates of ground objects, there is an increasing need of super-pixel segmentation and high-accuracy positioning of objects in aerial image. Besides the commercial software package of eCognition and ENVI, many algorithms have also been developed in the literature to segment objects of aerial images. But how to evaluate the segmentation results remains a challenge, especially in the context of the geo-spatial correspondence. The Geo-Hausdorff Distance (GHD) is proposed to measure the geo-spatial distance between the results of various object segmentation that can be done with the manual ground truth or with the automatic algorithms.Based on the early-breaking and random-sampling design, the GHD calculates the geographical Hausdorff distance with nearly-linear complexity. Segmentation results of several state-of-the-art algorithms, including those of the commercial packages, are evaluated with a diverse set of aerial images. They have different signal-to-noise ratio around the object boundaries and are hard to trace correctly even for human operators. The GHD value is analyzed to comprehensively measure the suitability of different object segmentation methods for aerial images of different spatial resolution. By critically assessing the strengths and limitations of the existing algorithms, the paper provides valuable insight and guideline for extensive research in automating object detection and classification of aerial image in the nation-wide geographic census. It is also promising for the optimal design of operational specification of remote sensing interpretation under the constraints of limited resource.

Download Full-text

Deep Learning Segmentation and 3D Reconstruction of Road Markings Using Multiview Aerial Imagery

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8010047 ◽

2019 ◽

Vol 8 (1) ◽

pp. 47 ◽

Cited By ~ 1

Author(s):

Franz Kurz ◽

Seyed Azimi ◽

Chun-Yu Sheu ◽

Pablo d’Angelo

Keyword(s):

3D Reconstruction ◽

Autonomous Vehicles ◽

Automatic Segmentation ◽

Image Data ◽

Autonomous Driving ◽

Aerial Imagery ◽

Aerial Images ◽

Aerial Image ◽

Matching Problem ◽

Convolutional Network

The 3D information of road infrastructures is growing in importance with the development of autonomous driving. In this context, the exact 2D position of road markings as well as height information play an important role in, e.g., lane-accurate self-localization of autonomous vehicles. In this paper, the overall task is divided into an automatic segmentation followed by a refined 3D reconstruction. For the segmentation task, we applied a wavelet-enhanced fully convolutional network on multiview high-resolution aerial imagery. Based on the resulting 2D segments in the original images, we propose a successive workflow for the 3D reconstruction of road markings based on a least-squares line-fitting in multiview imagery. The 3D reconstruction exploits the line character of road markings with the aim to optimize the best 3D line location by minimizing the distance from its back projection to the detected 2D line in all the covering images. Results showed an improved IoU of the automatic road marking segmentation by exploiting the multiview character of the aerial images and a more accurate 3D reconstruction of the road surface compared to the semiglobal matching (SGM) algorithm. Further, the approach avoids the matching problem in non-textured image parts and is not limited to lines of finite length. In this paper, the approach is presented and validated on several aerial image data sets covering different scenarios like motorways and urban regions.

Download Full-text

Multiple-Oriented and Small Object Detection with Convolutional Neural Networks for Aerial Image

Remote Sensing ◽

10.3390/rs11182176 ◽

2019 ◽

Vol 11 (18) ◽

pp. 2176 ◽

Cited By ~ 3

Author(s):

Chen ◽

Zhong ◽

Tan

Keyword(s):

Neural Networks ◽

Object Detection ◽

Convolutional Neural Networks ◽

Aerial Images ◽

Superior Performance ◽

Aerial Image ◽

Detection Accuracy ◽

Small Object ◽

Data Set ◽

Orientation Information

Detecting objects in aerial images is a challenging task due to multiple orientations and relatively small size of the objects. Although many traditional detection models have demonstrated an acceptable performance by using the imagery pyramid and multiple templates in a sliding-window manner, such techniques are inefficient and costly. Recently, convolutional neural networks (CNNs) have successfully been used for object detection, and they have demonstrated considerably superior performance than that of traditional detection methods; however, this success has not been expanded to aerial images. To overcome such problems, we propose a detection model based on two CNNs. One of the CNNs is designed to propose many object-like regions that are generated from the feature maps of multi scales and hierarchies with the orientation information. Based on such a design, the positioning of small size objects becomes more accurate, and the generated regions with orientation information are more suitable for the objects arranged with arbitrary orientations. Furthermore, another CNN is designed for object recognition; it first extracts the features of each generated region and subsequently makes the final decisions. The results of the extensive experiments performed on the vehicle detection in aerial imagery (VEDAI) and overhead imagery research data set (OIRDS) datasets indicate that the proposed model performs well in terms of not only the detection accuracy but also the detection speed.

Download Full-text

EFN: Field-Based Object Detection for Aerial Images

Remote Sensing ◽

10.3390/rs12213630 ◽

2020 ◽

Vol 12 (21) ◽

pp. 3630

Author(s):

Jin Liu ◽

Haokun Zheng

Keyword(s):

Object Detection ◽

Semantic Segmentation ◽

Natural Images ◽

Aerial Images ◽

Aerial Image ◽

Data Sets ◽

Data Set ◽

Bounding Box ◽

Good Score ◽

Bounding Boxes

Object detection and recognition in aerial and remote sensing images has become a hot topic in the field of computer vision in recent years. As these images are usually taken from a bird’s-eye view, the targets often have different shapes and are densely arranged. Therefore, using an oriented bounding box to mark the target is a mainstream choice. However, this general method is designed based on horizontal box annotation, while the improved method for detecting an oriented bounding box has a high computational complexity. In this paper, we propose a method called ellipse field network (EFN) to organically integrate semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. At the same time, we also tested the performance of EFN on natural images and obtained a mAP of 84.7 in the VOC2012 data set. These extensive experiments demonstrate that EFN can achieve state-of-the-art results in aerial image tests and can obtain a good score when considering natural images.

Download Full-text

Mobile Robot Navigation Utilizing the WEB Based Aerial Images Without Prior Teaching Run

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2017.p0697 ◽

2017 ◽

Vol 29 (4) ◽

pp. 697-705 ◽

Cited By ~ 3

Author(s):

Satoshi Muramatsu ◽

Tetsuo Tomizawa ◽

Shunsuke Kudoh ◽

Takashi Suehiro ◽

◽

...

Keyword(s):

Image Data ◽

Aerial Images ◽

Mobile Robot Navigation ◽

Aerial Image ◽

Map Matching ◽

Robot Localization ◽

Web Based ◽

Map Data ◽

Component Map ◽

Robot Position

In order to realize the work of goods conveyance etc. by robot, localization of robot position is fundamental technology component. Map matching methods is one of the localization technique. In map matching method, usually, to create the map data for localization, we have to operate the robot and measure the environment (teaching run). This operation requires a lot of time and work. In recent years, due to improved Internet services, aerial image data is easily obtained from Google Maps etc. Therefore, we utilize the aerial images as a map data to for mobile robots localization and navigation without teaching run. In this paper, we proposed the robot localization and navigation technique using aerial images. We verified the proposed technique by the localization and autonomous running experiment.

Download Full-text

TERRAIN AWARE IMAGE CLIPPING FOR REAL-TIME AERIAL MAPPING

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-1-61-2018 ◽

2018 ◽

Vol IV-1 ◽

pp. 61-68

Author(s):

D. Hein ◽

R. Berger

Keyword(s):

Real Time ◽

Image Data ◽

Aerial Images ◽

Aerial Image ◽

Sensing Applications ◽

Data Links ◽

Map Generation ◽

Aerial Mapping ◽

Using Data ◽

Terrain Elevation

Abstract. Many remote sensing applications demand for a fast and efficient way of generating orthophoto maps from raw aerial images. One prerequisite is direct georeferencing, which allows to geolocate aerial images to their geographic position on the earth’s surface. But this is only half the story. When dealing with a large quantity of highly overlapping images, a major challenge is to select the most suitable image parts in order to generate seamless aerial maps of the captured area. This paper proposes a method that quickly determines such an optimal (rectangular) section for each single aerial image, which in turn can be used for generating seamless aerial maps. Its key approach is to clip aerial images depending on their geometric intersections with a terrain elevation model of the captured area, which is why we call it terrain aware image clipping (TAC). The method has a modest computational footprint and is therefore applicable even for rather limited embedded vision systems. It can be applied for both, real-time aerial mapping applications using data links as well as for rapid map generation right after landing without any postprocessing step. Referring to real-time applications, this method also minimizes transmission of redundant image data. The proposed method has already been demonstrated in several search-and-rescue scenarios and real-time mapping applications using a broadband data link and diffent kinds of camera and carrier systems. Moreover, a patent for this technology is pending.

Download Full-text

Learning Oriented Region-based Convolutional Neural Networks for Building Detection in Satellite Remote Sensing Images

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-1-w1-461-2017 ◽

2017 ◽

Vol XLII-1/W1 ◽

pp. 461-464 ◽

Cited By ~ 12

Author(s):

C. Chen ◽

W. Gong ◽

Y. Hu ◽

Y. Chen ◽

Y. Ding

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Fundamental Problem ◽

Aerial Images ◽

Aerial Image ◽

Building Detection ◽

Data Set ◽

Novel Model ◽

Oriented Layer ◽

Oriented Region

The automated building detection in aerial images is a fundamental problem encountered in aerial and satellite images analysis. Recently, thanks to the advances in feature descriptions, Region-based CNN model (R-CNN) for object detection is receiving an increasing attention. Despite the excellent performance in object detection, it is problematic to directly leverage the features of R-CNN model for building detection in single aerial image. As we know, the single aerial image is in vertical view and the buildings possess significant directional feature. However, in R-CNN model, direction of the building is ignored and the detection results are represented by horizontal rectangles. For this reason, the detection results with horizontal rectangle cannot describe the building precisely. To address this problem, in this paper, we proposed a novel model with a key feature related to orientation, namely, Oriented R-CNN (OR-CNN). Our contributions are mainly in the following two aspects: 1) Introducing a new oriented layer network for detecting the rotation angle of building on the basis of the successful VGG-net R-CNN model; 2) the oriented rectangle is proposed to leverage the powerful R-CNN for remote-sensing building detection. In experiments, we establish a complete and bran-new data set for training our oriented R-CNN model and comprehensively evaluate the proposed method on a publicly available building detection data set. We demonstrate State-of-the-art results compared with the previous baseline methods.

Download Full-text

Comparison of Classical Computer Vision vs. Convolutional Neural Networks for Weed Mapping in Aerial Images

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.97835 ◽

2020 ◽

Vol 27 (4) ◽

pp. 20-33

Author(s):

Paulo César Pereira Júnior ◽

Alexandre Monteiro ◽

Rafael Da Luz Ribeiro ◽

Antonio Carlos Sobieranski ◽

Aldo Von Wangenheim

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Convolutional Neural Networks ◽

Precision Agriculture ◽

Ground Truth ◽

Aerial Images ◽

Weed Mapping ◽

Classical Models ◽

Classical Computer ◽

Better Than

In this paper, we present a comparison between convolutional neural networks and classicalcomputer vision approaches, for the specific precision agriculture problem of weed mapping on sugarcane fields aerial images. A systematic literature review was conducted to find which computer vision methods are being used on this specific problem. The most cited methods were implemented, as well as four models of convolutional neural networks. All implemented approaches were tested using the same dataset, and their results were quantitatively and qualitatively analyzed. The obtained results were compared to a human expert made ground truth, for validation. The results indicate that the convolutional neural networks present better precision and generalize better than the classical models

Download Full-text

A NEW PARADIGM FOR MATCHING UAV- AND AERIAL IMAGES

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsannals-iii-3-83-2016 ◽

2016 ◽

Vol III-3 ◽

pp. 83-90 ◽

Cited By ~ 3

Author(s):

T. Koch ◽

X. Zhuo ◽

P. Reinartz ◽

F. Fraundorfer

Keyword(s):

Real World ◽

Image Matching ◽

Feature Matching ◽

Ground Truth ◽

Geometric Constraints ◽

Feature Point ◽

Aerial Images ◽

Aerial Image ◽

Ratio Test ◽

New Paradigm

This paper investigates the performance of SIFT-based image matching regarding large differences in image scaling and rotation, as this is usually the case when trying to match images captured from UAVs and airplanes. This task represents an essential step for image registration and 3d-reconstruction applications. Various real world examples presented in this paper show that SIFT, as well as A-SIFT perform poorly or even fail in this matching scenario. Even if the scale difference in the images is known and eliminated beforehand, the matching performance suffers from too few feature point detections, ambiguous feature point orientations and rejection of many correct matches when applying the ratio-test afterwards. Therefore, a new feature matching method is provided that overcomes these problems and offers thousands of matches by a novel feature point detection strategy, applying a one-to-many matching scheme and substitute the ratio-test by adding geometric constraints to achieve geometric correct matches at repetitive image regions. This method is designed for matching almost nadir-directed images with low scene depth, as this is typical in UAV and aerial image matching scenarios. We tested the proposed method on different real world image pairs. While standard SIFT failed for most of the datasets, plenty of geometrical correct matches could be found using our approach. Comparing the estimated fundamental matrices and homographies with ground-truth solutions, mean errors of few pixels can be achieved.

Download Full-text