AUTOMATIC OBJECT EXTRACTION FROM HIGH RESOLUTION AERIAL IMAGERY WITH SIMPLE LINEAR ITERATIVE CLUSTERING AND CONVOLUTIONAL NEURAL NETWORKS

Author(s):  
A. C. Carrilho
M. Galo

Recent advances in machine learning techniques for image classification have led to the development of robust approaches to both object detection and extraction. Traditional CNN architectures, such as LeNet, AlexNet and CaffeNet, usually take fixed-size images of objects as input and attempt to assign labels to them. Another possible approach is the Fast Region-based CNN (Fast R-CNN), which combines two models: (i) a Region Proposal Network (RPN) that generates a set of potential Regions of Interest (RoI) in the image; and (ii) a traditional CNN that assigns labels to the proposed RoI. As an alternative, this study proposes an approach to automatic object extraction from aerial images similar to the Fast R-CNN architecture, the main difference being the use of the Simple Linear Iterative Clustering (SLIC) algorithm instead of an RPN to generate the RoI. The dataset is composed of high-resolution aerial images, and the following classes were considered: house, sport court, hangar, building, swimming pool, tree, and street/road. The proposed method generates RoI of different sizes by running a multi-scale SLIC approach. The overall accuracy obtained for object detection was 89%, and the major advantage is that the proposed method is capable of semantic segmentation by assigning a label to each selected RoI. Some of the problems encountered are related to object proximity, where different instances appear merged in the results.
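
The pipeline described above can be illustrated with a short sketch. The Python fragment below is only a rough approximation of the described method, not the authors' code: it generates candidate RoI by running SLIC at several superpixel scales with scikit-image and then hands fixed-size crops to an arbitrary CNN classifier. The scale list, crop size, and `classifier` callable are assumptions.

```python
# Minimal sketch of multi-scale SLIC RoI generation followed by per-RoI
# classification. Not the published implementation; parameters are assumed.
import numpy as np
from skimage.segmentation import slic
from skimage.transform import resize

def multiscale_slic_rois(image, n_segments_list=(200, 800, 3200), compactness=10.0):
    """Return superpixel bounding boxes (row0, col0, row1, col1) at several scales."""
    rois = []
    for n_segments in n_segments_list:
        labels = slic(image, n_segments=n_segments, compactness=compactness, start_label=0)
        for lab in np.unique(labels):
            rows, cols = np.nonzero(labels == lab)
            rois.append((rows.min(), cols.min(), rows.max() + 1, cols.max() + 1))
    return rois

def classify_rois(image, rois, classifier, crop_size=(64, 64)):
    """Crop each RoI, resize it to the CNN input size, and label it with `classifier`."""
    results = []
    for r0, c0, r1, c1 in rois:
        crop = resize(image[r0:r1, c0:c1], crop_size, anti_aliasing=True)
        results.append(((r0, c0, r1, c1), classifier(crop)))
    return results
```

Running SLIC with several `n_segments` values yields both coarse and fine candidate regions, which is one plausible reading of the "multi-scale SLIC approach" mentioned in the abstract.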

1989
Author(s):
Mohan M. Trivedi
Amol G. Bokil
Mourad B. Takla
George B. Maksymonko
J. Thomas Broach

2020
Vol 12 (5)
pp. 784
Author(s):
Wei Guo
Weihong Li
Weiguo Gong
Jinkai Cui

Multi-scale object detection is a basic challenge in computer vision. Although many advanced methods based on convolutional neural networks have succeeded on natural images, progress on aerial images has been relatively slow, mainly because of the considerable scale variations of objects and the many densely distributed small objects. In this paper, considering that the semantic information of small objects may be weakened or even disappear in the deeper layers of a neural network, we propose a new detection framework called the Extended Feature Pyramid Network (EFPN) to strengthen the information extraction ability of the network. In the EFPN, we first design a multi-branched dilated bottleneck (MBDB) module in the lateral connections to capture richer semantic information. Then, we devise an attention pathway for better locating objects. Finally, an augmented bottom-up pathway is added so that shallow-layer information spreads more easily, further improving performance. Moreover, we present an adaptive scale training strategy to enable the network to better recognize multi-scale objects, together with a novel clustering method that produces adaptive anchors and helps the network learn data features. Experiments on public aerial datasets indicate that the presented method obtains state-of-the-art performance.
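
As a rough illustration of the lateral-connection idea described in this abstract, the sketch below implements a multi-branched dilated bottleneck in PyTorch: a 1x1 reduction followed by parallel 3x3 convolutions with increasing dilation rates, concatenated and fused back. The branch count, dilation rates, and channel widths are assumptions rather than the published EFPN configuration.

```python
# Hedged sketch of a multi-branched dilated bottleneck (MBDB-like) block.
import torch
import torch.nn as nn

class MultiBranchDilatedBottleneck(nn.Module):
    def __init__(self, in_channels, out_channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        mid = out_channels // len(dilations)
        # 1x1 bottleneck reduces channels before the dilated branches
        self.reduce = nn.Conv2d(in_channels, mid, kernel_size=1)
        # one 3x3 dilated convolution per branch; padding keeps the spatial size
        self.branches = nn.ModuleList(
            nn.Conv2d(mid, mid, kernel_size=3, padding=d, dilation=d) for d in dilations
        )
        self.fuse = nn.Conv2d(mid * len(dilations), out_channels, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.reduce(x))
        feats = [self.relu(branch(x)) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

# e.g. as a lateral connection from a hypothetical 1024-channel backbone stage
lateral = MultiBranchDilatedBottleneck(in_channels=1024, out_channels=256)
```

Parallel dilations give each lateral connection a mixture of receptive field sizes, which matches the abstract's stated goal of capturing more semantic context for small objects.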


2021
Vol 11 (22)
pp. 10508
Author(s):
Chaowei Tang
Xinxin Feng
Haotian Wen
Xu Zhou
Yanqing Shao
...

Surface defect detection of automobile wheel hubs is important to the automobile industry because these defects directly affect the safety and appearance of automobiles. At present, surface defect detection networks based on convolutional neural networks use many pooling layers when extracting features, which reduces the spatial resolution of the features and prevents accurate detection of defect boundaries. On the basis of DeepLab v3+, we propose a semantic segmentation network for the surface defect detection of automobile wheel hubs. To mitigate the gridding effect of atrous convolution, the high-resolution network (HRNet) is used as the backbone to extract high-resolution features, which are superimposed with the multi-scale features extracted by the Atrous Spatial Pyramid Pooling (ASPP) module of DeepLab v3+. On the basis of optical flow, we decouple the body and edge features of the defects to accurately detect defect boundaries. Furthermore, in the upsampling process, a decoder obtains accurate detection results by fusing the body, edge, and multi-scale features, and supervised training is used to optimize these features. Experimental results on four defect datasets (i.e., wheels, magnetic tiles, fabrics, and welds) show that the proposed network achieves better F1 score, average precision, and intersection over union than SegNet, U-Net, and DeepLab v3+, demonstrating that it is effective in different defect detection scenarios.
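
The body/edge decoupling mentioned in this abstract can be sketched as a flow-based feature warping step. The PyTorch fragment below is a hedged approximation, not the paper's implementation: a small head predicts a per-pixel 2D offset, the feature map is warped along that offset to obtain a smooth "body" component, and the residual is taken as the "edge" component. The head design and the flow normalization are assumptions.

```python
# Rough sketch of flow-based body/edge feature decoupling (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BodyEdgeDecoupling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # predicts a per-pixel 2D offset (flow) from the input features
        self.flow_head = nn.Conv2d(channels, 2, kernel_size=3, padding=1)

    def forward(self, feat):
        n, _, h, w = feat.shape
        flow = self.flow_head(feat)  # (N, 2, H, W), channel 0 = x offset, 1 = y offset
        # base sampling grid in normalized [-1, 1] coordinates
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=feat.device),
            torch.linspace(-1, 1, w, device=feat.device),
            indexing="ij",
        )
        grid = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)
        # scale the predicted flow to grid units and warp the features along it
        norm = torch.tensor([2.0 / max(w - 1, 1), 2.0 / max(h - 1, 1)], device=feat.device)
        warped_grid = grid + flow.permute(0, 2, 3, 1) * norm
        body = F.grid_sample(feat, warped_grid, align_corners=True)
        edge = feat - body  # high-frequency residual along defect boundaries
        return body, edge
```

Supervising the body and edge components separately, as the abstract describes, would then amount to attaching a loss to each of the two returned tensors.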


Author(s):  
Jiajia Liao
Yingchao Piao
Guorong Cai
Yundong Wu
Jinhe Su

Sensors
2018
Vol 18 (10)
pp. 3341
Author(s):
Hilal Tayara
Kil Chong

Object detection in very high-resolution (VHR) aerial images is an essential step for a wide range of applications such as military applications, urban planning, and environmental management. Still, it is a challenging task due to the different scales and appearances of the objects. On the other hand, object detection in VHR aerial images has improved remarkably in recent years thanks to advances in convolutional neural networks (CNNs). Most of the proposed methods depend on a two-stage approach, namely a region proposal stage followed by a classification stage, as in Faster R-CNN. Even though two-stage approaches outperform traditional methods, they are not easy to optimize and are not suitable for real-time applications. In this paper, a uniform one-stage model for object detection in VHR aerial images is proposed. To tackle the challenge of different scales, a densely connected feature pyramid network is introduced, which prepares high-level multi-scale semantic feature maps with high-quality information for object detection. The work has been evaluated on two publicly available datasets and outperformed the current state-of-the-art results on both in terms of mean average precision (mAP) and computation time.
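
The densely connected feature pyramid idea can be sketched as follows: each pyramid level fuses its lateral features with the upsampled features of all deeper levels rather than only the adjacent one, so fine levels keep more semantic context. The PyTorch fragment below is a minimal, assumed reading of that idea, not the published network; channel widths, interpolation mode, and fusion convolutions are guesses.

```python
# Hedged sketch of a densely connected feature pyramid (assumed structure).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenselyConnectedFPN(nn.Module):
    def __init__(self, in_channels_list, out_channels=256):
        super().__init__()
        self.laterals = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels_list
        )
        # one fusion conv per level; level i fuses its lateral plus all deeper levels
        self.fuse = nn.ModuleList(
            nn.Conv2d(out_channels * (len(in_channels_list) - i), out_channels,
                      kernel_size=3, padding=1)
            for i in range(len(in_channels_list))
        )

    def forward(self, feats):  # feats ordered fine -> coarse, e.g. [C2, C3, C4, C5]
        laterals = [l(f) for l, f in zip(self.laterals, feats)]
        outs = []
        for i, lat in enumerate(laterals):
            size = lat.shape[-2:]
            deeper = [F.interpolate(laterals[j], size=size, mode="nearest")
                      for j in range(i + 1, len(laterals))]
            outs.append(self.fuse[i](torch.cat([lat] + deeper, dim=1)))
        return outs  # pyramid maps, one per input level
```

A one-stage detection head would then attach classification and box-regression convolutions to each returned pyramid map.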


2020
Vol 12 (21)
pp. 3630
Author(s):  
Jin Liu ◽  
Haokun Zheng

Object detection and recognition in aerial and remote sensing images have become a hot topic in the field of computer vision in recent years. Because these images are usually taken from a bird's-eye view, the targets often have varied shapes and are densely arranged, so marking targets with oriented bounding boxes is the mainstream choice. However, general detectors are designed around horizontal box annotations, while improved methods that detect oriented bounding boxes directly have high computational complexity. In this paper, we propose a method called the ellipse field network (EFN) to organically integrate semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. We also tested the performance of EFN on natural images and obtained an mAP of 84.7 on the VOC2012 data set. These extensive experiments demonstrate that EFN achieves state-of-the-art results on aerial image benchmarks and obtains a good score on natural images as well.
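
The post-processing step that turns a predicted probability field into oriented bounding boxes can be sketched with standard tools. The fragment below is an assumption about how such a step could look, not the EFN implementation: it thresholds the probability map and fits a minimum-area rotated rectangle to each connected region with OpenCV; the threshold value and the minimum-area filter are arbitrary choices.

```python
# Hedged sketch: probability field -> oriented bounding boxes via OpenCV.
import cv2
import numpy as np

def prob_field_to_obbs(prob_map, threshold=0.5, min_area=20):
    """Convert an (H, W) probability map in [0, 1] to oriented boxes.

    Returns a list of ((cx, cy), (w, h), angle_degrees) tuples.
    """
    mask = (prob_map >= threshold).astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area:
            continue  # drop spurious small blobs
        boxes.append(cv2.minAreaRect(contour))
    return boxes

# example: a single filled elliptical blob yields one rotated box
prob = np.zeros((128, 128), dtype=np.float32)
cv2.ellipse(prob, (64, 64), (40, 15), 30, 0, 360, 1.0, -1)
print(prob_field_to_obbs(prob))
```

Any learned field that concentrates probability mass on the target would feed into the same kind of rectangle-fitting step.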


2017
Vol 9 (5)
pp. 500
Author(s):
Mi Zhang
Xiangyun Hu
Like Zhao
Ye Lv
Min Luo
...
