Improved YOLO Network for Free-Angle Remote Sensing Target Detection

2021 ◽  
Vol 13 (11) ◽  
pp. 2171
Author(s):  
Yuhao Qing ◽  
Wenyi Liu ◽  
Liuyan Feng ◽  
Wanjia Gao

Despite significant progress in object detection tasks, remote sensing image target detection is still challenging owing to complex backgrounds, large differences in target sizes, and uneven distribution of rotating objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, which performs the initial feature extraction from the input image and takes both training accuracy and inference speed into account. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone network. The FPN and PANet modules integrate feature maps of different layers, combine context information on multiple scales, accumulate multiple features, and strengthen feature information extraction. Finally, to maximize the detection accuracy of objects of all sizes, we use four target detection scales at the network output to enhance feature extraction from small remote sensing target pixels. To handle objects at arbitrary angles, we improved the classification loss function using the circular smooth label technique, which turns angle regression into a classification problem and increases the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show that the proposed method performs better than previous methods.
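The circular smooth label idea mentioned in the abstract can be illustrated with a minimal sketch. The bin count (180 one-degree classes), the Gaussian window, and its width `sigma` are assumptions chosen for illustration, not values taken from the paper.

```python
import math

def circular_smooth_label(angle_deg, num_bins=180, sigma=6.0):
    """Encode an angle as a soft label over num_bins angle classes.

    A Gaussian window (an assumed choice) is centred on the true bin and
    wraps around the ends, so bin 0 and bin num_bins - 1 are neighbours.
    """
    center = int(round(angle_deg)) % num_bins
    label = []
    for k in range(num_bins):
        d = abs(k - center)
        d = min(d, num_bins - d)  # circular distance between bins
        label.append(math.exp(-(d ** 2) / (2 * sigma ** 2)))
    return label

# An object at 179 degrees: bins 178 and 0 are both one step away,
# so they receive the same soft target.
label = circular_smooth_label(179)
```

Because the window wraps around, a prediction of 0 degrees for a 179-degree object is penalized only lightly, which is the boundary behavior that plain angle regression handles poorly.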

2021 ◽  
Vol 24 (68) ◽  
pp. 21-32
Author(s):  
Yaming Cao ◽  
Zhen Yang ◽  
Chen Gao

Convolutional neural networks (CNNs) have shown strong learning capabilities in computer vision tasks such as classification and detection. In particular, with the introduction of detection models such as YOLO (V1, V2 and V3) and Faster R-CNN, CNNs have greatly improved detection efficiency and accuracy. However, because of the special viewing angle, small target size, scarce features, and complicated backgrounds, CNNs that perform well on ground-perspective datasets fail to reach good detection accuracy on remote sensing image datasets. To this end, based on the YOLO V3 model, we used feature maps of different depths as detection outputs to explore why deep neural networks detect small targets in remote sensing images poorly. We also analyzed the effect of network depth on small target detection and found that excessively deep semantic information has little effect on small target detection. Finally, verification on the VEDAI dataset shows that fusing shallow feature maps, which carry precise location information, with deep feature maps, which carry rich semantics, can effectively improve the accuracy of small target detection in remote sensing images.
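One common way to fuse a shallow and a deep feature map is to upsample the deep map and concatenate it with the shallow one along the channel axis. The sketch below shows that pattern on plain nested lists; the function names and the 2x nearest-neighbour upsampling are illustrative assumptions, not details from the abstract.

```python
def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def fuse(shallow, deep):
    """Concatenate shallow channels (2H x 2W) with upsampled deep channels (H x W)."""
    up = [upsample2x(ch) for ch in deep]
    assert len(up[0]) == len(shallow[0]), "heights must match after upsampling"
    assert len(up[0][0]) == len(shallow[0][0]), "widths must match after upsampling"
    return shallow + up  # channel-wise concatenation

shallow = [[[1, 2], [3, 4]]]  # 1 channel, 2 x 2, fine spatial detail
deep = [[[5]]]                # 1 channel, 1 x 1, coarse semantics
fused = fuse(shallow, deep)
```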


2021 ◽  
Vol 13 (11) ◽  
pp. 2207
Author(s):  
Fengcheng Ji ◽  
Dongping Ming ◽  
Beichen Zeng ◽  
Jiawei Yu ◽  
Yuanzhao Qing ◽  
...  

Aircraft serve as both transportation and weaponry, so detecting them in remote sensing images is crucial in civil and military fields. However, detecting aircraft effectively is still a problem because of the diversity of aircraft pose, size, and position and the variety of objects in the image. At present, target detection methods based on convolutional neural networks (CNNs) neither extract remote sensing image information sufficiently nor post-process the detection results, which leads to high missed detection and false alarm rates for complex and dense targets. To address these problems, we propose a target detection model based on Faster R-CNN that combines a multi-angle feature-driven strategy with a majority voting strategy. Specifically, we designed a multi-angle transformation module that transforms the input image to realize multi-angle feature extraction of the targets in the image. In addition, we added a majority voting mechanism at the end of the model to process the results of the multi-angle feature extraction. The average precision (AP) of this method reaches 94.82% and 95.25% on the public and private datasets, respectively, which are 6.81% and 8.98% higher than that of Faster R-CNN. The experimental results show that the method can detect aircraft effectively, obtaining better performance than mature target detection networks.
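The abstract does not say how the votes are counted, so the following is only one plausible reading of a majority vote over multi-angle detections. It assumes each view's boxes have already been mapped back to the original image frame as `(x1, y1, x2, y2)` tuples, and it keeps a box only when more than half of the views contain an overlapping box. The IoU threshold and all names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def majority_vote(views, iou_thr=0.5):
    """Keep boxes that are supported by more than half of the views."""
    kept = []
    for boxes in views:
        for box in boxes:
            if any(iou(box, k) >= iou_thr for k in kept):
                continue  # already represented by a kept box
            votes = sum(
                1 for other in views
                if any(iou(box, o) >= iou_thr for o in other)
            )
            if votes * 2 > len(views):
                kept.append(box)
    return kept

# Two of three views agree on one aircraft; the third reports a stray box.
views = [[(0, 0, 10, 10)], [(1, 1, 10, 10)], [(50, 50, 60, 60)]]
result = majority_vote(views)
```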


2020 ◽  
Vol 12 (20) ◽  
pp. 3316
Author(s):  
Yulian Zhang ◽  
Lihong Guo ◽  
Zengfa Wang ◽  
Yang Yu ◽  
Xinwei Liu ◽  
...  

Intelligent detection and recognition of ships from high-resolution remote sensing images is an extraordinarily useful task in civil and military reconnaissance. It is difficult to detect ships with high precision because various disturbances are present at sea, such as clouds, mist, islands, coastlines, and ripples. To solve this problem, we propose a novel ship detection network based on multi-layer convolutional feature fusion (CFF-SDN). Our ship detection network consists of three parts. First, a convolutional feature extraction network extracts ship features at different levels. Residual connections are introduced so that the model can be made very deep while remaining easy to train and converge. Second, the proposed network fuses fine-grained features from shallow layers with semantic features from deep layers, which benefits the detection of ship targets of different sizes and improves the localization and detection accuracy of small objects. Finally, multiple fused feature maps are used for classification and regression, which adapts to ships of multiple scales. Because the CFF-SDN model uses a pruning strategy, the detection speed is greatly improved. For the experiments, we create a dataset for ship detection in remote sensing images (DSDR), including actual satellite images from Google Earth and aerial images from an electro-optical pod. The DSDR dataset contains not only visible light images but also infrared images. To improve robustness to various sea scenes, images at different scales, perspectives, and illumination are obtained through data augmentation or affine transformation. To reduce the influence of atmospheric absorption and scattering, a dark channel prior is adopted for atmospheric correction of the sea scenes. Moreover, soft non-maximum suppression (NMS) is introduced to increase the recall rate for densely arranged ships. In addition, better detection performance is observed in comparison with existing models in terms of precision and recall. The experimental results show that the proposed detection model achieves superior ship detection performance in optical remote sensing images.
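Soft NMS lowers the scores of overlapping boxes instead of deleting them, which is why it helps with densely arranged ships. Below is a minimal sketch of the Gaussian-decay variant; the abstract does not state which variant or parameters the authors used, so `sigma` and `score_thr` are assumed values.

```python
import math

def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Gaussian soft NMS: decay scores of overlapping boxes, drop only tiny ones."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thr:
            break  # every remaining score is even lower
        keep.append((boxes[best], scores[best]))
        for i in remaining:
            overlap = box_iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)  # Gaussian penalty
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this example the second box overlaps the first heavily, so its score is decayed and it drops below the distant third box, but it is still reported rather than suppressed outright.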


2019 ◽  
Vol 11 (19) ◽  
pp. 2276
Author(s):  
Jae-Hun Lee ◽  
Sanghoon Sull

The estimation of ground sampling distance (GSD) from a remote sensing image enables measurement of the size of an object as well as more accurate segmentation in the image. In this paper, we propose a regression tree convolutional neural network (CNN) for estimating the value of GSD from an input image. The proposed regression tree CNN consists of a feature extraction CNN and a binomial tree layer. The proposed network first extracts features from an input image. Based on the extracted features, it predicts the GSD value, which is represented as a floating-point number by its exponent and mantissa. The exponent and mantissa are computed by coarse-scale classification and finer-scale regression, respectively, resulting in improved results. Experimental results with a Google Earth aerial image dataset and a mixed dataset consisting of eight public remote sensing image datasets with different GSDs show that the proposed network reduces the GSD prediction error rate by 25% compared to a baseline network that directly estimates the GSD.
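To make the exponent and mantissa representation concrete, the sketch below splits a GSD value into an integer exponent, which could serve as a class label, and a bounded mantissa, which could serve as a regression target. It uses base 2 through Python's `math.frexp`; the base and the exact encoding used in the paper are not given in the abstract, so treat this as an assumed illustration.

```python
import math

def encode_gsd(gsd):
    """Split gsd into (exponent, mantissa) with gsd == mantissa * 2**exponent.

    The mantissa lies in [0.5, 1), so the regression target has a fixed range
    regardless of how large or small the GSD is.
    """
    mantissa, exponent = math.frexp(gsd)
    return exponent, mantissa

def decode_gsd(exponent, mantissa):
    """Recombine a predicted exponent class and mantissa into a GSD value."""
    return math.ldexp(mantissa, exponent)

exponent, mantissa = encode_gsd(0.3)  # e.g. a GSD of 0.3 m per pixel
restored = decode_gsd(exponent, mantissa)
```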


Author(s):  
Haoze Sun ◽  
Tianqing Chang ◽  
Lei Zhang ◽  
Guozhen Yang ◽  
Bin Han ◽  
...  

Armored equipment plays a crucial role in the ground battlefield. Fast and accurate detection of enemy armored targets is essential for taking the initiative in the battlefield. Compared with general object detection and vehicle detection, armored target detection in a battlefield environment is more challenging because of the long observation distance and the complicated environment. In this paper, an accurate and robust automatic detection method is proposed to detect armored targets in a battlefield environment. First, inspired by the Feature Pyramid Network (FPN), we propose a top-down aggregation (TDA) network that enhances shallow feature maps by aggregating semantic information from deeper layers. Then, using the proposed TDA network in a basic Faster R-CNN framework, we explore further optimization of the approach for armored target detection: for the Region of Interest (RoI) Proposal Network (RPN), we propose a multi-branch RPN framework that generates proposals matching the scale of armored targets and the specific receptive field of each aggregated layer, and we design a hierarchical loss for the multi-branch RPNs; for the RoI Classifier Network (RCN), we apply RoI pooling on the single finest-scale feature map and construct a light and fast detection network. To evaluate our method, comparative experiments with state-of-the-art detection methods were conducted on a challenging dataset of images with armored targets. The experimental results demonstrate the effectiveness of the proposed method in terms of detection accuracy and recall rate.


2021 ◽  
Author(s):  
Lakpa Dorje Tamang

In this paper, we propose a symmetric series convolutional neural network (SS-CNN), which is a novel deep convolutional neural network (DCNN)-based super-resolution (SR) technique for ultrasound medical imaging. The proposed model comprises two parts: a feature extraction network (FEN) and an up-sampling layer. In the FEN, the low-resolution (LR) counterpart of the ultrasound image passes through a symmetric series of two different DCNNs. The low-level feature maps obtained from the subsequent layers of both DCNNs are concatenated in a feed-forward manner, aiding robust feature extraction to ensure high reconstruction quality. Subsequently, the final concatenated features serve as the input map to the later 2D convolutional layers, where the textural information of the input image is connected via skip connections. The second part of the proposed model is a sub-pixel convolutional (SPC) layer, which up-samples the output of the FEN by multiplying it with a multi-dimensional kernel followed by a periodic shuffling operation to reconstruct a high-quality SR ultrasound image. We validate the performance of the SS-CNN with publicly available ultrasound image datasets. Experimental results show that the proposed model achieves better ultrasound image reconstruction than conventional methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), while providing competitive SR reconstruction time.
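The periodic shuffling step of a sub-pixel convolutional layer rearranges a tensor with C·r² channels of size H × W into C channels of size rH × rW. The sketch below shows that rearrangement on nested lists, using the channel ordering that common deep learning libraries follow; whether SS-CNN uses the same ordering is an assumption.

```python
def pixel_shuffle(x, r):
    """Rearrange x of shape (C*r*r, H, W) into (C, H*r, W*r).

    Input channel c*r*r + i*r + j supplies the output pixel at
    row offset i and column offset j inside each r x r block.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out

# Four 1 x 1 feature channels become one 2 x 2 up-sampled channel.
low_res = [[[1]], [[2]], [[3]], [[4]]]
high_res = pixel_shuffle(low_res, 2)
```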


2020 ◽  
Vol 12 (24) ◽  
pp. 4046
Author(s):  
Xirong Li ◽  
Fangling Pu ◽  
Rui Yang ◽  
Rong Gui ◽  
Xin Xu

In recent years, deep neural network (DNN) based scene classification methods have achieved promising performance. However, the data-driven training strategy requires a large number of labeled samples, so DNN-based methods cannot solve the scene classification problem when only a small number of labeled images are available. As the number and variety of scene images continue to grow, the cost and difficulty of manual annotation also increase. It is therefore important to address the scene classification problem with only a few labeled samples. In this paper, we propose an attention metric network (AMN) in the framework of few-shot learning (FSL) to improve the performance of one-shot scene classification. AMN is composed of a self-attention embedding network (SAEN) and a cross-attention metric network (CAMN). In SAEN, we adopt the spatial attention and the channel attention of feature maps to obtain abundant features of scene images. In CAMN, we propose a novel cross-attention mechanism that highlights the features most relevant to the different categories and improves the similarity measurement performance. A loss function combining mean square error (MSE) loss with multi-class N-pair loss is developed, which helps to promote the intra-class similarity and inter-class variance of the embedding features and also improves the similarity measurement results. Experiments on the NWPU-RESISC45 dataset and the RSD-WHU46 dataset demonstrate that our method achieves state-of-the-art results on one-shot remote sensing image scene classification tasks.


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Yi Lv ◽  
Zhengbo Yin ◽  
Zhezhou Yu

To improve the accuracy of remote sensing image target detection, this paper proposes DFS, a deep-learning-based remote sensing image target detection algorithm. First, a dimension clustering module, a loss function, and sliding window segmentation detection are designed. The dataset used in the experiment comes from Google Earth and contains 6 types of objects: airplanes, boats, warehouses, large ships, bridges, and ports. The training set, validation set, and test set contain 73490 images, 22722 images, and 2138 images, respectively. Let A and B denote the numbers of detected positive and negative samples, respectively, and C and D the numbers of undetected positive and negative samples, respectively. The precision-recall curves of DFS for the six types of targets show that DFS has the best detection effect for bridges and the worst for boats. The main reason is that bridges are relatively large and clearly distinguished from the background in the image, so they are easy to detect, whereas boats are very small and easily blend into the background, so they are difficult to detect. Compared with YOLOv2, the mAP of DFS is improved by 12.82%, the detection accuracy is improved by 13%, and the recall rate is slightly decreased by 1%. According to the number of detection targets, the number of false positives (FPs) of DFS is much lower than that of YOLOv2, and the false positive rate is greatly reduced. In addition, the average IoU of DFS is 11.84% higher than that of YOLOv2. For small target detection efficiency and large remote sensing image detection, the DFS algorithm has obvious advantages.
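With the counts A, B, and C defined in the abstract, precision and recall would normally be computed as A / (A + B) and A / (A + C). The abstract does not print the formulas, so the sketch below assumes the standard reading in which A are true positives, B are false positives, and C are false negatives.

```python
def precision_recall(a, b, c):
    """Precision and recall from detection counts.

    a: detected positives (true positives)
    b: detected negatives (false positives)
    c: undetected positives (false negatives)
    """
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    return precision, recall

# 8 correct detections, 2 false alarms, 2 missed targets.
p, r = precision_recall(8, 2, 2)
```

The count D (undetected negatives) does not enter either formula, which is usual for detection tasks where true negatives are not well defined.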


2020 ◽  
Vol 12 (3) ◽  
pp. 389
Author(s):  
Yangyang Li ◽  
Qin Huang ◽  
Xuan Pei ◽  
Licheng Jiao ◽  
Ronghua Shang

Object detection has made significant progress in many real-world scenes. Despite this remarkable progress, the common use case of detection in remote sensing images remains challenging even for leading object detectors because of complex backgrounds, objects with arbitrary orientation, and large differences in object scale. In this paper, we propose RADet, a novel rotation detector for remote sensing images that is mainly inspired by Mask R-CNN. RADet obtains the rotated bounding box of an object from the shape mask predicted by the mask branch, which is a novel, simple, and effective way to get rotated bounding boxes. Specifically, a refined feature pyramid network is devised with an improved building block constructing top-down feature maps to solve the problem of large differences in scale. Meanwhile, the position attention network and the channel attention network are jointly explored by modeling the spatial position dependence between global pixels and highlighting the object features, for detecting small objects surrounded by complex backgrounds. Extensive experiments on two public remote sensing datasets, DOTA and NWPU VHR-10, show that our method outperforms existing leading object detectors in the remote sensing field.


Electronics ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 1151
Author(s):  
Xia Hua ◽  
Xinqing Wang ◽  
Ting Rui ◽  
Dong Wang ◽  
Faming Shao

Aiming at the real-time detection of multiple objects and micro-objects in large-scene remote sensing images, a cascaded convolutional neural network real-time object-detection framework for remote sensing images is proposed, which integrates visual perception and convolutional memory network reasoning. The detection framework is composed of two fully convolutional networks, namely, the strengthened object self-attention pre-screening fully convolutional network (SOSA-FCN) and the object accurate detection fully convolutional network (AD-FCN). SOSA-FCN introduces a self-attention module to extract attention feature maps and constructs a depth feature pyramid to optimize the attention feature maps by combining convolutional long-term and short-term memory networks. It guides the acquisition of potential sub-regions of the object in the scene, reduces the computational complexity, and enhances the network’s ability to extract multi-scale object features. It adapts to the complex background and small object characteristics of a large-scene remote sensing image. In AD-FCN, the object mask and object orientation estimation layer are designed to achieve fine positioning of candidate frames. The performance of the proposed algorithm is compared with that of other advanced methods on NWPU_VHR-10, DOTA, UCAS-AOD, and other open datasets. The experimental results show that the proposed algorithm significantly improves the efficiency of object detection while ensuring detection accuracy and has high adaptability. It has extensive engineering application prospects.

