Small Targets Detection for Transmission Tower Based on SRGAN and Faster RCNN

Author(s):  
Runze Liu ◽  
Guangwei Yan ◽  
Hui He ◽  
Yubin An ◽  
Ting Wang ◽  
...  

Background: Power line inspection is essential to ensure the safe and stable operation of the power system. Object detection for tower equipment can significantly improve inspection efficiency. However, because small targets have low resolution and limited features, their detection accuracy is difficult to improve. Objective: This study aimed to increase the resolution of small targets while making their texture and detail features prominent enough to be perceived by the detection model. Methods: In this paper, we propose an algorithm that employs generative adversarial networks to improve the detection accuracy of small objects. First, the original image is converted into a super-resolution image by a super-resolution reconstruction network (SRGAN). Then the object detection framework Faster RCNN is used to detect objects in the super-resolution images. Results: The experimental results on two small object recognition datasets show that the proposed model has good robustness. In particular, it can detect targets missed by Faster RCNN, which indicates that SRGAN can effectively enhance the detailed information of small targets by improving the resolution. Conclusion: We found that higher-resolution data is conducive to obtaining more detailed information about small targets, which can help the detection algorithm achieve higher accuracy. The small object detection model based on the generative adversarial network proposed in this paper is feasible and more efficient. Compared with Faster RCNN, this model performs better on small object detection.
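The two-stage flow described in the Methods (super-resolve first, then detect) can be illustrated with a minimal sketch. Everything named here is hypothetical: `upscale_nearest` is a nearest-neighbour placeholder standing in for SRGAN, `bright_region_detector` is a toy stand-in for Faster RCNN, and mapping the boxes back to original-image coordinates is an assumption, since the abstract does not say how detections are reported.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]                 # grayscale image as rows of pixel values
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def upscale_nearest(image: Image, factor: int) -> Image:
    """Placeholder for SRGAN: nearest-neighbour upscaling by an integer factor."""
    out: Image = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def bright_region_detector(image: Image, threshold: float = 0.5) -> List[Box]:
    """Toy stand-in for Faster RCNN: one box around all pixels above threshold."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, px in enumerate(row):
            if px > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return []
    # Max coordinates are exclusive pixel edges, hence the +1.
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def detect_on_upscaled(
    image: Image,
    detector: Callable[[Image], List[Box]],
    factor: int = 4,
    upscaler: Callable[[Image, int], Image] = upscale_nearest,
) -> List[Box]:
    """Upscale the image, run the detector, and rescale boxes to the original frame."""
    big = upscaler(image, factor)
    boxes = detector(big)
    return [tuple(c / factor for c in box) for box in boxes]


# A 4x4 image with a single bright pixel at row 1, column 2.
tiny = [[0.0] * 4 for _ in range(4)]
tiny[1][2] = 1.0
print(detect_on_upscaled(tiny, bright_region_detector, factor=4))
```

In a real system the two placeholders would be replaced by a trained super-resolution generator and a trained detector; only the order of the two stages is taken from the abstract.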

2020 ◽  
Vol 12 (19) ◽  
pp. 3152
Author(s):  
Luc Courtrai ◽  
Minh-Tan Pham ◽  
Sébastien Lefèvre

This article tackles the problem of detecting small objects in satellite or aerial remote sensing images by relying on super-resolution to increase image spatial resolution, and thus the size and detail of the objects to be detected. We show how to improve the super-resolution framework starting from the learning of a generative adversarial network (GAN) based on residual blocks and then its integration into a cycle model. Furthermore, by adding to the framework an auxiliary network tailored for object detection, we considerably improve the learning and the quality of our final super-resolution architecture, and more importantly increase the object detection performance. Besides the improvement dedicated to the network architecture, we also focus on the training of super-resolution on target objects, leading to an object-focused approach. Furthermore, the proposed strategies do not depend on the choice of a baseline super-resolution framework, and hence could be adopted for current and future state-of-the-art models. Our experimental study on small vehicle detection in remote sensing data conducted on both aerial and satellite images (i.e., ISPRS Potsdam and xView datasets) confirms the effectiveness of the improved super-resolution methods to assist with the small object detection tasks.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Guo X. Hu ◽  
Zhong Yang ◽  
Lei Hu ◽  
Li Huang ◽  
Jia M. Han

Existing object detection algorithms based on deep convolutional neural networks need to carry out multilevel convolution and pooling operations on the entire image in order to extract deep semantic features of the image. These detection models achieve better results for large objects. However, they fail to detect small objects that have low resolution and are greatly influenced by noise, because the features obtained after repeated convolution operations do not fully represent the essential characteristics of small objects. In this paper, we achieve good detection accuracy by extracting features at different convolution levels and using the multiscale features to detect small objects. In our detection model, we extract the features of the image from the third, fourth, and fifth convolutions, respectively, and then concatenate these three scales of features into a one-dimensional vector. The vector is used to classify objects with classifiers and to locate objects by bounding box regression. In testing, the detection accuracy of our model for small objects is 11% higher than that of state-of-the-art models. In addition, we also used the model to detect aircraft in remote sensing images and achieved good results.
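The concatenation of features from three convolution levels into a one-dimensional vector can be sketched as follows. The abstract does not state how maps of different spatial sizes are brought to a common length, so the max pooling to a fixed grid used here is an assumption, and the function names `max_pool_to_grid` and `concat_multiscale` are hypothetical.

```python
from typing import List

Channel = List[List[float]]      # one feature channel, H x W
FeatureMap = List[Channel]       # C channels


def max_pool_to_grid(channel: Channel, grid: int) -> List[float]:
    """Max-pool one channel down to grid x grid cells and return them flattened."""
    h, w = len(channel), len(channel[0])
    pooled = []
    for gy in range(grid):
        y0 = gy * h // grid
        y1 = max((gy + 1) * h // grid, y0 + 1)   # each cell covers at least one row
        for gx in range(grid):
            x0 = gx * w // grid
            x1 = max((gx + 1) * w // grid, x0 + 1)
            pooled.append(
                max(channel[y][x] for y in range(y0, y1) for x in range(x0, x1))
            )
    return pooled


def concat_multiscale(feature_maps: List[FeatureMap], grid: int = 2) -> List[float]:
    """Pool every channel of every level to the same grid, then concatenate."""
    vector: List[float] = []
    for fmap in feature_maps:
        for channel in fmap:
            vector.extend(max_pool_to_grid(channel, grid))
    return vector


# Toy stand-ins for the third, fourth, and fifth convolution outputs.
conv3 = [[[float(x + y) for x in range(8)] for y in range(8)]]        # 1 x 8 x 8
conv4 = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]             # 2 x 4 x 4
conv5 = [[[2.0] * 2 for _ in range(2)] for _ in range(3)]             # 3 x 2 x 2
print(len(concat_multiscale([conv3, conv4, conv5], grid=2)))
```

The resulting vector has a fixed length of (sum of channel counts) times `grid * grid`, regardless of the spatial size of each level, which is what allows it to feed a classifier and a bounding box regressor.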


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5194
Author(s):  
Hongfeng Wang ◽  
Jianzhong Wang ◽  
Kemeng Bai ◽  
Yong Sun

Despite the breakthroughs in accuracy and efficiency of object detection using deep neural networks, the performance of small object detection is far from satisfactory. Gaze estimation has developed significantly due to the development of visual sensors. Combining object detection with gaze estimation can significantly improve the performance of small object detection. This paper presents a centered multi-task generative adversarial network (CMTGAN), which combines small object detection and gaze estimation. To achieve this, we propose a generative adversarial network (GAN) capable of image super-resolution and two-stage small object detection. We exploit a generator in CMTGAN for image super-resolution and a discriminator for object detection. We introduce an artificial texture loss into the generator to retain the original feature of small objects. We also use a centered mask in the generator to make the network focus on the central part of images where small objects are more likely to appear in our method. We propose a discriminator with detection loss for two-stage small object detection, which can be adapted to other GANs for object detection. Compared with existing interpolation methods, the super-resolution images generated by CMTGAN are more explicit and contain more information. Experiments show that our method exhibits a better detection performance than mainstream methods.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3031
Author(s):  
Jing Lian ◽  
Yuhang Yin ◽  
Linhui Li ◽  
Zhenghao Wang ◽  
Yafu Zhou

There are many small objects in traffic scenes, but due to their low resolution and limited information, their detection is still a challenge. Small object detection is very important for the understanding of traffic scene environments. To improve the detection accuracy of small objects in traffic scenes, we propose a small object detection method in traffic scenes based on attention feature fusion. First, a multi-scale channel attention block (MS-CAB) is designed, which uses local and global scales to aggregate the effective information of the feature maps. Based on this block, an attention feature fusion block (AFFB) is proposed, which can better integrate contextual information from different layers. Finally, the AFFB is used to replace the linear fusion module in the object detection network and obtain the final network structure. The experimental results show that, compared to the benchmark model YOLOv5s, this method has achieved a higher mean Average Precision (mAP) under the premise of ensuring real-time performance. It increases the mAP of all objects by 0.9 percentage points on the validation set of the traffic scene dataset BDD100K, and at the same time, increases the mAP of small objects by 3.5%.
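The idea of replacing a linear fusion (plain addition or concatenation) with an attention-weighted one can be illustrated by a heavily simplified sketch. This is not the paper's MS-CAB or AFFB: the learned convolutions are omitted, the local context is taken to be the summed activation at each position, and the global context is the per-channel mean, all of which are assumptions made purely for illustration. The name `attention_fuse` is hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]   # C x H x W


def sigmoid(v: float) -> float:
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def attention_fuse(x: FeatureMap, y: FeatureMap) -> FeatureMap:
    """Fuse two same-shaped maps as w * x + (1 - w) * y with a per-element weight w.

    w combines a local term (the summed activation at that position) and a
    global term (the channel-wide mean of the summed activations).
    """
    fused: FeatureMap = []
    for cx, cy in zip(x, y):
        summed = [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(cx, cy)]
        count = len(summed) * len(summed[0])
        global_ctx = sum(sum(row) for row in summed) / count
        out_channel = []
        for rs, rx, ry in zip(summed, cx, cy):
            out_row = []
            for s, a, b in zip(rs, rx, ry):
                w = sigmoid(s + global_ctx)        # local + global context
                out_row.append(w * a + (1.0 - w) * b)
            out_channel.append(out_row)
        fused.append(out_channel)
    return fused


shallow = [[[1.0, 2.0], [0.0, -1.0]]]
deep = [[[3.0, 0.0], [0.5, 1.0]]]
print(attention_fuse(shallow, deep))
```

Because the weight lies strictly between 0 and 1, every fused value is a convex combination of the two inputs, in contrast to a fixed sum, which weights both layers equally everywhere.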


2021 ◽  
Vol 13 (6) ◽  
pp. 1198
Author(s):  
Bi-Yuan Liu ◽  
Huai-Xin Chen ◽  
Zhou Huang ◽  
Xing Liu ◽  
Yun-Zhi Yang

Drone-based object detection has been widely applied in ground object surveillance, urban patrol, and some other fields. However, the dramatic scale changes and complex backgrounds of drone images usually result in weak feature representation of small objects, which makes it challenging to achieve high-precision object detection. Aiming to improve small object detection, this paper proposes a novel cross-scale knowledge distillation (CSKD) method, which enhances the features of small objects in a manner similar to image enlargement and is therefore termed ZoomInNet. First, based on an efficient feature pyramid network structure, the teacher and student networks are trained with images at different scales to introduce the cross-scale feature. Then, the proposed layer adaption (LA) and feature level alignment (FA) mechanisms are applied to align the feature size of the two models. After that, the adaptive key distillation point (AKDP) algorithm is used to get the crucial positions in feature maps that need knowledge distillation. Finally, the position-aware L2 loss is used to measure the difference between feature maps from cross-scale models, realizing the cross-scale information compression in a single model. Experiments on the challenging Visdrone2018 dataset show that the proposed method draws on the advantages of image pyramid methods while avoiding their heavy computation, and significantly improves the detection accuracy of small objects. In addition, the comparison with mainstream methods shows that our method has the best performance in small object detection.
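A position-aware L2 loss of the kind described can be sketched as a squared difference between student and teacher feature maps that is only counted at selected positions. The exact form used in the paper is not given in the abstract; the binary mask (standing in for the output of the AKDP step), the normalisation by the number of selected elements, and the name `position_aware_l2` are all assumptions.

```python
from typing import List

FeatureMap = List[List[List[float]]]   # C x H x W
Mask = List[List[int]]                 # H x W, 1 marks a key distillation position


def position_aware_l2(student: FeatureMap, teacher: FeatureMap, mask: Mask) -> float:
    """Mean squared difference between two feature maps over masked positions only."""
    total = 0.0
    for ch_s, ch_t in zip(student, teacher):
        for row_s, row_t, row_m in zip(ch_s, ch_t, mask):
            for s, t, m in zip(row_s, row_t, row_m):
                total += m * (s - t) ** 2
    # Each selected position contributes once per channel.
    selected = sum(sum(row) for row in mask) * len(student)
    return total / selected if selected else 0.0


student = [[[1.0, 2.0], [3.0, 4.0]]]
teacher = [[[0.0, 2.0], [3.0, 0.0]]]
key_points = [[1, 0], [0, 0]]          # hypothetical key positions
print(position_aware_l2(student, teacher, key_points))
```

With an all-ones mask this reduces to an ordinary mean squared error over the whole map; restricting it to key positions concentrates the distillation signal where the teacher's features are considered most informative.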


Author(s):  
Seokyong Shin ◽  
Hyunho Han ◽  
Sang Hun Lee

YOLOv3 is a deep learning-based real-time object detector and is mainly used in applications such as video surveillance and autonomous vehicles. In this paper, we proposed an improved YOLOv3 (You Only Look Once version 3) with a Duplex FPN, which enhanced large object detection by utilizing low-level feature information. The conventional YOLOv3 improved small object detection performance by applying the FPN (Feature Pyramid Network) structure to YOLOv2. However, YOLOv3 with an FPN structure is specialized for detecting small objects, so it has difficulty detecting large objects. Therefore, this paper proposed an improved YOLOv3 with a Duplex FPN, which can utilize low-level location information in high-level feature maps, in place of the existing FPN structure of YOLOv3. This improved the detection accuracy of large objects. In addition, an extra detection layer was added to the top-level feature map to prevent detection failures on parts of large objects. Further, the dimension clusters of each detection layer were reassigned so that the network quickly learns to detect objects accurately. The proposed method was compared and analyzed on the PASCAL VOC dataset. The experimental results showed that the bounding box accuracy of large objects improved owing to the Duplex FPN and extra detection layer, and the proposed method succeeded in detecting large objects that the existing YOLOv3 did not.
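Dimension clusters in the YOLO family are commonly obtained by running k-means over the widths and heights of the training boxes, with 1 - IoU as the distance. The sketch below shows that general procedure only; how the paper reassigns clusters to its detection layers is not described in the abstract. The deterministic initialisation and the name `kmeans_anchors` are assumptions chosen to keep the example reproducible.

```python
from typing import List, Tuple

WH = Tuple[float, float]   # (width, height) of a box


def iou_wh(a: WH, b: WH) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes: List[WH], k: int, iterations: int = 50) -> List[WH]:
    """Cluster box sizes with k-means, assigning each box to its highest-IoU centroid."""
    # Deterministic initialisation: k evenly spaced boxes from the input list.
    centroids: List[WH] = [boxes[i * len(boxes) // k] for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[WH]] = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        updated: List[WH] = []
        for members, old in zip(clusters, centroids):
            if members:
                updated.append((
                    sum(b[0] for b in members) / len(members),
                    sum(b[1] for b in members) / len(members),
                ))
            else:
                updated.append(old)      # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break
        centroids = updated
    # Sort by area so the anchors can be split across detection layers by size.
    return sorted(centroids, key=lambda c: c[0] * c[1])


sizes = [(10, 10), (12, 12), (11, 11), (100, 100), (110, 90), (90, 110)]
print(kmeans_anchors(sizes, k=2))
```

Sorting the resulting anchors by area makes it straightforward to hand the smallest ones to the finest detection layer and the largest ones to the coarsest, which is the usual convention in YOLOv3.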


2021 ◽  
Vol 13 (23) ◽  
pp. 4779
Author(s):  
Xiangkai Xu ◽  
Zhejun Feng ◽  
Changqing Cao ◽  
Mengyuan Li ◽  
Jin Wu ◽  
...  

Remote sensing image object detection and instance segmentation are widely valued research fields. A convolutional neural network (CNN) has shown defects in the object detection of remote sensing images. In recent years, the number of studies on transformer-based models increased, and these studies achieved good results. However, transformers still suffer from poor small object detection and unsatisfactory edge detail segmentation. In order to solve these problems, we improved the Swin transformer based on the advantages of transformers and CNNs, and designed a local perception Swin transformer (LPSW) backbone to enhance the local perception of the network and to improve the detection accuracy of small-scale objects. We also designed a spatial attention interleaved execution cascade (SAIEC) network framework, which helped to strengthen the segmentation accuracy of the network. Due to the lack of remote sensing mask datasets, the MRS-1800 remote sensing mask dataset was created. Finally, we combined the proposed backbone with the new network framework and conducted experiments on this MRS-1800 dataset. Compared with the Swin transformer, the proposed model improved the mask AP by 1.7%, mask APS by 3.6%, AP by 1.1% and APS by 4.6%, demonstrating its effectiveness and feasibility.

