Small Object Detection Algorithm Based on Feature Pyramid-Enhanced Fusion SSD

To improve the detection rate of the traditional single-shot multibox detector (SSD) on small objects, a feature-enhanced fusion SSD object detection algorithm based on the feature pyramid network is proposed. First, the selected multiscale feature layers are merged with scale-invariant convolutional layers through the feature pyramid network structure; at the same time, the channel number of each multiscale feature map is converted separately using a scale-invariant convolution kernel. Then, the two resulting sets of pyramid-shaped feature layers are fused to generate a set of enhanced multiscale feature maps, and scale-invariant convolution is applied to these layers again. Finally, the resulting layers are used for detection and localization, and the final location coordinates and confidences are output after non-maximum suppression. Experimental results on the Pascal VOC 2007 and 2012 datasets confirm an 8.2% improvement in mAP compared to the original SSD and some existing algorithms.
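
The non-maximum suppression step mentioned at the end of the abstract can be illustrated with a minimal sketch. This is the generic greedy form, not code from the paper; the function names, the (x1, y1, x2, y2) box layout, and the 0.45 overlap threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an illustrative assumption.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that was already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so only the first and third boxes survive. In a multi-class detector this procedure is typically run once per class.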

FASSD: A Feature Fusion and Spatial Attention-Based Single Shot Detector for Small Object Detection

Electronics ◽

10.3390/electronics9091536 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1536

Author(s):

Deng Jiang ◽

Bei Sun ◽

Shaojing Su ◽

Zhen Zuo ◽

Peng Wu ◽

...

Keyword(s):

Object Detection ◽

Spatial Attention ◽

Feature Fusion ◽

Single Shot ◽

Small Object ◽

Feature Maps ◽

Feature Representations ◽

Small Object Detection ◽

High Level ◽

Detection Speed

Deep learning methods have significantly improved object detection performance, but small object detection remains a challenging task in computer vision. We propose a feature fusion and spatial attention-based single shot detector (FASSD) for small object detection. We fuse high-level semantic information into shallow layers to generate discriminative feature representations for small objects. To adaptively enhance the expression of small object areas and suppress the feature response of background regions, the spatial attention block learns a self-attention mask that enhances the original feature maps. We also establish a small object dataset (LAKE-BOAT) of a scene with a boat on a lake and test our algorithm on it to evaluate its performance. The results show that FASSD achieves 79.3% mAP (mean average precision) on the PASCAL VOC2007 test set with 300 × 300 input, which outperforms the original single shot multibox detector (SSD) by 1.6 points, as well as most improved algorithms based on SSD. The corresponding detection speed is 45.3 FPS (frames per second) on the VOC2007 test set using a single NVIDIA TITAN RTX GPU. Test results for a simplified FASSD on the LAKE-BOAT dataset indicate that our model achieves an improvement of 3.5% mAP over the baseline network while maintaining a real-time detection speed (64.4 FPS).
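
The idea of a spatial attention mask can be sketched as follows. The abstract does not say how the mask is computed or applied, so this is only one plausible reading: a per-location score is squashed to (0, 1) with a sigmoid and multiplied into every channel. The function names and the `logits` input, which stands in for the output of a learned attention branch, are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(feature, logits):
    """Reweight a C x H x W feature map with an H x W mask in (0, 1).

    Assumption: the mask is applied multiplicatively and shared across
    channels; the paper's actual block may differ.
    """
    mask = [[sigmoid(v) for v in row] for row in logits]
    enhanced = [
        [[channel[y][x] * mask[y][x] for x in range(len(mask[0]))]
         for y in range(len(mask))]
        for channel in feature
    ]
    return enhanced, mask


# Two channels of a 2 x 2 feature map, all ones (made-up values).
feature = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
# High score at (0, 0) marks a likely object; low score at (0, 1) marks background.
logits = [[4.0, -4.0], [0.0, 0.0]]
enhanced, mask = spatial_attention(feature, logits)
```

Locations with a high score keep almost all of their response, while locations with a low score are pushed toward zero, which is the enhance-and-suppress behaviour the abstract describes.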

Feature pyramid of bi-directional stepped concatenation for small object detection

Multimedia Tools and Applications ◽

10.1007/s11042-021-10718-1 ◽

2021 ◽

Author(s):

Qiyuan Zheng ◽

Ying Chen

Keyword(s):

Object Detection ◽

Small Object ◽

Feature Pyramid ◽

Small Object Detection

Infrared Dim-small Object Detection Algorithm Based on Saliency Map Combined with Target Motion Feature

2020 IEEE International Conference on Progress in Informatics and Computing (PIC) ◽

10.1109/pic50277.2020.9350820 ◽

2020 ◽

Author(s):

WenWen Zhang ◽

ZhiChao Lian

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Saliency Map ◽

Target Motion ◽

Small Object ◽

Motion Feature ◽

Small Object Detection

GC-YOLOv3: You Only Look Once with Global Context Block

Electronics ◽

10.3390/electronics9081235 ◽

2020 ◽

Vol 9 (8) ◽

pp. 1235

Author(s):

Yang Yang ◽

Hongmin Deng

Keyword(s):

Object Detection ◽

Irrelevant Information ◽

Detection Algorithm ◽

Visual Object ◽

Detection Accuracy ◽

Feature Maps ◽

Average Precision ◽

Global Context ◽

Pascal Voc ◽

Feature Pyramid

To make the classification and regression of single-stage detectors more accurate, this paper proposes an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3), based on You-Only-Look-Once (YOLO). First, a better cascading model with learnable semantic fusion between the feature extraction network and the feature pyramid network is designed, using a global context block to improve detection accuracy. Second, the information to be retained is screened by combining three feature maps of different scales. Finally, a global self-attention mechanism is used to highlight the useful information in the feature maps while suppressing irrelevant information. Experiments show that GC-YOLOv3 reaches a maximum of 55.5 mean Average Precision (mAP)@0.5 on Common Objects in Context (COCO) 2017 test-dev, and that its mAP is 5.1% higher than that of the YOLOv3 algorithm on the Pascal Visual Object Classes (PASCAL VOC) 2007 test set. The experiments therefore indicate that the proposed GC-YOLOv3 model exhibits optimal performance on the PASCAL VOC and COCO datasets.
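
For readers unfamiliar with the mAP@0.5 metric quoted above, the following is a minimal sketch of how an average precision value is obtained from ranked detections. It uses all-point interpolation; the official benchmark tools use their own sampling schemes, so this is illustrative rather than a reproduction of either evaluation protocol. The function names and the toy inputs are assumptions.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    `is_true_positive` lists detections sorted by descending confidence;
    each flag says whether that detection matched a not-yet-matched
    ground-truth box with IoU >= 0.5. Assumes num_ground_truth > 0.
    """
    tp = fp = 0
    recalls, precisions = [], []
    for flag in is_true_positive:
        if flag:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Interpolate: replace each precision with the maximum precision
    # found at that rank or any later rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy example: three detections for a class with two ground-truth boxes.
ap = average_precision([True, False, True], num_ground_truth=2)
m_ap = mean_average_precision([ap, 1.0])
```

Here the "@0.5" part lives entirely in how the true-positive flags are produced: a detection counts as correct only if its IoU with a ground-truth box is at least 0.5.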

Water surface object detection using panoramic vision based on improved single-shot multibox detector

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00831-6 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Aofeng Li ◽

Xufang Zhu ◽

Shuo He ◽

Jiawei Xia

Keyword(s):

Object Detection ◽

Water Surface ◽

Detection Algorithm ◽

Single Shot ◽

Panoramic Vision ◽

Feature Pyramid ◽

The Mean ◽

Detection Effect ◽

Surface Object ◽

Improved Algorithm

In view of the deficiencies of traditional visual water surface object detection, such as the existence of non-detection zones and the failure to acquire global information, and the deficiencies of the single-shot multibox detector (SSD) object detection algorithm, such as poor remote detection and low detection precision for small objects, this study proposes a water surface object detection algorithm for panoramic vision based on an improved SSD. We reconstruct the backbone network of the SSD algorithm, replace VGG16 with a ResNet-50 network, and add five layers of feature extraction. Richer semantic information for the shallow feature maps is obtained through a feature pyramid network structure with deconvolution. An experiment is conducted on a purpose-built water surface object dataset. Results show that the mean Average Precision (mAP) of the improved algorithm is increased by 4.03% compared with the existing SSD detection algorithm. The improved algorithm can effectively improve the overall detection precision for water surface objects and enhance the detection of remote objects.

An Evaluation of Deep Learning Methods for Small Object Detection

Journal of Electrical and Computer Engineering ◽

10.1155/2020/3189691 ◽

2020 ◽

Vol 2020 ◽

pp. 1-18 ◽

Cited By ~ 2

Author(s):

Nhat-Duy Nguyen ◽

Tien Do ◽

Thanh Duc Ngo ◽

Duy-Dinh Le

Keyword(s):

Deep Learning ◽

Object Detection ◽

State Of The Art ◽

Rapid Development ◽

Empirical Evaluation ◽

Grid Cell ◽

Small Object ◽

Feature Maps ◽

Comparative Results ◽

Small Object Detection

Small object detection is an interesting topic in computer vision. With the rapid development of deep learning, it has drawn the attention of several researchers, who have joined the race with innovative approaches. The proposed innovations comprise region proposals, divided grid cells, multiscale feature maps, and new loss functions. As a result, object detection performance has recently improved significantly. However, most state-of-the-art detectors, in both one-stage and two-stage approaches, struggle to detect small objects. In this study, we evaluate current state-of-the-art deep learning models from both approaches, such as Fast RCNN, Faster RCNN, RetinaNet, and YOLOv3. We provide a profound assessment of the advantages and limitations of the models. Specifically, we run the models with different backbones on different datasets with multiscale objects to find out which types of objects suit each model and backbone. Extensive empirical evaluation was conducted on 2 standard datasets, namely, a small object dataset and a filtered dataset from PASCAL VOC 2007. Finally, comparative results and analyses are presented.

Fast Small Object Detection Algorithm Based on Feature Enhancement and Reconstruction

10.1109/wcsp52459.2021.9613660 ◽

2021 ◽

Author(s):

Zhiyong Huo ◽

Tianwen Yan ◽

Weiye Cao

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Small Object ◽

Feature Enhancement ◽

Small Object Detection

Small object detection combining attention mechanism and a novel FPN

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211905 ◽

2021 ◽

pp. 1-13

Author(s):

Junying Chen ◽

Shipeng Liu ◽

Liang Zhao ◽

Dengfeng Chen ◽

Weihua Zhang

Keyword(s):

Object Detection ◽

Small Object ◽

Extraction Ability ◽

Feature Pyramid ◽

Face Datasets ◽

New Feature ◽

Small Object Detection ◽

Low Sensitivity ◽

Objects Detection ◽

Transmission Ability

Because small objects occupy few pixels in an image and are difficult to recognize, small object detection has always been a difficult research problem in computer vision. To address the low sensitivity and poor detection performance of YOLOv3 on small objects, this paper proposes AFYOLO, which is more sensitive to small objects. First, a DenseNet module is introduced into the low-level layers of the backbone to enhance the transmission of object information. At the same time, a new mechanism combining channel attention and spatial attention is introduced to improve the feature extraction ability of the backbone. Second, a new feature pyramid network (FPN) is proposed to better capture the features of small objects. Finally, ablation studies on the ImageNet classification task and the MS-COCO object detection task verify the effectiveness of the proposed attention module and FPN. Results on the Wider Face datasets show that the AP of the proposed method is 11.89% higher than that of YOLOv3 and 8.59% higher than that of YOLOv4. All results show that AFYOLO has a better ability to detect small objects.

Small Object Detection Using Deep Feature Pyramid Networks

Advances in Multimedia Information Processing – PCM 2018 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-00764-5_51 ◽

2018 ◽

pp. 554-564 ◽

Cited By ~ 4

Author(s):

Zhenwen Liang ◽

Jie Shao ◽

Dongyang Zhang ◽

Lianli Gao

Keyword(s):

Object Detection ◽

Small Object ◽

Deep Feature ◽

Feature Pyramid ◽

Small Object Detection

Vehicle and Vessel Detection on Satellite Imagery: A Comparative Study on Single-Shot Detectors

Remote Sensing ◽

10.3390/rs12071217 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1217 ◽

Cited By ~ 1

Author(s):

Tanguy Ophoff ◽

Steven Puttemans ◽

Vasileios Kalogirou ◽

Jean-Philippe Robin ◽

Toon Goedemé

Keyword(s):

Object Detection ◽

Satellite Imagery ◽

Satellite Data ◽

Single Shot ◽

Small Object ◽

Average Precision ◽

Vessel Detection ◽

Speed Up ◽

Small Object Detection ◽

The Many

In this paper, we investigate the feasibility of automatic small object detection, such as vehicles and vessels, in satellite imagery with a spatial resolution between 0.3 and 0.5 m. The main challenges of this task are the small size of the objects and the spread in object sizes, with objects ranging from 5 to a few hundred pixels in length. We first annotated 1500 km², making sure to have equal amounts of land and water data. On this dataset we trained and evaluated four different single-shot object detection networks: YOLOV2, YOLOV3, D-YOLO and YOLT, adjusting the many hyperparameters to achieve maximal accuracy. We performed various experiments to better understand the performance of the models and the differences between them. The best performing model, D-YOLO, reached an average precision of 60% for vehicles and 66% for vessels and can process an image of around 1 Gpx in 14 s. We conclude that these models, if properly tuned, can indeed be used to help speed up the workflows of satellite data analysts and to create even bigger datasets, making it possible to train even better models in the future.
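
The figures in this abstract can be put in perspective with some back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted above (1 Gpx in 14 s, 0.3 to 0.5 m per pixel, 5-pixel objects) and treats "around 1 Gpx" as exactly 1e9 pixels, which is an assumption; the function and variable names are illustrative.

```python
PIXELS = 1e9     # "around 1 Gpx", taken as exactly 1e9 for the estimate
SECONDS = 14.0   # processing time quoted in the abstract


def covered_area_km2(pixels, metres_per_pixel):
    """Ground area covered by `pixels` square pixels at a given resolution."""
    return pixels * metres_per_pixel ** 2 / 1e6


def object_length_m(length_px, metres_per_pixel):
    """Physical length of an object spanning `length_px` pixels."""
    return length_px * metres_per_pixel


throughput_mpx_s = PIXELS / SECONDS / 1e6      # about 71 Mpx/s
area_fine = covered_area_km2(PIXELS, 0.3)      # about 90 km^2
area_coarse = covered_area_km2(PIXELS, 0.5)    # about 250 km^2
smallest_fine = object_length_m(5, 0.3)        # about 1.5 m
smallest_coarse = object_length_m(5, 0.5)      # about 2.5 m
```

Under these assumptions, one 1 Gpx image corresponds to roughly 90 to 250 km² of ground, and the smallest objects mentioned (5 pixels long) are about 1.5 to 2.5 m in length.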
