A survey and performance evaluation of deep learning methods for small object detection

2021 ◽  
Vol 172 ◽  
pp. 114602 ◽  
Author(s):  
Yang Liu ◽  
Peng Sun ◽  
Nickolas Wergeles ◽  
Yi Shang
2020 ◽  
Author(s):  
◽  
Yang Liu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the rapid development of deep learning in computer vision, especially deep convolutional neural networks (CNNs), significant advances have been made in recent years on object recognition and detection in images. Highly accurate detection results have been achieved for large objects, whereas detection accuracy on small objects remains to be low. This dissertation focuses on investigating deep learning methods for small object detection in images and proposing new methods with improved performance. First, we conducted a comprehensive review of existing deep learning methods for small object detections, in which we summarized and categorized major techniques and models, identified major challenges, and listed some future research directions. Existing techniques were categorized into using contextual information, combining multiple feature maps, creating sufficient positive examples, and balancing foreground and background examples. Methods developed in four related areas, generic object detection, face detection, object detection in aerial imagery, and segmentation, were summarized and compared. In addition, the performances of several leading deep learning methods for small object detection, including YOLOv3, Faster R-CNN, and SSD, were evaluated based on three large benchmark image datasets of small objects. Experimental results showed that Faster R-CNN performed the best, while YOLOv3 was a close second. Furthermore, a new deep learning method, called Retina-context Net, was proposed and outperformed state-of-the art one-stage deep learning models, including SSD, YOLOv3 and RetinaNet, on the COCO and SUN benchmark datasets. Secondly, we created a new dataset for bird detection, called Little Birds in Aerial Imagery (LBAI), from real-life aerial imagery. LBAI contains birds with sizes ranging from 10 by 10 pixels to 40 by 40 pixels. We adapted and applied several state-of-the-art deep learning models to LBAI, including object detection models such as YOLOv2, SSH, and Tiny Face, and instance segmentation models such as U-Net and Mask R-CNN. Our empirical results illustrated the strength and weakness of these methods, showing that SSH performed the best for easy cases, whereas Tiny Face performed the best for hard cases with cluttered backgrounds. Among small instance segmentation methods, U-Net achieved slightly better performance than Mask R-CNN. Thirdly, we proposed a new graph neural network-based object detection algorithm, called GODM, to take the spatial information of candidate objects into consideration in small object detection. Instead of detecting small objects independently as the existing deep learning methods do, GODM treats the candidate bounding boxes generated by existing object detectors as nodes and creates edges based on the spatial or semantic relationship between the candidate bounding boxes. GODM contains four major components: node feature generation, graph generation, node class labelling, and graph convolutional neural network model. Several graph generation methods were proposed. Experimental results on the LBDA dataset show that GODM outperformed existing state-of-the-art object detector Faster R-CNN significantly, up to 12% better in accuracy. Finally, we proposed a new computer vision-based grass analysis using machine learning. To deal with the variation of lighting condition, a two-stage segmentation strategy is proposed for grass coverage computation based on a blackboard background. On a real world dataset we collected from natural environments, the proposed method was robust to varying environments, lighting, and colors. For grass detection and coverage computation, the error rate was just 3%.


2021 ◽  
Author(s):  
Zhang Zhenghua ◽  
Jiang Ling ◽  
Hong Qingqing

2019 ◽  
Vol 26 (6) ◽  
pp. 597-606 ◽  
Author(s):  
Lu Yan ◽  
Masahiro Yamaguchi ◽  
Naoki Noro ◽  
Yohei Takara ◽  
Fuminori Ando

2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Nhat-Duy Nguyen ◽  
Tien Do ◽  
Thanh Duc Ngo ◽  
Duy-Dinh Le

Small object detection is an interesting topic in computer vision. With the rapid development in deep learning, it has drawn attention of several researchers with innovations in approaches to join a race. These innovations proposed comprise region proposals, divided grid cell, multiscale feature maps, and new loss function. As a result, performance of object detection has recently had significant improvements. However, most of the state-of-the-art detectors, both in one-stage and two-stage approaches, have struggled with detecting small objects. In this study, we evaluate current state-of-the-art models based on deep learning in both approaches such as Fast RCNN, Faster RCNN, RetinaNet, and YOLOv3. We provide a profound assessment of the advantages and limitations of models. Specifically, we run models with different backbones on different datasets with multiscale objects to find out what types of objects are suitable for each model along with backbones. Extensive empirical evaluation was conducted on 2 standard datasets, namely, a small object dataset and a filtered dataset from PASCAL VOC 2007. Finally, comparative results and analyses are then presented.


Author(s):  
Seokyong Shin ◽  
Hyunho Han ◽  
Sang Hun Lee

YOLOv3 is a deep learning-based real-time object detector and is mainly used in applications such as video surveillance and autonomous vehicles. In this paper, we proposed an improved YOLOv3 (You Only Look Once version 3) applied Duplex FPN, which enhanced large object detection by utilizing low-level feature information. The conventional YOLOv3 improved the small object detection performance by applying FPN (Feature Pyramid Networks) structure to YOLOv2. However, YOLOv3 with an FPN structure specialized in detecting small objects, so it is difficult to detect large objects. Therefore, this paper proposed an improved YOLOv3 applied Duplex FPN, which can utilize low-level location information in high-level feature maps instead of the existing FPN structure of YOLOv3. This improved the detection accuracy of large objects. Also, an extra detection layer was added to the top-level feature map to prevent failure of detection of parts of large objects. Further, dimension clusters of each detection layer were reassigned to learn quickly how to accurately detect objects. The proposed method was compared and analyzed in the PASCAL VOC dataset. The experimental results showed that the bounding box accuracy of large objects improved owing to the Duplex FPN and extra detection layer, and the proposed method succeeded in detecting large objects that the existing YOLOv3 did not.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1926
Author(s):  
Kai Yin ◽  
Juncheng Jia ◽  
Xing Gao ◽  
Tianrui Sun ◽  
Zhengyin Zhou

A series of sky surveys were launched in search of supernovae and generated a tremendous amount of data, which pushed astronomy into a new era of big data. However, it can be a disastrous burden to manually identify and report supernovae, because such data have huge quantity and sparse positives. While the traditional machine learning methods can be used to deal with such data, deep learning methods such as Convolutional Neural Networks demonstrate more powerful adaptability in this area. However, most data in the existing works are either simulated or without generality. How do the state-of-the-art object detection algorithms work on real supernova data is largely unknown, which greatly hinders the development of this field. Furthermore, the existing works of supernovae classification usually assume the input images are properly cropped with a single candidate located in the center, which is not true for our dataset. Besides, the performance of existing detection algorithms can still be improved for the supernovae detection task. To address these problems, we collected and organized all the known objectives of the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) and the Popular Supernova Project (PSP), resulting in two datasets, and then compared several detection algorithms on them. After that, the selected Fully Convolutional One-Stage (FCOS) method is used as the baseline and further improved with data augmentation, attention mechanism, and small object detection technique. Extensive experiments demonstrate the great performance enhancement of our detection algorithm with the new datasets.


Author(s):  
Jun Ho Choi ◽  
Tae Young Han ◽  
Seung Hyun Lee ◽  
Byung Cheol Song

Sign in / Sign up

Export Citation Format

Share Document