Improving UAV Object Detection through Image Augmentation

Author(s): Karen Gishyan

Ground-image-based object detection algorithms have improved greatly over the years and deliver good results on challenging image datasets such as COCO and PASCAL VOC. These models, however, are less successful at unmanned aerial vehicle (UAV)-based object detection, where performance deterioration is commonly observed. This is largely because detecting and classifying small objects is much harder than detecting medium- or large-size objects, and drone imagery is prone to variation caused by different flying altitudes, weather conditions, camera angles and camera quality. This work explores the performance of two state-of-the-art object detection algorithms on the drone object detection task and proposes image augmentation procedures to improve model performance. We compose three image augmentation sequences, propose two new image augmentation techniques, and further explore how their different combinations affect the performance of the models. The augmenters are evaluated for two deep learning models, with training on high-resolution images (1056×1056 pixels) to observe their overall effectiveness. We provide a comparison of augmentation techniques across each model, identify two augmentation procedures that increase object detection accuracy more effectively than the others, and obtain our best model using a transfer learning approach, where the weights for the transfer come from training the model with our proposed augmentation technique. At the end of the experiments, we achieve robust model performance and accuracy, and identify aspects for improvement as part of our future work.
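
The paper's specific augmentation sequences are not reproduced in this abstract. As a hedged illustration only, a minimal NumPy sketch of one plausible sequence (flip, brightness jitter, additive noise; every augmenter and parameter here is an assumption, not the paper's method) at the stated 1056×1056 input size might look like:

```python
import numpy as np

def augment(image, rng):
    """Apply a simple augmentation sequence: random horizontal flip,
    brightness jitter, and additive Gaussian noise. These are common
    stand-ins for illustration, not the paper's proposed augmenters."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]                # horizontal flip
    image = image.astype(np.float32)
    image *= rng.uniform(0.8, 1.2)               # brightness jitter
    image += rng.normal(0.0, 5.0, image.shape)   # sensor-like noise
    return np.clip(image, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
img = np.zeros((1056, 1056, 3), dtype=np.uint8)  # paper's input resolution
out = augment(img, rng)
print(out.shape, out.dtype)
```

A real pipeline would also transform the bounding boxes alongside the pixels, which is omitted here.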

Agriculture, 2021, Vol. 11 (10), pp. 1003
Author(s): Shenglian Lu, Zhen Song, Wenkang Chen, Tingting Qian, Yingyu Zhang, ...

The leaf is the organ that is crucial for photosynthesis and the production of nutrients in plants; as such, the number of leaves is one of the key indicators with which to describe the development and growth of a canopy. The irregular shape and distribution of leaf blades, as well as the effect of natural light, make leaf segmentation and detection difficult, and inaccurate acquisition of plant phenotypic parameters may affect subsequent judgments of crop growth status and crop yield. To address the challenge of counting dense and overlapping plant leaves in natural environments, we propose an improved deep-learning-based object detection algorithm that merges a space-to-depth module, a Convolutional Block Attention Module (CBAM) and Atrous Spatial Pyramid Pooling (ASPP) into the network, and applies the smoothL1 function to improve the loss function of object prediction. We evaluated our method on images of five plant species collected in indoor and outdoor environments. The experimental results demonstrate that our algorithm improved the average detection accuracy for counting dense leaves from 85% to 96%. Our algorithm also showed better performance in both detection accuracy and time consumption than other state-of-the-art object detection algorithms.
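
The smoothL1 term mentioned above has a standard closed form; a minimal NumPy sketch (the beta value is an assumption, not taken from the paper):

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    # quadratic for small residuals, linear for large ones, which
    # damps the gradient of outlier boxes compared with plain L2
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

loss = smooth_l1(np.array([0.5, 3.0]), np.array([0.0, 0.0]))
print(loss)  # 0.125 for the small residual, 2.5 for the large one
```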


Sensors, 2018, Vol. 18 (10), pp. 3415
Author(s): Jinpeng Zhang, Jinming Zhang, Shan Yu

In the image object detection task, a huge number of candidate boxes are generated to match a relatively small number of ground-truth boxes, and the learning samples are created through this matching. In fact, however, the vast majority of candidate boxes contain no valid object instance and must be recognized and rejected during training and evaluation of the network. This leads to an extra-high computational burden and a serious imbalance between object and non-object samples, impeding the algorithm's performance. Here we propose a new heuristic sampling method to generate candidate boxes for two-stage detection algorithms. It is generally applicable to current two-stage detection algorithms and improves their detection performance. Experiments on the COCO dataset showed that, relative to the baseline model, the new method significantly increases detection accuracy and efficiency.
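
The paper's specific heuristic sampler is not described in this abstract. The general idea of rejecting most non-object candidates can be sketched as a generic IoU-thresholded sampler that keeps all positives and subsamples negatives; the threshold and ratio below are illustrative assumptions, not the paper's values:

```python
import numpy as np

def iou(a, b):
    # boxes as [x1, y1, x2, y2]
    xa, ya = max(a[0], b[0]), max(a[1], b[1])
    xb, yb = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area - inter + 1e-9)

def sample_candidates(cands, gts, pos_thr=0.5, neg_ratio=3, seed=0):
    # keep every positive box, subsample negatives to a fixed ratio
    rng = np.random.default_rng(seed)
    score = [max(iou(c, g) for g in gts) for c in cands]
    pos = [c for c, s in zip(cands, score) if s >= pos_thr]
    neg = [c for c, s in zip(cands, score) if s < pos_thr]
    keep = min(len(neg), neg_ratio * max(len(pos), 1))
    idx = rng.choice(len(neg), size=keep, replace=False)
    return pos, [neg[i] for i in idx]

gts = [[0, 0, 10, 10]]
cands = [[0, 0, 10, 10]] + [[20 + i, 20, 30 + i, 30] for i in range(5)]
pos, neg = sample_candidates(cands, gts)
print(len(pos), len(neg))  # 1 positive, 3 subsampled negatives
```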


2021
Author(s): Da-Ren Chen, Wei-Min Chiu

Machine learning techniques have been used to increase the detection accuracy of cracks in road surfaces. Most studies fail to consider variable illumination conditions on the target of interest (ToI) and only focus on detecting the presence or absence of road cracks. This paper proposes a new road crack detection method, IlumiCrack, which integrates Gaussian mixture models (GMM) and object detection CNN models. This work provides the following contributions: 1) For the first time, a large-scale road crack image dataset covering a range of illumination conditions (e.g., day and night) is prepared using a dashcam. 2) Based on GMM, experimental evaluations on 2 to 4 brightness levels are conducted for optimal classification. 3) The IlumiCrack framework integrates state-of-the-art object detection methods with CNNs to classify road crack images into eight types with high accuracy. Experimental results show that IlumiCrack outperforms state-of-the-art R-CNN object detection frameworks.
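
As a rough illustration of the GMM brightness-level step, a tiny 1-D EM fit on synthetic day/night brightness samples (not the paper's data or implementation; the initialisation and sample values are assumptions) could look like:

```python
import numpy as np

def fit_gmm_1d(x, k=2, iters=50):
    """Minimal EM for a 1-D Gaussian mixture: a stand-in for the
    GMM brightness-level classifier described in the paper."""
    mu = np.linspace(x.min(), x.max(), k)       # spread initial means
    var = np.full(k, x.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        p = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        n = r.sum(axis=0)
        pi, mu = n / len(x), (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-6
    return mu, var, pi

# synthetic "night" (~40) and "day" (~180) mean-brightness samples
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(40, 5, 500), rng.normal(180, 10, 500)])
mu, var, pi = fit_gmm_1d(x, k=2)
print(sorted(mu))  # means near the two brightness modes
```

A production system would likely use a library GMM; this sketch only shows the mechanics.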


2019, Vol. 11 (1), pp. 9
Author(s): Ying Zhang, Yimin Chen, Chen Huang, Mingke Gao

In recent years, almost all top-performing object detection networks have relied on CNN (convolutional neural network) features. In this work, we add feature fusion to the object detection network to obtain a better CNN feature that combines deep but semantic features with shallow but high-resolution ones, thus improving performance on small objects. We also apply an attention mechanism to our object detection network, AF R-CNN (attention mechanism and convolution feature fusion based object detection), to enhance the impact of significant features and weaken background interference. AF R-CNN is a single end-to-end network. We choose the pre-trained VGG-16 network to extract CNN features, and train our detection network on the PASCAL VOC 2007 and 2012 datasets. Empirical evaluation on the PASCAL VOC 2007 dataset demonstrates the effectiveness of our approach: AF R-CNN achieves an object detection accuracy of 75.9% on PASCAL VOC 2007, six points higher than Faster R-CNN.
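
The fusion of deep-but-semantic and shallow-but-high-resolution features, gated by attention, can be sketched in NumPy. This is a stand-in for the AF R-CNN layers, not the actual network; the sigmoid squeeze gate and nearest-neighbour upsampling are assumptions:

```python
import numpy as np

def upsample2x(x):
    # nearest-neighbour upsampling of a (C, H, W) feature map
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_with_attention(deep, shallow):
    """Fuse a deep (semantic) and a shallow (high-resolution) feature
    map, then reweight channels with a squeeze-style attention gate."""
    fused = np.concatenate([upsample2x(deep), shallow], axis=0)
    squeeze = fused.mean(axis=(1, 2))          # per-channel statistic
    attn = 1.0 / (1.0 + np.exp(-squeeze))      # sigmoid channel gate
    return fused * attn[:, None, None]

deep = np.ones((4, 8, 8))       # low resolution, semantically strong
shallow = np.ones((4, 16, 16))  # high resolution, semantically weak
out = fuse_with_attention(deep, shallow)
print(out.shape)  # (8, 16, 16): fused map at the shallow resolution
```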


2005, Vol. 36 (2), pp. 99-111
Author(s): G. Schumann, G. Lauener

A trained soft artificial neural network (SANN) model was applied to the Gornera catchment (Valais Alps, Switzerland) over the May–September 2001 melt season to predict hourly discharge up to five days ahead. A SANN discharge forecast for three days ahead had previously been performed on this catchment using only past discharge and past and forecast air temperature as model training inputs. In this study, present zonal snow depth was included as a model input; it was predicted for five altitudinal catchment zones using an empirical degree-day model. Hourly discharge values for up to five days ahead were reconstructed using SANN-predicted daily discharge parameters along with a normalised long-term moving-average model (MAHM). The efficiency criterion R2 gives a model performance of 0.927 for a 24-hour-ahead forecast and 0.824 for a 120-hour-ahead forecast. Compared with previous work, adding the snow model to the SANN inputs considerably increases forecast accuracy, particularly during days of progressive discharge increase and thunderstorms. The SANN model yields excellent results on days marked by stable weather conditions, with R2 values between 0.913 and 0.995. However, the model is unable to reliably predict low-frequency, high-magnitude events, e.g., the release of stored water from a glacial lake.
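
The two quantitative ingredients named above, an empirical degree-day melt model and the R2 (Nash–Sutcliffe) efficiency criterion, admit a minimal sketch; the degree-day factor below is illustrative, not the study's calibrated value:

```python
import numpy as np

def degree_day_melt(temp_c, ddf=4.5, t_crit=0.0):
    # melt (mm/day) proportional to positive degree-days above a
    # critical temperature; ddf is an assumed illustrative factor
    return ddf * np.maximum(temp_c - t_crit, 0.0)

def nash_sutcliffe(obs, sim):
    # the R2 efficiency criterion used to score the discharge forecasts
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

temps = np.array([-2.0, 1.0, 5.0])   # mean daily air temperature, degC
melt = degree_day_melt(temps)
print(melt)                           # 0, 4.5 and 22.5 mm/day
obs = np.array([1.0, 2.0, 3.0, 4.0])
print(nash_sutcliffe(obs, obs))       # 1.0 for a perfect forecast
```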


2020, Vol. 17 (2), pp. 172988142090257
Author(s): Dan Xiong, Huimin Lu, Qinghua Yu, Junhao Xiao, Wei Han, ...

High tracking frame rates have been achieved with traditional tracking methods, which however fail when the object template or model drifts, especially when the object disappears from the camera's field of view. To deal with this, combining tracking and detection has become increasingly popular for long-term tracking of unknown objects: the detector barely drifts and can regain a disappeared object when it comes back. However, online machine learning and multiscale object detection require expensive computing resources and time, so combining tracking and detection sequentially, as in the Tracking-Learning-Detection algorithm, is not a good idea. Inspired by parallel tracking and mapping, this article proposes a framework of parallel tracking and detection for unknown object tracking. The object tracking algorithm is split into two separate tasks, tracking and detection, which can be processed in two different threads. One thread handles tracking between consecutive frames at a high processing speed; the other runs online learning algorithms to construct a discriminative model for object detection. Using our proposed framework, high tracking frame rates can be combined effectively with the ability to correct and recover a failed tracker. Furthermore, the framework provides open interfaces for integrating state-of-the-art object tracking and detection algorithms. We evaluate several popular tracking and detection algorithms using the proposed framework. The experimental results show that different tracking and detection algorithms can be integrated and compared effectively by our framework, and that robust, fast long-term object tracking can be realized.
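
The two-thread split described above can be sketched with Python threads. The tracker and detector below are trivial stubs standing in for real algorithms; the box format and the shared-state layout are our assumptions:

```python
import threading, queue

# Frames flow through a fast tracking thread while a slower detection
# thread (stubbed here) refreshes the shared object state, mirroring
# the article's two-thread split.
frames = queue.Queue()
state = {"bbox": None}
lock = threading.Lock()

def tracking_thread():
    while True:
        frame = frames.get()
        if frame is None:          # poison pill: stop tracking
            break
        with lock:
            if state["bbox"]:
                x, y, w, h = state["bbox"]
                state["bbox"] = (x + 1, y, w, h)  # stub: cheap per-frame update

def detection_thread():
    with lock:                     # expensive (re-)detection, stubbed
        state["bbox"] = (10, 10, 20, 20)

det = threading.Thread(target=detection_thread)
det.start(); det.join()            # detector initialises the tracker
trk = threading.Thread(target=tracking_thread)
trk.start()
for i in range(5):
    frames.put("frame-%d" % i)
frames.put(None)
trk.join()
print(state["bbox"])               # box shifted by the 5 tracked frames
```

In the article's framework the detection thread runs continuously alongside tracking; here it runs once before tracking so the outcome is deterministic.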


2020, Vol. 34 (07), pp. 12993-13000
Author(s): Zhaohui Zheng, Ping Wang, Wei Liu, Jinze Li, Rongguang Ye, ...

Bounding box regression is a crucial step in object detection. In existing methods, while the ℓn-norm loss is widely adopted for bounding box regression, it is not tailored to the evaluation metric, i.e., Intersection over Union (IoU). Recently, IoU loss and generalized IoU (GIoU) loss have been proposed to benefit the IoU metric, but they still suffer from slow convergence and inaccurate regression. In this paper, we propose a Distance-IoU (DIoU) loss that incorporates the normalized distance between the predicted box and the target box, and converges much faster in training than the IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, leading to faster convergence and better performance. By incorporating the DIoU and CIoU losses into state-of-the-art object detection algorithms, e.g., YOLO v3, SSD and Faster R-CNN, we achieve notable gains in terms of not only the IoU metric but also the GIoU metric. Moreover, DIoU can easily be adopted as the criterion in non-maximum suppression (NMS), further boosting performance. The source code and trained models are available at https://github.com/Zzh-tju/DIoU.
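
The DIoU loss described above, 1 − IoU plus the normalized centre distance d²/c² (centre distance over the enclosing-box diagonal), admits a compact sketch; the box format and helper names are ours, not the repository's:

```python
def iou(b1, b2):
    # boxes as [x1, y1, x2, y2]
    xa, ya = max(b1[0], b2[0]), max(b1[1], b2[1])
    xb, yb = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter)

def diou_loss(b1, b2):
    # DIoU loss = 1 - IoU + d^2 / c^2, where d is the distance between
    # box centres and c the diagonal of the smallest enclosing box
    cx1, cy1 = (b1[0] + b1[2]) / 2, (b1[1] + b1[3]) / 2
    cx2, cy2 = (b2[0] + b2[2]) / 2, (b2[1] + b2[3]) / 2
    d2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
    ex1, ey1 = min(b1[0], b2[0]), min(b1[1], b2[1])
    ex2, ey2 = max(b1[2], b2[2]), max(b1[3], b2[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return 1.0 - iou(b1, b2) + d2 / c2

print(diou_loss([0, 0, 10, 10], [0, 0, 10, 10]))    # 0.0: identical boxes
print(diou_loss([0, 0, 10, 10], [20, 0, 30, 10]))   # 1.4: disjoint boxes
```

Unlike plain 1 − IoU, the distance term stays informative for non-overlapping boxes, which is what speeds up convergence.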


2019, Vol. 11 (3), pp. 339
Author(s): Chaoyue Chen, Weiguo Gong, Yongliang Chen, Weihong Li

Object detection has attracted increasing attention in the field of remote sensing image analysis. Complex backgrounds, vertical views, and variations in target kind and size make object detection in remote sensing images a challenging task. In this work, considering that the types of objects are often closely related to the scene in which they are located, we propose a convolutional neural network (CNN) that combines scene-contextual information for object detection. Specifically, we put forward the scene-contextual feature pyramid network (SCFPN), which aims to strengthen the relationship between the target and the scene and to solve problems resulting from variations in target size. Additionally, to improve the capability of feature extraction, the network is constructed by repeating an aggregated residual block. This block increases the receptive field, extracting richer information for targets and achieving excellent performance on small object detection. Moreover, to further improve model performance, we use group normalization, which divides the channels into groups and computes the mean and variance for normalization within each group, to overcome the limitations of batch normalization. The proposed method is validated on a public and challenging dataset. The experimental results demonstrate that our proposed method outperforms other state-of-the-art object detection models.
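
Group normalization as described, dividing channels into groups and normalizing with per-group statistics so the result does not depend on batch size, can be sketched in NumPy (a real layer would also add a learnable scale and shift, omitted here):

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """Group normalization over a (N, C, H, W) tensor: channels are
    split into groups and normalised with per-group mean/variance."""
    n, c, h, w = x.shape
    g = x.reshape(n, groups, c // groups, h, w)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mu) / np.sqrt(var + eps)).reshape(n, c, h, w)

x = np.random.default_rng(0).normal(size=(2, 8, 4, 4))
y = group_norm(x, groups=4)
print(y.shape)  # (2, 8, 4, 4), each group now zero-mean, unit-variance
```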


Sensors, 2021, Vol. 21 (5), pp. 1926
Author(s): Kai Yin, Juncheng Jia, Xing Gao, Tianrui Sun, Zhengyin Zhou

A series of sky surveys launched in search of supernovae have generated a tremendous amount of data, pushing astronomy into a new era of big data. However, manually identifying and reporting supernovae is a disastrous burden, because the data are huge in quantity and sparse in positives. While traditional machine learning methods can be used to deal with such data, deep learning methods such as convolutional neural networks demonstrate more powerful adaptability in this area. However, most data in existing works are either simulated or lack generality, so how state-of-the-art object detection algorithms perform on real supernova data is largely unknown, which greatly hinders the development of this field. Furthermore, existing works on supernova classification usually assume the input images are properly cropped with a single candidate located in the center, which is not true for our dataset, and the performance of existing detection algorithms can still be improved for the supernova detection task. To address these problems, we collected and organized all the known objects of the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) and the Popular Supernova Project (PSP) into two datasets, and then compared several detection algorithms on them. The selected Fully Convolutional One-Stage (FCOS) method is then used as the baseline and further improved with data augmentation, an attention mechanism, and a small object detection technique. Extensive experiments demonstrate the great performance enhancement of our detection algorithm on the new datasets.


Sensors, 2021, Vol. 21 (10), pp. 3374
Author(s): Hansen Liu, Kuangang Fan, Qinghua Ouyang, Na Li

To address the threat of drones intruding into high-security areas, real-time detection of drones is urgently required to protect these areas. Real-time drone detection poses two main difficulties: drones move quickly, which requires faster detectors, and small drones are hard to detect. In this paper, we first achieve high detection accuracy by evaluating four state-of-the-art object detection methods: RetinaNet, FCOS, YOLOv3 and YOLOv4. Then, to address the speed problem, we prune the convolutional channels and shortcut layers of YOLOv4 to develop thinner and shallower models. Furthermore, to improve the accuracy of small drone detection, we implement a special augmentation for small object detection by copying and pasting small drones. Experimental results verify that, compared with YOLOv4, our pruned-YOLOv4 model, with a 0.8 channel prune rate and 24 layers pruned, achieves 90.5% mAP while its processing speed increases by 60.4%. Additionally, after small object augmentation, the precision and recall of pruned-YOLOv4 increase by approximately 22.8% and 12.7%, respectively. These results verify that our pruned-YOLOv4 is an effective and accurate approach for drone detection.
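
The copy-and-paste small-object augmentation can be sketched as follows. Overlap checks and blending, which a careful implementation would need, are omitted, and all parameters (copy count, sizes) are illustrative assumptions:

```python
import numpy as np

def copy_paste_small_objects(image, boxes, copies=2, seed=0):
    """Duplicate each small-object crop at random locations and append
    the corresponding boxes, increasing the number of small targets."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    out_boxes = list(boxes)
    for (x1, y1, x2, y2) in boxes:
        crop = image[y1:y2, x1:x2].copy()      # copy before pasting
        bh, bw = crop.shape[:2]
        for _ in range(copies):
            nx = int(rng.integers(0, w - bw))
            ny = int(rng.integers(0, h - bh))
            image[ny:ny + bh, nx:nx + bw] = crop
            out_boxes.append((nx, ny, nx + bw, ny + bh))
    return image, out_boxes

img = np.zeros((256, 256, 3), dtype=np.uint8)
img[10:30, 10:30] = 255                        # a 20x20 "drone"
img2, boxes = copy_paste_small_objects(img, [(10, 10, 30, 30)])
print(len(boxes))  # 1 original box + 2 pasted copies = 3
```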

