scholarly journals Generative and self-supervised domain adaptation for one-stage object detection

Array ◽  
2021 ◽  
pp. 100071
Author(s):  
Kazuma Fujii ◽  
Kazuhiko Kawamoto
2020 ◽  
Vol 12 (15) ◽  
pp. 2501 ◽  
Author(s):  
Minh-Tan Pham ◽  
Luc Courtrai ◽  
Chloé Friguet ◽  
Sébastien Lefèvre ◽  
Alexandre Baussard

Object detection from aerial and satellite remote sensing images has been an active research topic over the past decade. Thanks to the increase in computational resources and data availability, deep learning-based object detection methods have achieved numerous successes in computer vision, and more recently in remote sensing. However, the ability of current detectors to deal with (very) small objects still remains limited. In particular, the fast detection of small objects from a large observed scene is still an open question. In this work, we address this challenge and introduce an enhanced one-stage deep learning-based detection model, called You Only Look Once (YOLO)-fine, which is based on the structure of YOLOv3. Our detector is designed to be capable of detecting small objects with high accuracy and high speed, allowing further real-time applications within operational contexts. We also investigate its robustness to the appearance of new backgrounds in the validation set, thus tackling the issue of domain adaptation that is critical in remote sensing. Experimental studies that were conducted on both aerial and satellite benchmark datasets show some significant improvement of YOLO-fine as compared to other state-of-the art object detectors.


2020 ◽  
Vol 13 (1) ◽  
pp. 23
Author(s):  
Wei Zhao ◽  
William Yamada ◽  
Tianxin Li ◽  
Matthew Digman ◽  
Troy Runge

In recent years, precision agriculture has been researched to increase crop production with less inputs, as a promising means to meet the growing demand of agriculture products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms rely on a significant amount of manually prelabeled training datasets as ground truths. Field object detection, such as bales, is especially difficult because of (1) long-period image acquisitions under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and research as references. This work increases the bale detection accuracy based on limited data collection and labeling, by building an innovative algorithms pipeline. First, an object detection model is trained using 243 images captured with good illimitation conditions in fall from the crop lands. In addition, domain adaptation (DA), a kind of transfer learning, is applied for synthesizing the training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows the proposed method improves the bale detecting performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (the object detection) to averages of 0.93, 0.94, and 0.89 (the object detection + DA), respectively. This approach could be easily scaled to many other crop field objects and will significantly contribute to precision agriculture.


Author(s):  
Na Dong ◽  
Yongqiang Zhang ◽  
Mingli Ding ◽  
Shibiao Xu ◽  
Yancheng Bai

Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3341 ◽  
Author(s):  
Hilal Tayara ◽  
Kil Chong

Object detection in very high-resolution (VHR) aerial images is an essential step for a wide range of applications such as military applications, urban planning, and environmental management. Still, it is a challenging task due to the different scales and appearances of the objects. On the other hand, object detection task in VHR aerial images has improved remarkably in recent years due to the achieved advances in convolution neural networks (CNN). Most of the proposed methods depend on a two-stage approach, namely: a region proposal stage and a classification stage such as Faster R-CNN. Even though two-stage approaches outperform the traditional methods, their optimization is not easy and they are not suitable for real-time applications. In this paper, a uniform one-stage model for object detection in VHR aerial images has been proposed. In order to tackle the challenge of different scales, a densely connected feature pyramid network has been proposed by which high-level multi-scale semantic feature maps with high-quality information are prepared for object detection. This work has been evaluated on two publicly available datasets and outperformed the current state-of-the-art results on both in terms of mean average precision (mAP) and computation time.


2019 ◽  
Vol 28 (04) ◽  
pp. 1
Author(s):  
Jiyuan Jia ◽  
Li Zhou ◽  
Jie Chen

Sensors ◽  
2019 ◽  
Vol 19 (6) ◽  
pp. 1434 ◽  
Author(s):  
Minle Li ◽  
Yihua Hu ◽  
Nanxiang Zhao ◽  
Qishu Qian

Three-dimensional (3D) object detection has important applications in robotics, automatic loading, automatic driving and other scenarios. With the improvement of devices, people can collect multi-sensor/multimodal data from a variety of sensors such as Lidar and cameras. In order to make full use of various information advantages and improve the performance of object detection, we proposed a Complex-Retina network, a convolution neural network for 3D object detection based on multi-sensor data fusion. Firstly, a unified architecture with two feature extraction networks was designed, and the feature extraction of point clouds and images from different sensors realized synchronously. Then, we set a series of 3D anchors and projected them to the feature maps, which were cropped into 2D anchors with the same size and fused together. Finally, the object classification and 3D bounding box regression were carried out on the multipath of fully connected layers. The proposed network is a one-stage convolution neural network, which achieves the balance between the accuracy and speed of object detection. The experiments on KITTI datasets show that the proposed network is superior to the contrast algorithms in average precision (AP) and time consumption, which shows the effectiveness of the proposed network.


Sign in / Sign up

Export Citation Format

Share Document