Research on Object Detection Model Based on Feature Network Optimization

Processes ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1654
Author(s):  
Xiaoliang Zhang ◽  
Kehe Wu ◽  
Qi Ma ◽  
Zuge Chen

Because object detection datasets are much smaller than the ImageNet image recognition dataset, transfer learning has become a standard training method for deep learning object detection models: the backbone network of the detection model is pre-trained on ImageNet to extract features for detection tasks. However, the classification subtask of detection focuses on the salient region features of an object, while the localization subtask focuses on its edge features, so there is a certain deviation between the features extracted by a pre-trained backbone network and those needed for localization. To solve this problem, a decoupled self-attention (DSA) module is proposed for one-stage object detection models in this paper. A DSA module comprises two decoupled self-attention branches, so it can extract appropriate features for the different tasks. It sits between the Feature Pyramid Network (FPN) and the head networks of the subtasks, and independently extracts global features for each task from the FPN-fused features. Although the DSA module is simple, it effectively improves object detection performance and can easily be embedded in many detection models. Our experiments are based on the representative one-stage detector RetinaNet. On the Common Objects in Context (COCO) dataset, with ResNet50 and ResNet101 as backbone networks, detection performance increases by 0.4% and 0.5% AP, respectively. When the DSA module and an object confidence task are both applied to RetinaNet, performance with ResNet50 and ResNet101 increases by 1.0% and 1.4% AP, respectively. These experimental results show the effectiveness of the DSA module.
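Not from the paper itself: a minimal NumPy sketch of the decoupled idea, where two self-attention branches with independent (here randomly initialized, in practice learned) projection weights process the same FPN feature map, one feeding the classification head and one the localization head.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over flattened positions.

    x: (N, C) feature map flattened to N spatial positions with C channels.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over positions
    return attn @ v

def dsa_module(fpn_feat, rng):
    """Decoupled self-attention: independent branches for the two subtasks."""
    c = fpn_feat.shape[1]
    # separate projection weights per branch = decoupled attention
    w_cls = [rng.normal(size=(c, c)) for _ in range(3)]
    w_loc = [rng.normal(size=(c, c)) for _ in range(3)]
    cls_feat = self_attention(fpn_feat, *w_cls)   # for the classification head
    loc_feat = self_attention(fpn_feat, *w_loc)   # for the localization head
    return cls_feat, loc_feat

rng = np.random.default_rng(0)
feat = rng.normal(size=(16, 8))   # 16 spatial positions, 8 channels
cls_f, loc_f = dsa_module(feat, rng)
print(cls_f.shape, loc_f.shape)   # (16, 8) (16, 8)
```

Because each branch has its own projections, the two heads receive different global feature summaries of the same FPN output, matching the paper's motivation that classification and localization need different features.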

Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3341 ◽  
Author(s):  
Hilal Tayara ◽  
Kil Chong

Object detection in very high-resolution (VHR) aerial images is an essential step for a wide range of applications such as military applications, urban planning, and environmental management. Still, it is a challenging task due to the different scales and appearances of the objects. On the other hand, object detection in VHR aerial images has improved remarkably in recent years thanks to advances in convolutional neural networks (CNNs). Most of the proposed methods depend on a two-stage approach, namely a region proposal stage followed by a classification stage, such as Faster R-CNN. Even though two-stage approaches outperform traditional methods, they are not easy to optimize and are not suitable for real-time applications. In this paper, a uniform one-stage model for object detection in VHR aerial images is proposed. To tackle the challenge of different scales, a densely connected feature pyramid network is proposed, by which high-level multi-scale semantic feature maps with high-quality information are prepared for object detection. This work has been evaluated on two publicly available datasets and outperforms the current state-of-the-art results on both in terms of mean average precision (mAP) and computation time.
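A hedged sketch of the dense-connection idea (not the authors' code): in a plain FPN each level receives only the next coarser level top-down, whereas a densely connected pyramid fuses upsampled features from all coarser levels into each output. Single-channel maps and nearest-neighbour upsampling keep the toy self-contained.

```python
import numpy as np

def upsample2x(x, times):
    """Nearest-neighbour 2x upsampling, applied `times` times."""
    for _ in range(times):
        x = x.repeat(2, axis=0).repeat(2, axis=1)
    return x

def dense_fpn(levels):
    """Each output level sums its own map with upsampled maps of ALL coarser
    levels (dense top-down connections), not just the adjacent one."""
    outs = []
    for i, feat in enumerate(levels):
        fused = feat.copy()
        for j in range(i + 1, len(levels)):
            fused = fused + upsample2x(levels[j], j - i)
        outs.append(fused)
    return outs

# three pyramid levels: 8x8, 4x4, 2x2 (one channel for simplicity)
lv = [np.ones((8, 8)), np.ones((4, 4)) * 2, np.ones((2, 2)) * 3]
fused = dense_fpn(lv)
print([f.shape for f in fused])   # [(8, 8), (4, 4), (2, 2)]
print(fused[0][0, 0])             # 1 + 2 + 3 = 6.0
```

The finest level thus sees semantic context from every coarser level at once, which is what lets small objects benefit from high-level semantics.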


2019 ◽  
Vol 11 (16) ◽  
pp. 1921 ◽  
Author(s):  
Zijun Duo ◽  
Wenke Wang ◽  
Huizan Wang

Oceanic mesoscale eddies greatly influence energy and matter transport and acoustic propagation. However, the traditional detection method for oceanic mesoscale eddies relies too heavily on threshold values and is highly subjective. Existing machine learning methods are not mature or purposeful enough, as their training sets lack authoritative labels. In view of these problems, this paper constructs a mesoscale eddy automatic identification and positioning network, OEDNet, based on an object detection network. First, 2D image processing techniques are used to augment a small number of accurate eddy samples annotated by marine experts to generate the training set. Then, an object detection model with a deep residual network and a feature pyramid network as its main structure is designed and optimized for the small samples and complex regions found in oceanic mesoscale eddies. Experimental results show that the model achieves better recognition than the traditional detection method and exhibits good generalization ability in different sea areas.
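The abstract does not specify which 2D transforms were used; a minimal sketch, assuming standard geometric augmentation (flips and 90-degree rotations), shows how a handful of expert-annotated patches can be multiplied into a larger training set:

```python
import numpy as np

def augment(sample):
    """Generate geometric variants of one annotated eddy patch:
    horizontal/vertical flips and 90/180/270-degree rotations."""
    variants = [sample]
    variants.append(np.fliplr(sample))   # mirror left-right
    variants.append(np.flipud(sample))   # mirror up-down
    for k in (1, 2, 3):
        variants.append(np.rot90(sample, k))
    return variants

patch = np.arange(9).reshape(3, 3)   # stand-in for an expert-annotated patch
train_set = augment(patch)
print(len(train_set))   # 6 variants from a single annotated sample
```

For eddies, label-preserving transforms need care: a horizontal flip reverses an eddy's apparent rotation sense, so cyclonic/anticyclonic labels would have to be flipped along with the image.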


2021 ◽  
Vol 13 (2) ◽  
pp. 160
Author(s):  
Jiangqiao Yan ◽  
Liangjin Zhao ◽  
Wenhui Diao ◽  
Hongqi Wang ◽  
Xian Sun

As a precursor step for computer vision algorithms, object detection plays an important role in various practical application scenarios. As the objects to be detected become more complex, the problem of multi-scale object detection has attracted more and more attention, especially in the field of remote sensing. Early convolutional neural network detection algorithms are mostly based on manually preset anchor boxes, which divide the image into regions and provide prior positions for targets. However, anchor boxes are difficult to set reasonably and cause a large amount of computational redundancy, which affects the generality of a detection model obtained under fixed parameters. In the past two years, anchor-free detection algorithms have developed remarkably for natural images. However, there is insufficient research on how to handle multi-scale detection more effectively in an anchor-free framework and how to apply such detectors to remote sensing images. In this paper, we propose a specific-attention Feature Pyramid Network (FPN) module that generates a feature pyramid based on the characteristics of objects of various sizes, which better suits multi-scale object detection. In addition, a scale-aware detection head is proposed that contains a multi-receptive feature fusion module and a size-based feature compensation module. The new anchor-free detector obtains a more effective multi-scale feature expression. Experiments on challenging datasets show that our approach performs favorably against other methods in terms of multi-scale object detection performance.
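Not the authors' formulation, but a common size-based assignment rule in anchor-free pyramids (FCOS-style) illustrates what "scale-aware" means in practice: each ground-truth object is routed to a pyramid level according to its size, so small objects are handled by fine levels and large ones by coarse levels.

```python
import math

def assign_level(box_w, box_h, num_levels=5, base=32):
    """Map an object's size to a pyramid level: level grows with log2 of the
    box scale relative to a base size, clamped to the available levels."""
    scale = math.sqrt(box_w * box_h)                    # geometric mean size
    level = int(math.floor(math.log2(max(scale, 1) / base)))
    return min(max(level, 0), num_levels - 1)

print(assign_level(20, 20))    # tiny object  -> level 0 (finest map)
print(assign_level(64, 64))    # medium object -> level 1
print(assign_level(512, 512))  # huge object  -> level 4 (coarsest map)
```

The `base=32` threshold is an assumption for illustration; real detectors tune these boundaries, which is exactly where a size-based compensation module can help objects near the boundaries.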


Author(s):  
Zhishan Li ◽  
Yiran Sun ◽  
Guanzhong Tian ◽  
Lei Xie ◽  
Yong Liu ◽  
...  

2021 ◽  
Vol 13 (22) ◽  
pp. 4610
Author(s):  
Li Zhu ◽  
Zihao Xie ◽  
Jing Luo ◽  
Yuhang Qi ◽  
Liman Liu ◽  
...  

Current object detection algorithms perform inference on all samples at a fixed computational cost, which wastes computing resources and is inflexible. To solve this problem, a dynamic object detection algorithm based on a lightweight shared feature pyramid is proposed; it performs adaptive inference according to the available computing resources and the difficulty of each sample, greatly improving inference efficiency. Specifically, a lightweight shared feature pyramid network and a lightweight detection head are proposed to reduce the computation and parameter counts of the feature fusion part and detection head of the dynamic object detection model. On the PASCAL VOC dataset, under both the "anytime prediction" and "budgeted batch object detection" settings, the model's performance, computation and parameter counts are better than those of dynamic object detection models built on networks such as ResNet, DenseNet and MSDNet.
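A minimal sketch of the "anytime prediction" mechanism described above (not the paper's architecture): classifier stages of increasing depth are run in order, and inference stops as soon as one stage is confident enough, so easy samples exit early and hard samples use the full network.

```python
import numpy as np

def dynamic_infer(sample, stages, threshold=0.9):
    """Run stages in order; return (prediction, stages_used), exiting early
    once the max class probability reaches the confidence threshold."""
    for depth, stage in enumerate(stages, start=1):
        probs = stage(sample)
        if probs.max() >= threshold:
            break                      # confident enough: early exit
    return int(probs.argmax()), depth

# toy stages: confidence grows with depth (stage inputs ignored for brevity)
stages = [
    lambda x: np.array([0.60, 0.40]),  # shallow, cheap, unsure
    lambda x: np.array([0.95, 0.05]),  # deeper, confident
    lambda x: np.array([0.99, 0.01]),  # full network, never reached here
]
label, used = dynamic_infer(None, stages)
print(label, used)   # 0 2 -> exited after 2 of 3 stages
```

Under a "budgeted batch" setting the same loop runs with a threshold (or per-sample budget) chosen so that the whole batch fits the available compute.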


2021 ◽  
Vol 14 (1) ◽  
pp. 45
Author(s):  
Subrahmanyam Vaddi ◽  
Dongyoun Kim ◽  
Chandan Kumar ◽  
Shafqat Shad ◽  
Ali Jannesari

Unmanned Aerial Vehicles (UAVs) equipped with vision capabilities have become popular in recent years. Many applications employ object detection techniques on the information captured by an onboard camera. However, object detection on UAVs requires high computational performance, and the limited onboard resources negatively affect the results. In this article, we propose a deep feature pyramid architecture with a modified focal loss function, which reduces the effect of class imbalance. Moreover, the proposed method runs as an end-to-end object detection model on the UAV platform for real-time applications. To evaluate the proposed architecture, we combined our model with ResNet and MobileNet as backbone networks and compared it with RetinaNet and HAL-RetinaNet. Our model achieved 30.6 mAP with an inference time of 14 fps. This result shows that our proposed model outperformed RetinaNet by 6.2 mAP.
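The abstract does not give the authors' modification, but the standard focal loss they start from can be sketched directly: easy examples are down-weighted by a factor of (1 - p_t)^gamma, so the abundant easy background anchors stop dominating the loss.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss (Lin et al. form).

    p: predicted foreground probabilities, y: 0/1 ground-truth labels.
    alpha balances classes; gamma focuses training on hard examples.
    """
    p_t = np.where(y == 1, p, 1.0 - p)            # prob of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -(alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)).mean()

easy = focal_loss(np.array([0.99]), np.array([1]))  # well-classified example
hard = focal_loss(np.array([0.50]), np.array([1]))  # ambiguous example
print(hard > easy)   # True: the hard example dominates the loss
```

With gamma = 0 this reduces to alpha-weighted cross-entropy; raising gamma sharpens the focus on hard examples, which is the dial a "modified" focal loss would typically adjust.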


2019 ◽  
Vol 9 (16) ◽  
pp. 3225 ◽  
Author(s):  
He ◽  
Huang ◽  
Wei ◽  
Li ◽  
Guo

In recent years, significant advances have been made in visual detection, and an abundance of outstanding models has been proposed. However, state-of-the-art object detection networks are inefficient at detecting small targets, and they commonly fail to run on portable devices or embedded systems due to their high complexity. In this work, a real-time object detection model, termed Tiny Fast You Only Look Once (TF-YOLO), is developed for implementation on an embedded system. First, the k-means++ algorithm is applied to cluster the dataset, which yields better prior boxes for the targets. Second, inspired by the multi-scale prediction idea of the Feature Pyramid Networks (FPN) algorithm, the YOLOv3 framework is improved and optimized to detect the extracted features at three scales. In this way, the modified network is sensitive to small targets. Experimental results demonstrate that the proposed TF-YOLO method is a smaller, faster and more efficient network model, improving the performance of end-to-end training and real-time object detection on a variety of devices.
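A hedged sketch of the prior-box clustering step (simplified: random seeding rather than full k-means++ initialization): ground-truth (width, height) pairs are clustered with distance 1 - IoU, as in YOLO-style anchor estimation, so the priors match the dataset's typical box shapes.

```python
import numpy as np

def iou_wh(box, centers):
    """IoU between one (w, h) box and cluster centers, all anchored at origin."""
    inter = np.minimum(box[0], centers[:, 0]) * np.minimum(box[1], centers[:, 1])
    union = box[0] * box[1] + centers[:, 0] * centers[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=20, seed=0):
    """Cluster (w, h) pairs with distance = 1 - IoU to obtain k prior boxes."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), k, replace=False)].astype(float)
    for _ in range(iters):
        d = np.stack([1.0 - iou_wh(b, centers) for b in boxes])  # (n, k)
        assign = d.argmin(axis=1)                # nearest center per box
        for j in range(k):
            if (assign == j).any():
                centers[j] = boxes[assign == j].mean(axis=0)
    return centers

boxes = np.array([[10, 12], [12, 10], [50, 60], [55, 58], [100, 90], [95, 105]])
anchors = kmeans_anchors(boxes, k=3)
print(anchors)   # roughly one small, one medium, one large prior
```

Using 1 - IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which matters when small targets are the priority.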


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6485
Author(s):  
Delia-Georgiana Stuparu ◽  
Radu-Ioan Ciobanu ◽  
Ciprian Dobre

In order to improve traffic in large cities and to avoid congestion, advanced methods of detecting and predicting vehicle behaviour are needed. Such methods require complex information regarding the number of vehicles on the roads, their positions, directions, etc. One way to obtain this information is by analyzing overhead images collected by satellites or drones and extracting information from them through intelligent machine learning models. Thus, in this paper we propose and present a one-stage object detection model for finding vehicles in satellite images using the RetinaNet architecture and the Cars Overhead With Context dataset. By analyzing the results obtained by the proposed model, we show that it has very good vehicle detection accuracy and a very low detection time, which shows that it can be employed to successfully extract information from real-time satellite or drone imagery.
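A one-stage detector such as the RetinaNet model described above emits many overlapping candidate boxes per vehicle; the standard post-processing step (not specific to this paper) is greedy non-maximum suppression, sketched here:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one [x1, y1, x2, y2] box against an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it
    above `thresh`, repeat on the remainder."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

# two overlapping detections of one vehicle plus a distant second vehicle
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))   # [0, 2]
```

For counting vehicles from overhead imagery, the NMS threshold matters: densely parked cars overlap, so too low a threshold merges adjacent vehicles into one detection.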

