Anchor free object detection with mask attention

2020 ◽  
Author(s):  
HE Yang ◽  
Beibei Fan ◽  
Ling ling Guo

Abstract The anchor-free method based on key point detection has made great progress. However, it depends too heavily on a convolutional network to generate a rough heat map, which makes it difficult to detect objects with large size variation and dense, overlapping objects. To solve this problem, we first propose a mask attention mechanism for object detection methods and make full use of the advantages of the attention mechanism to improve the accuracy of heat map generation in the detection network. We then design an optimized fire model to reduce the size of the model. The fire model is an extension of grouped convolution: through purposeful grouping, it allows each group of convolutional network features to learn the same feature. In this paper, the mask attention mechanism uses object segmentation images to guide the generation of corner heat maps. Our approach achieved an accuracy of 91.84% and a recall of 89.83% on the Tencent-100K dataset. Compared with popular object detection methods, the proposed method has advantages in model size and accuracy.
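To illustrate why grouped convolution, which the abstract says the fire model extends, can shrink a model, the sketch below counts the weights of a standard and a grouped convolution layer. It is not the authors' fire model: the function name `conv_params` and the layer sizes (256 channels, 3x3 kernel, 4 groups) are hypothetical values chosen only for illustration.

```python
def conv_params(c_in, c_out, kernel, groups=1, bias=False):
    """Number of learnable parameters in a 2D convolution with a square kernel.

    With `groups` groups, each output channel only sees c_in / groups input
    channels, so the weight count drops by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(256, 256, 3)            # 589824 weights
grouped = conv_params(256, 256, 3, groups=4)   # 147456 weights
print(standard, grouped, standard // grouped)
```

With four groups the layer keeps the same input and output widths but uses a quarter of the weights, which is the general size saving that grouping offers.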

2020 ◽  
Author(s):  
Beibei Fan ◽  
HE Yang ◽  
Ling ling Guo

Abstract The anchor-free method based on key point detection has made great progress. However, it depends too heavily on a convolutional network to generate a rough heat map. It is difficult for the network to detect objects whose shape changes greatly, and small objects are also hard to detect. To solve this problem, we first use a state-of-the-art mask attention mechanism in the network to increase the accuracy of heat map generation for detection. We also designed an optimized fire model to reduce the size of the model. The masking mechanism optimizes the feature map of the network to enhance its spatial detection capability and improve the accuracy of heat map generation. Our approach achieved an accuracy of 91.84% and a recall of 89.83% on the Tencent-100K dataset. It is also competitive with state-of-the-art approaches.


Object detection (OD) within video is one of the critical research areas in computer vision. With the spread of artificial intelligence, now a basic part of everyday life and predicted to grow exponentially in the years to come, it will transform society. Object detection has been extensively applied in several areas, including human-machine interaction, autonomous vehicles, security with video surveillance, and various other fields mentioned later. However, this growth of OD faces different challenges such as occlusion, illumination variation, and object motion, as well as the real-time requirement, which can be quite problematic. This paper also covers some methods that address these issues. These techniques are divided into five subcategories: point detection, segmentation, supervised classifiers, optical flow, and background modeling. This survey analyzes various methods and techniques used in object detection, as well as application domains and the problems faced. Our study discusses the importance of deep learning algorithms and their potential for future improvement of object detection in video sequences.


Author(s):  
Z. Tian ◽  
W. Wang ◽  
B. Tian ◽  
R. Zhan ◽  
J. Zhang

Abstract. Deep-learning-based object detection methods are increasingly applied to the interpretation of optical remote sensing images. Although these methods obtain promising results in general conditions, the designed networks usually ignore the characteristics of remote sensing images, such as large image resolution and the uneven distribution of object locations. In this paper, an effective detection method based on the convolutional neural network is proposed. First, to make the designed network more suitable for the image resolution, EfficientNet is incorporated into the detection framework as the backbone network. EfficientNet employs the compound scaling method to adjust the depth and width of the network, thereby meeting the needs of different input image resolutions. Then, the attention mechanism is introduced into the proposed method to improve the extracted feature maps. The attention mechanism makes the network focus more on the object areas while reducing the influence of the background areas, and so reduces the effect of the uneven distribution. Comprehensive evaluations on a public object detection dataset demonstrate the effectiveness of the proposed method.
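The compound scaling mentioned in this abstract can be sketched as follows. The base constants (1.2 for depth, 1.1 for width, 1.15 for resolution) are those reported in the original EfficientNet paper, not in this abstract. The function name `compound_scale`, its parameters, and the base resolution of 224 are assumptions made for illustration. Released EfficientNet variants use rounded, hand-adjusted values, so this sketch shows only the scaling rule, not the exact published configurations.

```python
# Scaling bases as reported in the EfficientNet paper (an assumption here:
# the abstract itself gives no numbers).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale depth, width and input resolution together with one coefficient phi."""
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth, width, resolution


for phi in range(3):
    print(phi, compound_scale(phi))
```

A single coefficient `phi` grows all three dimensions at once, which is how one backbone family can be matched to inputs of different resolutions.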


Author(s):  
Prof. Pradnya Kasture ◽  
Aishwarya Kumkar ◽  
Yash Jagtap ◽  
Akshay Tangade ◽  
Aditya Pole

Vision is one of the most important human senses and plays a very important role in human interaction with surrounding objects. Many papers have been published on these topics, presenting various computer vision products and services built on new electronic devices for visually impaired people. The aim is to study different object detection methods. Compared with other object detection methods, the YOLO method has multiple advantages. Other algorithms such as CNN and Fast-CNN do not look at the image as a whole, whereas YOLO looks at the whole image, using a convolutional network to predict the bounding boxes and the probabilities for these boxes, and detects objects in the image faster than the other algorithms.


2020 ◽  
Vol 12 (15) ◽  
pp. 2416 ◽  
Author(s):  
Zhuangzhuang Tian ◽  
Ronghui Zhan ◽  
Jiemin Hu ◽  
Wei Wang ◽  
Zhiqiang He ◽  
...  

Nowadays, object detection methods based on deep learning are applied more and more to the interpretation of optical remote sensing images. However, the complex background and the wide range of object sizes in remote sensing images increase the difficulty of object detection. In this paper, we improve the detection performance by combining the attention information, and generate adaptive anchor boxes based on the attention map. Specifically, the attention mechanism is introduced into the proposed method to enhance the features of the object regions while reducing the influence of the background. The generated attention map is then used to obtain diverse and adaptable anchor boxes using the guided anchoring method. The generated anchor boxes can match better with the scene and the objects, compared with the traditional proposal boxes. Finally, the modulated feature adaptation module is applied to transform the feature maps to adapt to the diverse anchor boxes. Comprehensive evaluations on the DIOR dataset demonstrate the superiority of the proposed method over the state-of-the-art methods, such as RetinaNet, FCOS and CornerNet. The mean average precision of the proposed method is 4.5% higher than the feature pyramid network. In addition, the ablation experiments are also implemented to further analyze the respective influence of different blocks on the performance improvement.
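The idea of enhancing object regions while suppressing background can be illustrated with a minimal sketch in which an attention map with values in [0, 1] multiplies every channel of a feature map. This is a generic illustration, not the paper's attention module or its guided anchoring step. The function name `apply_attention` and the toy values are hypothetical.

```python
def apply_attention(features, attention):
    """Scale each spatial location of a C x H x W feature map by an H x W attention map.

    Locations with attention near 1 are kept, locations near 0 are suppressed.
    """
    return [
        [
            [value * attention[y][x] for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for channel in features
    ]


# Toy example: one channel, 2 x 2 spatial grid (values are illustrative only).
features = [[[2.0, 2.0], [2.0, 2.0]]]
attention = [[1.0, 0.5], [0.0, 1.0]]
out = apply_attention(features, attention)
print(out)
```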


2021 ◽  
Vol 13 (18) ◽  
pp. 3776
Author(s):  
Linlin Zhu ◽  
Xun Geng ◽  
Zheng Li ◽  
Chun Liu

Applying object detection methods to automatically detect boulders in planetary images and analyze their distribution is of great significance. It contributes to the selection of candidate landing sites and the understanding of geological processes. This paper improves the state-of-the-art object detection method YOLOv5 with an attention mechanism and designs a pyramid-based approach to detect boulders in planetary images. A new feature fusion layer has been designed to capture more shallow features of small boulders. Attention modules implemented by combining the convolutional block attention module (CBAM) and the efficient channel attention network (ECA-Net) are also added to YOLOv5 to highlight the information that contributes to boulder detection. Evaluation on the Pascal Visual Object Classes 2007 (VOC2007) dataset, which is widely used for object detection evaluation, and on the boulder dataset that we constructed from images of the asteroid Bennu shows that the improvements increase the precision of YOLOv5 by 3.4%. With the improved YOLOv5 detection method, the pyramid-based approach extracts several layers of images with different resolutions from the large planetary images and detects boulders of different scales in different layers. We have also applied the proposed approach to detect the boulders on Bennu, and the distribution of these boulders has been analyzed and presented.
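The pyramid-based approach described here can be pictured with a small sketch: a large image is represented at several resolutions, and a box detected at a coarser layer is mapped back to full-resolution coordinates. The downsampling factor of 2, the number of levels, and the helper names `pyramid_sizes` and `to_full_resolution` are assumptions. The abstract does not state how the layers are built.

```python
def pyramid_sizes(width, height, levels, factor=2):
    """Image size at each pyramid level, level 0 being full resolution."""
    sizes = []
    for level in range(levels):
        scale = factor ** level
        sizes.append((max(1, width // scale), max(1, height // scale)))
    return sizes


def to_full_resolution(box, level, factor=2):
    """Map an (x1, y1, x2, y2) box detected at `level` back to level-0 pixels."""
    scale = factor ** level
    return tuple(coord * scale for coord in box)


# Hypothetical image size and detection, for illustration only.
print(pyramid_sizes(4096, 2048, 3))
print(to_full_resolution((10, 20, 30, 40), level=2))
```

Large boulders would typically be found in the coarser layers and small ones in the finer layers, with all boxes expressed in full-resolution coordinates before they are merged.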


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches have been developed since the 1990s. In recent years, saliency detection has become one of the actively studied topics in the theory of Convolutional Neural Networks (CNNs). Many original solutions using CNNs have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis performed through human eye tracking and digital image processing. Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. This article also presents a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used to evaluate results. Practical relevance: This survey is intended as a contribution to the study of rapidly developing deep learning methods for saliency detection in images and videos.


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easy to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

