Object Detection Algorithm Based on Multiheaded Attention

This study proposes a multiheaded-attention object detection algorithm referred to as MANet. Its main purpose is to integrate feature layers of different scales through the attention mechanism and to strengthen contextual connections. To achieve this, we first replaced the feed-forward base network of the single-shot detector (SSD) with ResNet-101 (inspired by the Deconvolutional Single-Shot Detector) and then applied linear interpolation and the attention mechanism, fusing the information of feature layers at different scales to improve detection accuracy. The primary contributions of this study are (a) a fusion attention mechanism and (b) a multiheaded attention fusion method. Our final MANet detector effectively unifies the feature information across feature layers at different scales, enabling it to detect objects of different sizes with higher precision. Using MANet with a 512 × 512 input and a ResNet-101 backbone, we obtained a mean accuracy of 82.7% on the PASCAL Visual Object Classes (VOC) 2007 test set. These results demonstrate that the proposed method yields better accuracy than the conventional SSD and other advanced detectors.
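
The general idea of fusing feature layers of different scales by interpolation and attention can be illustrated with a small sketch. It is not the MANet fusion module: the function names `upsample_bilinear` and `fuse_with_attention` are hypothetical, two scalar scores passed through a softmax stand in for learned attention, and each feature map is a single-channel grid.

```python
import math

def upsample_bilinear(grid, out_h, out_w):
    """Resize an H x W grid by bilinear interpolation (corner-aligned)."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

def fuse_with_attention(fine, coarse, scores):
    """Blend a fine map with an upsampled coarse map.

    `scores` holds two scalars (fine, coarse); a softmax turns them into
    blending weights. A real detector would learn these per location.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    w_fine, w_coarse = exps[0] / total, exps[1] / total
    up = upsample_bilinear(coarse, len(fine), len(fine[0]))
    return [[w_fine * f + w_coarse * u for f, u in zip(f_row, u_row)]
            for f_row, u_row in zip(fine, up)]

coarse_map = [[0.0, 2.0], [2.0, 4.0]]          # low-resolution layer
fine_map = [[0.0] * 3 for _ in range(3)]       # high-resolution layer
fused_map = fuse_with_attention(fine_map, coarse_map, [0.0, 0.0])
```

With equal scores both layers contribute half, so the fused map here is simply the upsampled coarse map scaled by 0.5.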

YOLOv4 Object Detection Algorithm with Efficient Channel Attention Mechanism

2020 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE) ◽

10.1109/icmcce51767.2020.00387 ◽

2020 ◽

Author(s):

Cui Gao ◽

Qiang Cai ◽

Shaofeng Ming

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Attention Mechanism

An Object Detection Algorithm Based on Attention Mechanism and Lightweight Network (AMLN)

Proceedings of the 2020 the 4th International Conference on Innovation in Artificial Intelligence ◽

10.1145/3390557.3394297 ◽

2020 ◽

Author(s):

Xuemei Yuan ◽

Hanming Huang ◽

Zhengfeng Jiang ◽

Simin Xue

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Attention Mechanism

Research on CNN for Anti-missile Object Detection Algorithm Based on Improved Attention Mechanism

10.23919/ccc52363.2021.9550491 ◽

2021 ◽

Author(s):

Tainian Song ◽

Weiwei Qin ◽

Zhuo Liang ◽

Qingqiang Qin ◽

Gang Liu

Keyword(s):

Object Detection ◽

Detection Algorithm ◽

Attention Mechanism

Revisiting knowledge distillation for light-weight visual object detection

Transactions of the Institute of Measurement and Control ◽

10.1177/01423312211022877 ◽

2021 ◽

Vol 43 (13) ◽

pp. 2888-2898

Author(s):

Tianze Gao ◽

Yunfeng Gao ◽

Yu Li ◽

Peiyuan Qin

Keyword(s):

Object Detection ◽

Essential Element ◽

Detection Algorithm ◽

Positive Sample ◽

Detection Methods ◽

Visual Object ◽

Light Weight ◽

Model Compression ◽

Novel Approach ◽

Knowledge Distillation

An essential element for intelligent perception in mechatronic and robotic systems (M&RS) is the visual object detection algorithm. With the ever-increasing advance of artificial neural networks (ANN), researchers have proposed numerous ANN-based visual object detection methods that have proven to be effective. However, networks with cumbersome structures do not suit the real-time scenarios in M&RS, necessitating model compression techniques. In this paper, a novel approach to training light-weight visual object detection networks is developed by revisiting knowledge distillation. Traditional knowledge distillation methods are oriented towards image classification and are not compatible with object detection. Therefore, a variant of knowledge distillation is developed and adapted to a state-of-the-art keypoint-based visual detection method. Two strategies, named positive sample retaining and early distribution softening, are employed to yield a natural adaptation. The mutual consistency between the teacher model and the student model is further promoted through hint-based distillation. Extensive controlled experiments show that the proposed method enhances the light-weight network’s performance by a large margin.
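
For context, the classification-oriented form of knowledge distillation that the abstract contrasts itself with can be sketched as follows. This is the classical temperature-softened formulation, not the detection variant developed in the paper; the function names, the example logits, and the temperature value of 4.0 are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)   # teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

teacher = [6.0, 2.0, 1.0]   # hypothetical class logits
student = [3.0, 2.5, 0.5]
loss_same = distillation_loss(teacher, teacher)   # zero: identical outputs
loss_diff = distillation_loss(student, teacher)   # positive: outputs differ
```

The loss is zero only when the student reproduces the teacher's softened distribution, which is why it needs adapting before it can be applied to detection outputs rather than a single class vector.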

GC-YOLOv3: You Only Look Once with Global Context Block

Electronics ◽

10.3390/electronics9081235 ◽

2020 ◽

Vol 9 (8) ◽

pp. 1235

Author(s):

Yang Yang ◽

Hongmin Deng

Keyword(s):

Object Detection ◽

Irrelevant Information ◽

Detection Algorithm ◽

Visual Object ◽

Detection Accuracy ◽

Feature Maps ◽

Average Precision ◽

Global Context ◽

Pascal Voc ◽

Feature Pyramid

To make the classification and regression of single-stage detectors more accurate, this paper proposes an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3), based on You-Only-Look-Once (YOLO). First, a better cascading model with learnable semantic fusion between a feature extraction network and a feature pyramid network is designed to improve detection accuracy using a global context block. Second, the information to be retained is screened by combining feature maps at three different scales. Finally, a global self-attention mechanism is used to highlight the useful information in the feature maps while suppressing irrelevant information. Experiments show that GC-YOLOv3 reaches a maximum mean Average Precision (mAP)@0.5 of 55.5 on the Common Objects in Context (COCO) 2017 test-dev set and that its mAP is 5.1% higher than that of the YOLOv3 algorithm on the Pascal Visual Object Classes (PASCAL VOC) 2007 test set. The experiments therefore indicate that the proposed GC-YOLOv3 model exhibits optimal performance on the PASCAL VOC and COCO datasets.
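
The "@0.5" in the reported metric refers to an intersection-over-union (IoU) threshold: a predicted box counts as a match for a ground-truth box when their IoU reaches 0.5 (benchmarks differ on whether the comparison is strict). A minimal sketch of that check, with hypothetical function names and boxes given as `(x_min, y_min, x_max, y_max)`:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def matches(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold

ground_truth = (0.0, 0.0, 10.0, 10.0)
good_pred = (0.0, 0.0, 10.0, 5.0)     # covers half of the ground truth
poor_pred = (5.0, 5.0, 15.0, 15.0)    # overlaps only one corner
```

Here `good_pred` has an IoU of exactly 0.5 and is accepted, while `poor_pred` has an IoU of 25/175 and is rejected. Average precision is then computed from the matched and unmatched detections ranked by confidence.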

Water surface object detection using panoramic vision based on improved single-shot multibox detector

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00831-6 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Aofeng Li ◽

Xufang Zhu ◽

Shuo He ◽

Jiawei Xia

Keyword(s):

Object Detection ◽

Water Surface ◽

Detection Algorithm ◽

Single Shot ◽

Panoramic Vision ◽

Feature Pyramid ◽

The Mean ◽

Detection Effect ◽

Surface Object ◽

Improved Algorithm

Abstract: In view of the deficiencies of traditional visual water surface object detection, such as non-detection zones and the failure to acquire global information, and the deficiencies of the single-shot multibox detector (SSD) object detection algorithm, such as weak remote detection and low detection precision for small objects, this study proposes a water surface object detection algorithm for panoramic vision based on an improved SSD. We reconstruct the backbone network of the SSD algorithm, replacing VGG16 with a ResNet-50 network and adding five feature extraction layers. Richer semantic information for the shallow feature maps is obtained through a feature pyramid network structure with deconvolution. An experiment is conducted on a water surface object dataset built for this purpose. The results show that the mean Average Precision (mAP) of the improved algorithm is 4.03% higher than that of the existing SSD detection algorithm. The improved algorithm effectively improves the overall detection precision for water surface objects and enhances the detection of remote objects.

Estimation of Human Position and Velocity in Collaborative Robot System Using Visual Object Detection Algorithm and Kalman Filter

2020 17th International Conference on Ubiquitous Robots (UR) ◽

10.1109/ur49135.2020.9144888 ◽

2020 ◽

Author(s):

Jiwoong Lim ◽

Sungsoo Rhim

Keyword(s):

Kalman Filter ◽

Object Detection ◽

Detection Algorithm ◽

Visual Object ◽

Robot System ◽

Collaborative Robot ◽

Human Position

Robust Cherry Tomatoes Detection Algorithm in Greenhouse Scene Based on SSD

Agriculture ◽

10.3390/agriculture10050160 ◽

2020 ◽

Vol 10 (5) ◽

pp. 160

Author(s):

Ting Yuan ◽

Lin Lv ◽

Fan Zhang ◽

Jun Fu ◽

Jin Gao ◽

...

Keyword(s):

Deep Learning ◽

Detection Algorithm ◽

Input Image ◽

Detection Methods ◽

Single Shot ◽

Image Size ◽

Growth Difference ◽

Input Size ◽

Base Network ◽

Cherry Tomatoes

The detection of cherry tomatoes in greenhouse scenes is of great significance for robotic harvesting. This paper presents a deep learning method for cherry tomato detection that reduces the influence of illumination, growth difference, and occlusion. Given the greenhouse operating environment and the accuracy of deep learning, the Single Shot multi-box Detector (SSD) was selected for its excellent anti-interference ability and its capacity to learn from datasets. The first step is to build datasets covering various greenhouse conditions. According to the characteristics of cherry tomatoes, image samples with illumination changes, image rotation, and noise enhancement were used to expand the datasets. The training datasets were then used to train and construct the network model. To study the effect of the base network and the network input size, one comparison experiment was run on the base networks VGG16, MobileNet, and Inception V2, and another on network input image sizes of 300 × 300 pixels and 512 × 512 pixels. Analysis of the experimental results shows that Inception V2 is the best base network, with an average precision of 98.85% in the greenhouse environment. Compared with other detection methods, this method shows substantial improvement in cherry tomato detection.

SSD7-FFAM: A Real-Time Object Detection Network Friendly to Embedded Devices from Scratch

Applied Sciences ◽

10.3390/app11031096 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1096

Author(s):

Qing Li ◽

Yingcheng Lin ◽

Wei He

Keyword(s):

Object Detection ◽

Real Time ◽

Large Scale ◽

Feature Fusion ◽

Contextual Information ◽

Attention Mechanism ◽

Detection Accuracy ◽

Single Shot ◽

Feature Maps ◽

Embedded Devices

High computing and memory requirements are the biggest challenges in deploying existing object detection networks to embedded devices. Existing lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor flexibility of the network structure and makes them unsuitable for some specific scenarios. In this paper, we propose a lightweight object detection network, Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of computation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. First, the FFAM method fuses high-level, semantically rich feature maps with low-level feature maps to improve detection accuracy for small objects. Second, a lightweight attention mechanism that cascades channel and spatial attention modules is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP) with 1.66 MB of parameters and an average running time of 0.033 s on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.
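
The cascade of channel attention followed by spatial attention mentioned above can be pictured with a parameter-free sketch. It is not the FFAM module from the paper: there are no learned weights, average pooling plus a sigmoid stands in for the attention sub-networks, and all function names are hypothetical. A feature map is a list of C channels, each an H x W nested list.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """Scale each channel by the sigmoid of its global average."""
    out = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        weight = sigmoid(mean)
        out.append([[v * weight for v in row] for row in channel])
    return out

def spatial_attention(fmap):
    """Scale each location by the sigmoid of its mean across channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]

def cascaded_attention(fmap):
    """Channel attention first, then spatial attention on its output."""
    return spatial_attention(channel_attention(fmap))

features = [
    [[0.0, 0.0], [0.0, 0.0]],   # an inactive channel
    [[2.0, 2.0], [2.0, 2.0]],   # an active channel
]
attended = cascaded_attention(features)
```

The output keeps the input shape; each value is rescaled by a channel weight and then by a location weight, both in the range (0, 1).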

CXR-RefineDet: Single-Shot Refinement Neural Network for Chest X-Ray Radiograph Based on Multiple Lesions Detection

Journal of Healthcare Engineering ◽

10.1155/2022/4182191 ◽

2022 ◽

Vol 2022 ◽

pp. 1-11

Author(s):

Cong Lin ◽

Yongbin Zheng ◽

Xiuchun Xiao ◽

Jialun Lin

Keyword(s):

Neural Network ◽

Object Detection ◽

Disease Diagnosis ◽

Detection Algorithm ◽

Detection Accuracy ◽

Single Shot ◽

Detection Model ◽

Network Output ◽

Artificial Intelligence Technology ◽

Chest X Ray

The workload of radiologists has dramatically increased in the context of the COVID-19 pandemic, causing misdiagnosis and missed diagnosis of diseases. The use of artificial intelligence technology can assist doctors in locating and identifying lesions in medical images. To improve the accuracy of disease diagnosis in medical imaging, in this paper we propose a lung disease detection neural network that outperforms current mainstream object detection models. By combining the advantages of the RepVGG block and Resblock in information fusion and information extraction, we design a backbone, RRNet, with few parameters and strong feature extraction capabilities. We then propose a structure called Information Reuse, which addresses the low utilization of the original network output features by connecting the normalized features back to the network. Combining RRNet with the improved RefineDet, we propose the overall network, called CXR-RefineDet. Extensive experiments on VinDr-CXR, the largest public lung chest radiograph detection dataset, show that the detection accuracy and inference speed of CXR-RefineDet reach 0.1686 mAP and 6.8 fps, respectively, which is better than two-stage object detection algorithms using strong backbones such as ResNet-50 and ResNet-101. In addition, the fast inference speed of CXR-RefineDet makes practical implementation of a computer-aided diagnosis system possible.
