Flower Species Recognition System Combining Object Detection and Attention Mechanism

Author(s):  
Wei Qin ◽  
Xue Cui ◽  
Chang-An Yuan ◽  
Xiao Qin ◽  
Li Shang ◽  
...  
Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xianghua Ma ◽  
Zhenkun Yang ◽  
Shining Chen

For unmanned aerial vehicles (UAVs), object detection at different scales is an important component of visual recognition. Recent advances in convolutional neural networks (CNNs) have demonstrated that attention mechanisms remarkably enhance the multiscale representation of CNNs. However, most existing multiscale feature representation methods simply employ several attention blocks to adaptively recalibrate the feature response, which overlooks context information at the multiscale level. To solve this problem, this paper proposes a multiscale feature filtering network (MFFNet) for UAV image recognition systems. A novel building block, the multiscale feature filtering (MFF) module, is proposed for ResNet-like backbones; it allows feature-selective learning of multiscale context information across multiple parallel branches. These branches employ atrous convolutions at different scales and adaptively generate channel-wise feature responses by emphasizing channel-wise dependencies. Experimental results on the CIFAR100 and Tiny ImageNet datasets show that MFFNet achieves very competitive results compared with previous baseline models. Further ablation experiments verify that MFFNet achieves consistent performance gains in image classification and object detection tasks.
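The idea of parallel atrous branches followed by an adaptive weighting can be illustrated with a minimal one-dimensional sketch. This is not the MFF module from the paper: the function names, the odd-length kernel, and the softmax weighting over globally averaged branch outputs are all assumptions chosen to keep the example dependency-free, and nothing here is learned.

```python
import math

def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D atrous convolution (cross-correlation).

    Assumes an odd-length kernel so that it can be centred on each sample.
    """
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_branches(signal, kernel, dilations):
    """Run one atrous branch per dilation rate and fuse them.

    Hypothetical fusion rule: each branch is summarised by its global
    average, and a softmax over those summaries gives the branch weights.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    descriptors = [sum(b) / len(b) for b in branches]
    weights = softmax(descriptors)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights

if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    fused, weights = fuse_branches(impulse, [1, 1, 1], dilations=[1, 2])
    print(weights)
    print(fused)
```

With an impulse input, the dilation-1 branch responds at three adjacent positions, while the dilation-2 branch responds at positions two samples apart, which shows how a larger dilation rate widens the receptive field without adding kernel weights.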


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. The system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the video contained fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry to provide a better user experience. Based on the object detection results from the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
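The step from detection results to haptic output can be sketched as a simple lookup. The cue names, the score threshold, and the `(label, score)` input format below are hypothetical; the abstract does not name the six haptic sensations or say how detections are filtered.

```python
# Hypothetical mapping from detected scene labels to haptic cues.
# The cue names are placeholders, not the hardware modules of the paper.
HAPTIC_FOR_LABEL = {
    "fire": "heat",
    "explosion": "vibration",
    "wind": "fan",
    "rain": "water_mist",
}

def haptic_cues(detections, threshold=0.6):
    """Return the sorted haptic cues for sufficiently confident labels.

    `detections` is an iterable of (label, score) pairs, an assumed format
    standing in for whatever the scene recognition system returns.
    """
    cues = set()
    for label, score in detections:
        cue = HAPTIC_FOR_LABEL.get(label.lower())
        if cue is not None and score >= threshold:
            cues.add(cue)
    return sorted(cues)

if __name__ == "__main__":
    frame_detections = [("Fire", 0.9), ("rain", 0.4), ("tree", 0.99)]
    print(haptic_cues(frame_detections))
```

Labels with no entry in the table, or with a score below the threshold, trigger no cue, so an unrelated or uncertain detection leaves the hardware idle.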


2021 ◽  
Author(s):  
Tainian Song ◽  
Weiwei Qin ◽  
Zhuo Liang ◽  
Qingqiang Qin ◽  
Gang Liu

2019 ◽  
Vol 9 (9) ◽  
pp. 1829
Author(s):  
Jie Jiang ◽  
Hui Xu ◽  
Shichang Zhang ◽  
Yujie Fang

This study proposes a multiheaded object detection algorithm referred to as MANet. The main purpose of the study is to integrate feature layers of different scales based on the attention mechanism and to enhance contextual connections. To achieve this, we first replaced the feed-forward base network of the single-shot detector (SSD) with ResNet-101 (inspired by the Deconvolutional Single-Shot Detector) and then applied linear interpolation and the attention mechanism. The information of the feature layers at different scales was fused to improve the accuracy of object detection. The primary contributions of this study are (a) a fusion attention mechanism and (b) a multiheaded attention fusion method. Our final MANet detector effectively unifies the feature information among the feature layers at different scales, enabling it to detect objects of different sizes with higher precision. Using MANet with a 512 × 512 input and a ResNet-101 backbone, we obtained a mean accuracy of 82.7% on the PASCAL Visual Object Classes (VOC) 2007 test set. These results demonstrate that the proposed method yields better accuracy than the conventional SSD and other advanced detectors.
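Fusing feature layers of different scales through linear interpolation and attention can be illustrated with a minimal one-dimensional sketch. This is not the MANet fusion method: the function names, the sigmoid gate derived from the upsampled coarse layer, and the additive combination are assumptions made only to show the two ingredients the abstract names.

```python
import math

def linear_upsample(values, out_len):
    """Linearly interpolate a 1D feature row to out_len samples.

    The first and last samples are aligned with the input endpoints.
    """
    if out_len == 1 or len(values) == 1:
        return [values[0]] * out_len
    scale = (len(values) - 1) / (out_len - 1)
    out = []
    for i in range(out_len):
        pos = i * scale
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_fuse(fine, coarse):
    """Fuse a fine-scale row with a coarser one.

    Hypothetical rule: the coarse row is upsampled to the fine resolution,
    a sigmoid of it acts as an attention gate on the fine row, and the
    upsampled coarse row is then added back.
    """
    up = linear_upsample(coarse, len(fine))
    gate = [sigmoid(u) for u in up]
    return [f * g + u for f, g, u in zip(fine, gate, up)]

if __name__ == "__main__":
    fine_row = [1.0, 1.0, 1.0, 1.0, 1.0]
    coarse_row = [0.0, 2.0]
    print(linear_upsample(coarse_row, len(fine_row)))
    print(attention_fuse(fine_row, coarse_row))
```

The upsampling step brings the coarse layer to the same resolution as the fine layer so the two can be combined element by element; the gate then lets the coarse, more contextual layer decide how strongly each fine-scale response is kept.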

