One Spatio-Temporal Sharpening Attention Mechanism for Light-Weight YOLO Models Based on Sharpening Spatial Attention

Attention mechanisms have demonstrated great potential for improving the performance of deep convolutional neural networks (CNNs). However, many existing methods focus on developing channel or spatial attention modules with large numbers of parameters, and such complex attention modules inevitably affect the performance of CNNs. In our experiments embedding the Convolutional Block Attention Module (CBAM) in the light-weight model YOLOv5s, CBAM affected speed and increased model complexity while reducing average precision, although Squeeze-and-Excitation (SE), as part of CBAM, had a positive impact on the model. To replace the spatial attention module in CBAM and offer a suitable scheme of channel and spatial attention modules, this paper proposes a Spatio-temporal Sharpening Attention Mechanism (SSAM), which sequentially infers intermediate maps through a channel attention module and a Sharpening Spatial Attention (SSA) module. The SSA module achieves low complexity by introducing a sharpening filter into the spatial attention module. We look for a scheme that combines the SSA module with the SE module or the Efficient Channel Attention (ECA) module and yields the best improvement in models such as YOLOv5s and YOLOv3-tiny. To that end, we perform various replacement experiments and identify the best scheme: embed channel attention modules in the backbone and neck of the model and integrate SSAM into the YOLO head. We verify the positive effect of SSAM on two general object detection datasets, VOC2012 and MS COCO2017; one is used to obtain a suitable scheme and the other to demonstrate the versatility of our method in complex scenes. Experimental results on both datasets show clear improvements in average precision and detection performance, which demonstrates the usefulness of SSAM in light-weight YOLO models. Visualization results also show that SSAM enhances localization ability.
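
The sequential channel-then-spatial structure described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer details, so the parameter-free channel gate (a stand-in for SE without its learned layers), the 3x3 sharpening kernel, the channel-mean spatial map, and all function names below are assumptions made for illustration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat):
    """SE-style gate with the learned layers omitted (assumption):
    squeeze each channel to its mean, pass it through a sigmoid,
    and rescale the channel. feat is a list of C planes, each H x W."""
    gates = []
    for plane in feat:
        total = sum(sum(row) for row in plane)
        gates.append(sigmoid(total / (len(plane) * len(plane[0]))))
    return [[[v * g for v in row] for row in plane]
            for plane, g in zip(feat, gates)]

# Assumed 3x3 sharpening kernel (identity plus Laplacian); weights sum to 1.
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

def sharpen(plane):
    """Filter one H x W plane with SHARPEN using zero padding."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += SHARPEN[di + 1][dj + 1] * plane[y][x]
            out[i][j] = acc
    return out

def spatial_attention(feat):
    """Average over channels, sharpen the map, squash it to (0, 1),
    and use it to rescale every channel at each spatial position."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    mean_map = [[sum(feat[k][i][j] for k in range(c)) / c for j in range(w)]
                for i in range(h)]
    gate = [[sigmoid(v) for v in row] for row in sharpen(mean_map)]
    return [[[feat[k][i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for k in range(c)]

def ssam_sketch(feat):
    """Channel attention first, then the sharpening spatial attention."""
    return spatial_attention(channel_attention(feat))
```

Both gates lie in (0, 1), so the sketch only rescales activations and preserves the C x H x W shape, which is what allows such a block to be dropped between existing layers of a detector.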

PCAN—Part-Based Context Attention Network for Thermal Power Plant Detection in Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs13071243 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1243

Author(s):

Wenxin Yin ◽

Wenhui Diao ◽

Peijin Wang ◽

Xin Gao ◽

Ya Li ◽

...

Keyword(s):

Remote Sensing ◽

Power Plants ◽

State Of The Art ◽

Thermal Power ◽

Image Interpretation ◽

Remote Sensing Image ◽

Thermal Power Plants ◽

Average Precision ◽

Deep Convolutional Neural Networks ◽

Multi Scale

The detection of Thermal Power Plants (TPPs) is a meaningful task for remote sensing image interpretation. It is challenging because, as facility objects, TPPs are composed of various distinctive and irregular components. In this paper, we propose a novel end-to-end detection framework for TPPs based on deep convolutional neural networks. Specifically, building on the RetinaNet one-stage detector, we propose a context attention multi-scale feature extraction network that fuses global spatial attention to strengthen its ability to represent irregular objects. In addition, we design a part-based attention module to adapt to TPPs containing distinctive components. Experiments show that the proposed method outperforms state-of-the-art methods and achieves 68.15% mean average precision.

Spatio-Temporal 3D Action Recognition with Hierarchical Self-Attention Mechanism

2021 26th International Computer Conference, Computer Society of Iran (CSICC) ◽

10.1109/csicc52343.2021.9420631 ◽

2021 ◽

Author(s):

Soheil Araei ◽

Ali Nadian-Ghomsheh

Keyword(s):

Action Recognition ◽

Attention Mechanism ◽

Spatio Temporal

MwoA auxiliary diagnosis via RSN-based 3D deep multiple instance learning with spatial attention mechanism

2020 11th International Conference on Awareness Science and Technology (iCAST) ◽

10.1109/icast51195.2020.9319486 ◽

2020 ◽

Author(s):

Xiang Li ◽

Benzheng Wei ◽

Tianyang Li ◽

Na Zhang

Keyword(s):

Spatial Attention ◽

Multiple Instance Learning ◽

Attention Mechanism

Lightweight pyramid network with spatial attention mechanism for accurate retinal vessel segmentation

International Journal of Computer Assisted Radiology and Surgery ◽

10.1007/s11548-021-02344-x ◽

2021 ◽

Vol 16 (4) ◽

pp. 673-682

Author(s):

Tengfei Tan ◽

Zhilun Wang ◽

Hongwei Du ◽

Jinzhang Xu ◽

Bensheng Qiu

Keyword(s):

Spatial Attention ◽

Retinal Vessel ◽

Vessel Segmentation ◽

Attention Mechanism ◽

Retinal Vessel Segmentation

Light-weight UAV object tracking network based on strategy gradient and attention mechanism

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107071 ◽

2021 ◽

Vol 224 ◽

pp. 107071

Author(s):

Xia Hua ◽

Xinqing Wang ◽

Ting Rui ◽

Faming Shao ◽

Dong Wang

Keyword(s):

Object Tracking ◽

Attention Mechanism ◽

Light Weight

Multi-Head Spatio-Temporal Attention Mechanism for Urban Anomaly Event Prediction

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3478099 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-21

Author(s):

Huiqun Huang ◽

Xi Yang ◽

Suining He

Keyword(s):

New York ◽

Experimental Studies ◽

Attention Mechanism ◽

City Management ◽

Temporal Attention ◽

Spatial Correlations ◽

Time Step ◽

Event Prediction ◽

Spatio Temporal ◽

Prediction Approach

Timely forecasting of urban anomaly events is of great importance to city management and planning. However, anomaly event prediction is highly challenging due to the sparseness of data, geographic heterogeneity (e.g., complex spatial correlation, skewed spatial distribution of anomaly events and crowd flows), and dynamic temporal dependencies. In this study, we propose M-STAP, a novel Multi-head Spatio-Temporal Attention Prediction approach to address the problem of multi-region urban anomaly event prediction. Specifically, M-STAP considers the problem from three main aspects: (1) extracting the spatial characteristics of the anomaly events in different regions, and the spatial correlations between anomaly events and crowd flows; (2) modeling the impact of the crowd flow dynamics of the most relevant regions in each time step on the anomaly events; and (3) employing an attention mechanism to analyze the varying impacts of historical anomaly events on the predicted data. We have conducted extensive experimental studies on crowd flow and anomaly event data from New York City, Melbourne and Chicago. Our proposed model shows higher accuracy (41.91% improvement on average) in predicting multi-region anomaly events compared with state-of-the-art methods.

Lw-TISNet: Light-Weight Convolutional Neural Network Incorporating Attention Mechanism and Multiple Supervision Strategy for Tongue Image Segmentation

Sensing and Imaging ◽

10.1007/s11220-021-00375-x ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Xiaodong Huang ◽

Li Zhuo ◽

Hui Zhang ◽

Xiaoguang Li ◽

Jing Zhang

Keyword(s):

Neural Network ◽

Image Segmentation ◽

Convolutional Neural Network ◽

Attention Mechanism ◽

Light Weight

Efficient Low-Complexity Digital Predistortion for Power Amplifier Linearization

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i3.pp1096-1105 ◽

2016 ◽

Vol 6 (3) ◽

pp. 1096

Author(s):

Siba Monther Yousif ◽

Roslina M. Sidek ◽

Anwer Sabah Mekki ◽

Nasri Sulaiman ◽

Pooria Varahram

Keyword(s):

Power Amplifier ◽

High Performance ◽

Low Complexity ◽

Memory Effects ◽

Model Complexity ◽

Model Parameters ◽

Digital Predistortion ◽

Nonlinear Memory ◽

Proposed Model ◽

Code Division Multiple

In this paper, a low-complexity model is proposed for linearizing power amplifiers with memory effects using the digital predistortion (DPD) technique. In the proposed model, the linear, low-order nonlinear and high-order nonlinear memory effects are computed separately to provide flexibility in controlling the model parameters so that both high performance and low model complexity can be achieved. The performance of the proposed model is assessed based on experimental measurements of a commercial class AB power amplifier by applying a single-carrier wideband code division multiple access (WCDMA) signal. The linearity performance and the model complexity of the proposed model are compared with the memory polynomial (MP) model and the DPD with single-feedback model. The experimental results show that the proposed model outperforms the latter model by 5 dB in terms of adjacent channel leakage power ratio (ACLR) with comparable complexity. Compared to MP model, the proposed model shows improved ACLR performance by 10.8 dB with a reduction in the complexity by 17% in terms of number of floating-point operations (FLOPs) and 18% in terms of number of model coefficients.
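
For context on the memory polynomial (MP) baseline mentioned in this abstract, the following is a minimal sketch of the standard MP form, y(n) = sum over k and q of a[k][q] * x(n-q) * |x(n-q)|^k. It is not the proposed low-complexity model, whose structure the abstract does not specify. The function names and the coefficient layout are hypothetical, chosen only for illustration.

```python
def memory_polynomial(x, coeffs):
    """Evaluate a memory polynomial on a complex baseband sequence x.
    coeffs[k][q] weights the term x[n-q] * |x[n-q]|**k, so row k is the
    nonlinearity order and column q is the memory tap (assumed layout).
    Samples before the start of x are treated as zero."""
    y = []
    for n in range(len(x)):
        acc = 0j
        for k, row in enumerate(coeffs):
            for q, a in enumerate(row):
                if n - q >= 0:
                    s = x[n - q]
                    acc += a * s * abs(s) ** k
        y.append(acc)
    return y

def coefficient_count(coeffs):
    """Number of model coefficients, one of the complexity measures
    the abstract reports alongside FLOPs."""
    return sum(len(row) for row in coeffs)
```

With a single row and a single tap the model reduces to a plain linear gain; adding rows raises the nonlinearity order and adding columns lengthens the memory, so the coefficient count grows as the product of the two.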
